Checksum

Definition


A unique alphanumeric value that represents the bitstream of an individual computer file or set of files.

Source: Dictionary of Archives Terminology. Society of American Archivists.

Introduction


Checksums have many information security applications. Archivists and data curators use checksums to detect data corruption errors and verify the integrity of data. In the Dalhousie Libraries' digital repository, checksums have three main purposes:

Transfer validation

The University Archives uses checksums to verify that files have been correctly received from a donor or transferring department and then transferred successfully to archival storage. The Archives must be able to compare checksums generated before and after a file transfer to confirm that each file's checksum, and therefore its content, is unchanged by the transfer process.

If the checksums do not match, the file was altered or corrupted at some point between the two calculations, most likely during the transfer. Secure destruction of digital records cannot be authorized until every file in a transfer is validated.
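The comparison described above can be sketched in Python. This is a minimal illustration, not the Archives' actual workflow: the function names `checksums_for` and `validate_transfer` are hypothetical, and the sketch assumes both the source and the destination directories are readable from the same machine.

```python
import hashlib
from pathlib import Path


def checksums_for(directory, algorithm="sha512"):
    """Map each file's path (relative to `directory`) to its hex digest."""
    root = Path(directory)
    result = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        digest = hashlib.new(algorithm)
        with path.open("rb") as f:
            # Read in 1 MiB chunks so large files do not fill memory.
            for chunk in iter(lambda: f.read(1024 * 1024), b""):
                digest.update(chunk)
        result[path.relative_to(root).as_posix()] = digest.hexdigest()
    return result


def validate_transfer(before, after):
    """Compare two path-to-checksum mappings and report every discrepancy."""
    return {
        "missing": sorted(set(before) - set(after)),
        "unexpected": sorted(set(after) - set(before)),
        "mismatched": sorted(
            p for p in before.keys() & after.keys() if before[p] != after[p]
        ),
    }
```

A transfer would count as validated only when all three lists in the report are empty.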

Fixity

The University Archives can also use checksums to periodically check files in archival storage to detect alterations or file corruption. This is known as a “fixity check.”

Access

The University Archives can provide checksums to authorized users so they can verify that the files were not accidentally altered or corrupted during retrieval and delivery.

Checksum algorithms


Cryptographic hash functions are mathematical algorithms that generate checksums. Digital tools used by archivists incorporate cryptographic hash functions. Common algorithms used in digital archiving include MD5, SHA-1, SHA-256, and SHA-512.

MD5 has well-documented collision vulnerabilities and is no longer suitable for cryptographic purposes. SHA-512 is recommended as the default cryptographic hash function.
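As a rough illustration of how a tool generates a checksum, the following Python sketch uses the standard library's `hashlib` module. The function name `file_checksum` and the chunk size are assumptions made for this example; SHA-512 is the default, in line with the recommendation above.

```python
import hashlib


def file_checksum(path, algorithm="sha512", chunk_size=1024 * 1024):
    """Return the hex digest of the file at `path`.

    The file is read in chunks so that very large files can be hashed
    without loading them into memory all at once.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Passing a different algorithm name, for example `file_checksum(path, "sha256")`, would produce a checksum comparable with one generated by another tool using that algorithm.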

BagIt File Packaging Format


The BagIt File Packaging Format is a widely adopted set of conventions for storage and transfer of digital content. Packages are known as Bags.

According to the BagIt format, Bags must have a “payload manifest” file that provides a complete listing of each file in the Bag along with a corresponding checksum to support transfer validation and fixity checks.
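To show how a payload manifest supports validation, here is a simplified Python sketch that checks every file listed in a manifest. It assumes the manifest is named `manifest-<algorithm>.txt`, sits in the Bag's top-level directory, and holds one `checksum filepath` pair per line. The function name `verify_payload_manifest` is hypothetical, and the sketch does not handle percent-encoded file paths or validate the rest of the Bag.

```python
import hashlib
from pathlib import Path


def verify_payload_manifest(bag_dir, algorithm="sha512"):
    """Return a list of (path, problem) tuples; an empty list means all files match."""
    bag = Path(bag_dir)
    manifest = bag / f"manifest-{algorithm}.txt"
    failures = []
    for line in manifest.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        # Each line is "<checksum> <path relative to the bag directory>".
        expected, rel_path = line.split(None, 1)
        target = bag / rel_path
        if not target.is_file():
            failures.append((rel_path, "missing"))
            continue
        digest = hashlib.new(algorithm)
        with target.open("rb") as f:
            for chunk in iter(lambda: f.read(1024 * 1024), b""):
                digest.update(chunk)
        if digest.hexdigest() != expected.lower():
            failures.append((rel_path, "checksum mismatch"))
    return failures
```

Running the same check on a schedule against Bags in archival storage would amount to a basic fixity check. A maintained BagIt implementation is likely a better choice for production use than a hand-written script.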

Related terms


Bag

BagIt File Packaging Format

File

References


Digital Preservation Coalition. Digital Preservation Handbook. Fixity and Checksums. See: https://www.dpconline.org/handbook/technical-solutions-and-tools/fixity-and-checksums.