checksum and md5, not the same thing?

Ensembl uses the Unix 'sum' utility to calculate the values in the CHECKSUM.gz file.

Here's more info about the program: http://en.wikipedia.org/wiki/Sum_%28Unix%29

To see if your download is correct, run sum on the file and compare the output with the entry for that file in CHECKSUM.gz:

sum Macaca_mulatta.MMUL_1.70.dna.chromosome.1.fa.gz
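
If sum is not available on your machine, the calculation can be sketched in Python. This is only a sketch: it assumes the values were produced with the BSD algorithm and 1024-byte blocks (the default of GNU sum), and the names bsd_sum and bsd_sum_file are made up for this example.

```python
def bsd_sum(chunks, block_size=1024):
    """Return (checksum, blocks) computed the way the BSD `sum` algorithm does.

    `chunks` is any iterable of bytes objects. The 1024-byte block size is an
    assumption that matches the default of GNU `sum`.
    """
    checksum = 0
    total = 0
    for chunk in chunks:
        for byte in chunk:
            # Rotate the 16-bit checksum right by one bit, then add the byte.
            checksum = (checksum >> 1) | ((checksum & 1) << 15)
            checksum = (checksum + byte) & 0xFFFF
        total += len(chunk)
    blocks = (total + block_size - 1) // block_size  # round up to whole blocks
    return checksum, blocks


def bsd_sum_file(path, chunk_size=1 << 20):
    """Stream a file from disk so that large downloads need not fit in memory."""
    with open(path, "rb") as handle:
        return bsd_sum(iter(lambda: handle.read(chunk_size), b""))
```

Calling bsd_sum_file("Macaca_mulatta.MMUL_1.70.dna.chromosome.1.fa.gz") should then give the same two numbers that sum prints. The pure-Python loop is slow on large files, so treat it as an illustration of the algorithm rather than a replacement for the command.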

NOTE: Ensembl has failed to update their CHECKSUM file before, so a mismatch can also mean that the download is correct but the CHECKSUM.gz file is out of date.


They are not the same thing. MD5 is one checksum algorithm, but there are many others, such as SHA and CRC.
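
To make the difference concrete, here is a small sketch that runs three different algorithms from the Python standard library over the same made-up input. The input string is arbitrary and chosen only for illustration.

```python
import hashlib
import zlib

data = b"hello"  # made-up input, any bytes would do

md5_hex = hashlib.md5(data).hexdigest()        # 128-bit digest, 32 hex characters
sha256_hex = hashlib.sha256(data).hexdigest()  # 256-bit digest, 64 hex characters
crc32_value = zlib.crc32(data)                 # 32-bit unsigned integer

print("MD5    ", md5_hex)
print("SHA-256", sha256_hex)
print("CRC-32 ", format(crc32_value, "08x"))
```

Each algorithm gives a value of a different size and format, so a checksum can only be compared against one produced by the same algorithm. Note that zlib.crc32 is the CRC-32 used by gzip and zip, which is not the same calculation as the one done by the POSIX cksum command.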

Generally, a checksum is a function that maps an input to a much smaller, fixed-size output, and a good one produces a very different output even if only a single bit of the input changes.
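
A quick way to see the "one bit changes everything" property is to flip a single bit and compare the two digests. This sketch uses MD5 from Python's hashlib as the example algorithm; the input and the helper name differing_bits are made up for illustration.

```python
import hashlib


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions at which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


original = b"ACGTACGTACGT"                             # made-up input
flipped = bytes([original[0] ^ 0x01]) + original[1:]  # same data, one bit flipped

digest_original = hashlib.md5(original).digest()
digest_flipped = hashlib.md5(flipped).digest()

print(differing_bits(original, flipped), "input bit differs")
print(differing_bits(digest_original, digest_flipped),
      "of", len(digest_original) * 8, "digest bits differ")
```

With a hash like MD5 you would expect roughly half of the digest bits to change, although the exact count depends on the input.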

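The output you're looking at consists of two decimal numbers, which is the format printed by the unix sum command: a 16-bit checksum (at most five digits) followed by the size of the file in blocks. That is not CRC32, which is what the separate cksum command computes. Run sum on your file to calculate the same two numbers and verify them.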
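
If you want to compare such lines in a script, they can be split into their parts. This is a rough sketch that assumes each line has the layout "number number [file name]", as printed by sum; the function name parse_sum_line and the example values are made up.

```python
def parse_sum_line(line: str):
    """Split a line of the form 'NUMBER NUMBER [NAME]' into its parts.

    Returns (first_number, second_number, name), with name set to None
    when the line carries no file name.
    """
    fields = line.split(None, 2)  # split on runs of whitespace, at most 3 fields
    first, second = int(fields[0]), int(fields[1])
    name = fields[2].strip() if len(fields) > 2 else None
    return first, second, name


# Made-up example values, not taken from a real CHECKSUM file.
example = "01234  5678 example.fa.gz"
print(parse_sum_line(example))
```

Comparing the parsed numbers as integers avoids false mismatches caused by leading zeros or differences in column padding.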
