Compute a hash from a stream of unknown length in C#
MD5, like other hash functions, does not require two passes.
To start:
HashAlgorithm hasher = ..;
hasher.Initialize();
As each block of data arrives:
byte[] buffer = ..;
int bytesReceived = ..;
hasher.TransformBlock(buffer, 0, bytesReceived, null, 0);
To finish and retrieve the hash:
hasher.TransformFinalBlock(new byte[0], 0, 0);
byte[] hash = hasher.Hash;
This pattern works for any type derived from HashAlgorithm, including MD5CryptoServiceProvider and SHA1Managed.
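As a rough illustration of the pattern above, the sketch below feeds two separately "received" chunks through TransformBlock and checks that the result matches a one-shot ComputeHash over the concatenated bytes. The class name IncrementalHashDemo, the helper HashChunks, and the sample strings are made up for this example; only the HashAlgorithm calls come from the answer.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class IncrementalHashDemo
{
    // Hypothetical helper: feeds each chunk to the hasher as if it had just
    // arrived, then finalizes with an empty block and returns the digest.
    public static byte[] HashChunks(HashAlgorithm hasher, byte[][] chunks)
    {
        hasher.Initialize();
        foreach (byte[] chunk in chunks)
        {
            // The output buffer may be null when only the hash is needed.
            hasher.TransformBlock(chunk, 0, chunk.Length, null, 0);
        }
        hasher.TransformFinalBlock(new byte[0], 0, 0);
        return hasher.Hash;
    }

    public static void Main()
    {
        // Sample data, assumed for illustration only.
        byte[][] chunks =
        {
            Encoding.ASCII.GetBytes("hello, "),
            Encoding.ASCII.GetBytes("world"),
        };

        using (MD5 incrementalHasher = MD5.Create())
        using (MD5 oneShotHasher = MD5.Create())
        {
            byte[] incremental = HashChunks(incrementalHasher, chunks);
            byte[] oneShot = oneShotHasher.ComputeHash(Encoding.ASCII.GetBytes("hello, world"));

            // Compare the two digests via their Base64 text form.
            Console.WriteLine(Convert.ToBase64String(incremental) == Convert.ToBase64String(oneShot)); // prints "True"
        }
    }
}
```

How the input is split into chunks does not affect the digest, which is why the data can be hashed as it arrives.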
HashAlgorithm also defines a ComputeHash method that takes a Stream object; however, this method blocks the calling thread until the stream is consumed. Using the TransformBlock approach allows an "asynchronous hash" that is computed as data arrives, without tying up a thread.
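One possible way to get such an "asynchronous hash" is to combine Stream.ReadAsync with TransformBlock, as in the sketch below. The class name AsyncHashDemo, the helper HashStreamAsync, the default buffer size, and the sample payload are assumptions made for this example, not part of the answer. Newer .NET versions (5 and later) also provide HashAlgorithm.ComputeHashAsync, which may make a hand-written loop unnecessary.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;
using System.Threading.Tasks;

public static class AsyncHashDemo
{
    // Hypothetical helper: reads the stream asynchronously and hashes each
    // block as it arrives, so no thread is blocked while waiting for data.
    public static async Task<byte[]> HashStreamAsync(
        HashAlgorithm hasher, Stream stream, int bufferSize = 81920)
    {
        byte[] buffer = new byte[bufferSize];
        hasher.Initialize();

        int bytesRead;
        while ((bytesRead = await stream.ReadAsync(buffer, 0, buffer.Length).ConfigureAwait(false)) > 0)
        {
            // Only the first bytesRead bytes of the buffer are valid.
            hasher.TransformBlock(buffer, 0, bytesRead, null, 0);
        }

        // A zero-length final block completes the hash computation.
        hasher.TransformFinalBlock(buffer, 0, 0);
        return hasher.Hash;
    }

    public static async Task Main()
    {
        // Sample payload and a deliberately tiny buffer, for illustration only.
        byte[] payload = Encoding.UTF8.GetBytes("example payload");

        using (var stream = new MemoryStream(payload))
        using (SHA256 sha256 = SHA256.Create())
        {
            byte[] hash = await HashStreamAsync(sha256, stream, bufferSize: 4);
            Console.WriteLine(BitConverter.ToString(hash));
        }
    }
}
```

The helper accepts any HashAlgorithm, so the same loop works for MD5, SHA-1, or SHA-256; the caller keeps ownership of both the stream and the hasher and disposes them.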
The System.Security.Cryptography.MD5 class contains a ComputeHash method that takes either a byte[] or a Stream. Check out the documentation.
Further to @peter-mourfield's answer, here is code that uses ComputeHash():
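// Requires: using System; using System.IO; using System.Security.Cryptography;
private static string CalculateMd5(string filePathName) {
    // Both the stream and the hasher are disposed when the using blocks exit.
    using (var stream = File.OpenRead(filePathName))
    using (var md5 = MD5.Create()) {
        // ComputeHash reads the stream to its end and returns the 16-byte digest.
        var hash = md5.ComputeHash(stream);
        var base64String = Convert.ToBase64String(hash);
        return base64String;
    }
}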
Because both the stream and the MD5 instance implement IDisposable, wrap them in using(...){...} blocks so that they are disposed.
The method in the code example returns the Base64-encoded MD5 digest, which is the same string that Azure Blob Storage uses for its MD5 checksum.