C#2y ago
Surihia

❔ Increase Hash computing speed.

So I am trying to make a simple app for my use case that computes the SHA256 of two files. Right now I have the following code, which works but is extremely slow:
static void Main(string[] args)
{
    var inFile_1 = args[0];
    var inFile_2 = args[1];

    Console.WriteLine("Computing hash....");
    using (FileStream fs = new FileStream(inFile_1, FileMode.Open, FileAccess.Read))
    {
        using (SHA256 mySHA256 = SHA256.Create())
        {
            fs.Position = 0;
            byte[] HashBuffer = new byte[4096];

            HashBuffer = mySHA256.ComputeHash(fs);

            Console.WriteLine(BitConverter.ToString(HashBuffer).Replace("-", ""));
            Console.ReadLine();
        }
    }
}
I have only set it up to hash one file so far. My issue is that computing the hash is extremely slow. Is there any way I can increase the speed?
18 Replies
cap5lut
cap5lut2y ago
I don't see much you could improve. fs.Position = 0; can be removed, as you always start at position 0 when you open the file for reading. Here you are creating a 4 KB array just to replace it with the array returned by the hasher:
byte[] HashBuffer = new byte[4096];
HashBuffer = mySHA256.ComputeHash(fs);
So byte[] hash = mySHA256.ComputeHash(fs); would be enough. Besides that, I don't think you can improve anything else that easily; most of the cost probably comes from the file size and from reading from disk in general.
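The two cleanups suggested above could look roughly like this. It is a minimal sketch, not a performance fix; the class name FileHasher and method name Sha256Hex are made up for illustration.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper (name is illustrative, not from the thread).
static class FileHasher
{
    // Returns the SHA-256 of the file at `path` as an uppercase hex string.
    public static string Sha256Hex(string path)
    {
        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        using (SHA256 sha = SHA256.Create())
        {
            // No fs.Position reset and no pre-allocated array:
            // ComputeHash returns a fresh 32-byte array on its own.
            byte[] hash = sha.ComputeHash(fs);
            return BitConverter.ToString(hash).Replace("-", "");
        }
    }
}
```

Calling FileHasher.Sha256Hex(args[0]) once per input file would cover the two-file case from the question.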
Surihia
SurihiaOP2y ago
Yeah, getting a large file's hash is the part that is quite slow. I want the speed to at least be on par with 7zip's CRC-SHA > SHA256 option, and the current code is slower than that.
cap5lut
cap5lut2y ago
The only thing I can think of would be to do it async and in chunks: read a chunk of the file, fire off reading the next chunk asynchronously, and while that task is running compute the hash of the chunk you already have. Not sure if you can do that with that SHA256 implementation or if you have to roll your own. Basically, start computing the hash while continuing to read the file.
Surihia
SurihiaOP2y ago
You mean like doing it in a while loop with a custom bytes-to-read value per chunk? I was thinking of using a buffered stream with a large buffer length instead of being limited to just 4 KB.
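For reference, the larger-buffer idea might be sketched like this. It assumes the six-argument FileStream constructor with a 1 MiB internal buffer and the FileOptions.SequentialScan hint; the class name and the default size are illustrative. It only reduces the number of OS read calls, it does not overlap reading with hashing, and whether it helps at all depends on the runtime and the disk.

```csharp
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper (name and default buffer size are assumptions).
static class BufferedFileHasher
{
    public static byte[] Sha256(string path, int bufferSize = 1 << 20)
    {
        // bufferSize sets FileStream's internal buffer, so the hasher's small
        // reads are served from memory instead of hitting the OS each time.
        // SequentialScan hints that the file is read front to back.
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read,
                                       FileShare.Read, bufferSize,
                                       FileOptions.SequentialScan))
        using (var sha = SHA256.Create())
        {
            return sha.ComputeHash(fs);
        }
    }
}
```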
cap5lut
cap5lut2y ago
That won't work, because at any given moment it will still be either reading or computing the hash, never both. Look at the implementation:
public byte[] ComputeHash(Stream inputStream)
{
    ObjectDisposedException.ThrowIf(_disposed, this);

    // Use ArrayPool.Shared instead of CryptoPool because the array is passed out.
    byte[] buffer = ArrayPool<byte>.Shared.Rent(4096);

    int bytesRead;
    int clearLimit = 0;

    while ((bytesRead = inputStream.Read(buffer, 0, buffer.Length)) > 0)
    {
        if (bytesRead > clearLimit)
        {
            clearLimit = bytesRead;
        }

        HashCore(buffer, 0, bytesRead);
    }

    CryptographicOperations.ZeroMemory(buffer.AsSpan(0, clearLimit));
    ArrayPool<byte>.Shared.Return(buffer, clearArray: false);
    return CaptureHashCodeAndReinitialize();
}
But you want to do both at the same time. Basically, in the while loop you would not read directly from the stream, but from some kind of list/queue that contains the chunks, while another thread (or an asynchronously running task) reads from the stream and fills that list/queue. That approach would have a little overhead because of the synchronization between the two threads, and it would only save time while the HashCore method is computing a chunk.
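The queue idea described above could be sketched as follows. This is an untested-in-production illustration under a few assumptions: a BlockingCollection serves as the bounded queue, the public TransformBlock/TransformFinalBlock methods of HashAlgorithm stand in for HashCore, and the names PipelinedHasher, chunkSize and maxQueuedChunks are invented here.

```csharp
using System.Collections.Concurrent;
using System.IO;
using System.Security.Cryptography;
using System.Threading.Tasks;

// Hypothetical helper: one task reads chunks, the calling thread hashes them.
static class PipelinedHasher
{
    // A chunk of data: the buffer plus how many bytes of it are valid.
    private struct Chunk
    {
        public byte[] Data;
        public int Count;
    }

    public static byte[] Sha256(Stream input, int chunkSize = 1 << 20, int maxQueuedChunks = 4)
    {
        // The bounded capacity makes the reader wait when the hasher falls behind,
        // so only a handful of chunks are in memory at once.
        using (var queue = new BlockingCollection<Chunk>(maxQueuedChunks))
        using (var sha = SHA256.Create())
        {
            // Producer: read the stream chunk by chunk on a thread-pool thread.
            Task reader = Task.Run(() =>
            {
                try
                {
                    while (true)
                    {
                        var buffer = new byte[chunkSize];
                        int read = input.Read(buffer, 0, buffer.Length);
                        if (read == 0) break;
                        queue.Add(new Chunk { Data = buffer, Count = read });
                    }
                }
                finally
                {
                    // Always signal completion so the consumer loop can end.
                    queue.CompleteAdding();
                }
            });

            // Consumer: feed chunks to the hasher in the order they were read.
            foreach (Chunk chunk in queue.GetConsumingEnumerable())
            {
                sha.TransformBlock(chunk.Data, 0, chunk.Count, null, 0);
            }

            reader.Wait(); // surfaces any exception thrown while reading
            sha.TransformFinalBlock(new byte[0], 0, 0);
            return sha.Hash;
        }
    }
}
```

Allocating a fresh array per chunk keeps the sketch short; reusing buffers would reduce garbage but needs more bookkeeping.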
Surihia
SurihiaOP2y ago
I'm getting a lot of errors with this code. Does this work in a .NET Framework project?
cap5lut
cap5lut2y ago
That's the .NET 7 version of SHA256.ComputeHash, but you would not want to use that code anyway, as it synchronously reads a chunk from the stream and then computes the hash. Not sure what is available on .NET Framework 4.6. The main issue is still speeding up the hashing plus the file reading; you would want something like this to do both in parallel:
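public async Task<byte[]> ComputeHashAsync(Stream inputStream)
{
    byte[] result = new byte[32];
    // Chain of hashing work; each chunk is hashed after the previous one finishes.
    Task hashingTask = Task.CompletedTask;

    while (true)
    {
        // A fresh buffer per chunk, so a queued continuation never sees overwritten data.
        var buffer = new byte[4096];
        var read = await inputStream.ReadAsync(buffer, 0, buffer.Length);
        if (read == 0) break;
        hashingTask = hashingTask.ContinueWith(t => {
            // DoHashing is a placeholder for the incremental hashing step.
            DoHashing(buffer, 0, read, result);
        });
    }
    // Wait for the last queued chunk to be hashed before returning.
    await hashingTask;
    return result;
}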
IIRC ArrayPool isn't available in .NET Framework, so I simply create a new byte array for each chunk. DoHashing() would be like the HashCore() method, except that I pass the result array as a parameter as well. I also omitted the cancellation token handling to show only the important parts. Note also that this totally disregards memory consumption; if you want to limit it to, e.g., only 5 chunks in memory at the same time, you could use a SemaphoreSlim for that.
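One way to bound memory without a SemaphoreSlim is to use just two buffers and swap them: start the next read, hash the current chunk, then await the read. The sketch below assumes the public TransformBlock/TransformFinalBlock methods of HashAlgorithm in place of the DoHashing placeholder; the name OverlappedHasher is made up. How much the read and the hashing actually overlap depends on the stream (for example, a FileStream opened with FileOptions.Asynchronous).

```csharp
using System.IO;
using System.Security.Cryptography;
using System.Threading.Tasks;

// Hypothetical helper: at most two chunks are alive at any time.
static class OverlappedHasher
{
    public static async Task<byte[]> Sha256Async(Stream input, int chunkSize = 1 << 20)
    {
        using (var sha = SHA256.Create())
        {
            byte[] current = new byte[chunkSize];
            byte[] next = new byte[chunkSize];

            int read = await input.ReadAsync(current, 0, current.Length).ConfigureAwait(false);
            while (read > 0)
            {
                // Start reading the next chunk before hashing the current one.
                Task<int> pending = input.ReadAsync(next, 0, next.Length);
                sha.TransformBlock(current, 0, read, null, 0);
                read = await pending.ConfigureAwait(false);

                // Swap roles: the chunk just read becomes the one to hash.
                byte[] tmp = current;
                current = next;
                next = tmp;
            }

            // Finalize with zero extra bytes; the digest is then available in Hash.
            sha.TransformFinalBlock(current, 0, 0);
            return sha.Hash;
        }
    }
}
```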
Surihia
SurihiaOP2y ago
Is DoHashing supposed to be a method?
cap5lut
cap5lut2y ago
One you would have to implement ;p It's just a placeholder to show where the hashing belongs.
Surihia
SurihiaOP2y ago
Oh, so I would have to put this code in the DoHashing method then?
using (SHA256 mySHA256 = SHA256.Create())
{
    var hashBuffer = mySHA256.ComputeHash(fs);
    hashString = BitConverter.ToString(hashBuffer).Replace("-", "");
    Console.WriteLine(hashString);
}
cap5lut
cap5lut2y ago
No, you would have to use HashCore(), but because that's not public you would have to implement the hashing algorithm yourself as well.
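A possible alternative to reimplementing SHA-256: HashAlgorithm also exposes the public methods TransformBlock and TransformFinalBlock, which feed data to the hasher one chunk at a time, and these exist on .NET Framework too. A minimal sketch of chunk-by-chunk hashing with them follows; the class name and chunk size are illustrative, and this version is still sequential.

```csharp
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper showing incremental hashing with the built-in SHA256.
static class IncrementalHasher
{
    public static byte[] Sha256(Stream input, int chunkSize = 81920)
    {
        using (var sha = SHA256.Create())
        {
            var buffer = new byte[chunkSize];
            int read;
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Adds this chunk to the running hash; no output buffer is needed.
                sha.TransformBlock(buffer, 0, read, null, 0);
            }

            // Finish with zero additional bytes, then read the digest from Hash.
            sha.TransformFinalBlock(buffer, 0, 0);
            return sha.Hash;
        }
    }
}
```

In the ComputeHashAsync outline, a call like sha.TransformBlock(buffer, 0, read, null, 0) could take the place of DoHashing, as long as the chunks are fed in file order.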
Surihia
SurihiaOP2y ago
Sorry, I don't know how to write my own hashing algorithm. Is there a documentation page I can refer to in order to learn more about this?
cap5lut
cap5lut2y ago
This is just untested example code to show how to set it up so that reading the file contents and hashing happen in parallel; the details you would have to work out yourself. I don't know how to implement it either. Just search online, I guess there are tons of example implementations.
Surihia
SurihiaOP2y ago
So when you say implement a hashing algorithm, do you mean make something new or use the existing SHA256 algorithm?
cap5lut
cap5lut2y ago
Maybe there is also a package out there in the wild that has a faster implementation than S.S.C.SHA256. And yeah, just implement SHA256 yourself, so that you can hash chunk by chunk.
Surihia
SurihiaOP2y ago
I would love to do that, but I am time-bound right now. I guess I will stick with the slow speed. I will try looking for a package that has a chunk-by-chunk system implemented.
reflectronic
reflectronic2y ago
If you are using .NET Framework, then you are already giving up considerable performance. Have you tried using SHA256.Create("System.Security.Cryptography.SHA256Cng") instead of SHA256.Create()?
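To check whether a different implementation actually helps, a small timing harness could be used. This is a sketch with made-up names (HashTiming, Measure); it takes a factory delegate so that either SHA256.Create() or the SHA256Cng variant suggested above can be passed in.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Security.Cryptography;

// Hypothetical timing helper (names are illustrative).
static class HashTiming
{
    // Hashes the file with the algorithm produced by `factory`,
    // returns the elapsed time and the digest as uppercase hex.
    public static TimeSpan Measure(Func<HashAlgorithm> factory, string path, out string hex)
    {
        Stopwatch sw = Stopwatch.StartNew();
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        using (HashAlgorithm algo = factory())
        {
            hex = BitConverter.ToString(algo.ComputeHash(fs)).Replace("-", "");
        }
        sw.Stop();
        return sw.Elapsed;
    }
}
```

On .NET Framework, passing () => SHA256.Create() and then () => SHA256.Create("System.Security.Cryptography.SHA256Cng") for the same file gives two timings to compare. The second run may benefit from the OS file cache, so it is worth repeating the measurement in both orders.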
Accord
Accord2y ago
Was this issue resolved? If so, run /close - otherwise I will mark this as stale and this post will be archived until there is new activity.