C# 16mo ago
Surihia

❔ Increase Hash computing speed.

I am trying to make a simple app for my use case that computes the SHA256 of two files. Right now I have the following code, which works but is extremely slow:
static void Main(string[] args)
{
    var inFile_1 = args[0];
    var inFile_2 = args[1];

    Console.WriteLine("Computing hash....");
    using (FileStream fs = new FileStream(inFile_1, FileMode.Open, FileAccess.Read))
    {
        using (SHA256 mySHA256 = SHA256.Create())
        {
            fs.Position = 0;
            byte[] HashBuffer = new byte[4096];

            HashBuffer = mySHA256.ComputeHash(fs);

            Console.WriteLine(BitConverter.ToString(HashBuffer).Replace("-", ""));
            Console.ReadLine();
        }
    }
}
I have only set it up to get the hash of one file so far. My issue is that the hash computation is extremely slow. Is there any way to speed it up?
18 Replies
cap5lut 16mo ago
I don't see much you could improve.
fs.Position = 0; can be removed, as you are always starting at position 0 when reading a freshly opened file.
Here you are creating a 4 KB array just to replace it with the array returned by the hasher:
byte[] HashBuffer = new byte[4096];
HashBuffer = mySHA256.ComputeHash(fs);
So byte[] hash = mySHA256.ComputeHash(fs); would be enough.
Besides that, I don't think you can improve anything else that easily. Most of the time probably goes into reading the file from disk, so it depends mainly on the file size.
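Applied to the original code, the two suggestions above could look like the following minimal sketch. The class and method names (SimpleFileHash, ComputeSha256Hex) are made up for illustration, and the change is a cleanup only; it is not expected to make hashing noticeably faster.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class SimpleFileHash
{
    // Hypothetical helper: returns the SHA-256 of a file as an uppercase hex string.
    public static string ComputeSha256Hex(string path)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        using (var sha256 = SHA256.Create())
        {
            // A freshly opened stream already starts at position 0, and
            // ComputeHash allocates and returns the 32-byte digest itself,
            // so no Position reset and no pre-allocated array are needed.
            byte[] hash = sha256.ComputeHash(fs);
            return BitConverter.ToString(hash).Replace("-", "");
        }
    }
}
```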
Surihia 16mo ago
Yeah, getting a large file's hash is the part that is quite slow. I want the speed to at least be on par with 7zip's CRC-SHA>SHA256 option, and the current code is slower than that.
cap5lut 16mo ago
The only thing I can think of would be to do it async and in chunks. Basically: read a chunk of the file, fire off reading the next chunk asynchronously, and while that task is running, compute the hash of the chunk you already read.
Not sure if you can do that with that SHA256 implementation or if you have to roll your own.
Basically, start computing the hash while continuing to read the file.
Surihia 16mo ago
You mean doing it in a while loop with a custom number of bytes to read per chunk? I was thinking of using a buffered stream with a large buffer length instead of being limited to just 4 KB.
cap5lut 16mo ago
That won't work, because it will still either read or compute the hash at any given moment, never both. Look at the implementation:
public byte[] ComputeHash(Stream inputStream)
{
    ObjectDisposedException.ThrowIf(_disposed, this);

    // Use ArrayPool.Shared instead of CryptoPool because the array is passed out.
    byte[] buffer = ArrayPool<byte>.Shared.Rent(4096);

    int bytesRead;
    int clearLimit = 0;

    while ((bytesRead = inputStream.Read(buffer, 0, buffer.Length)) > 0)
    {
        if (bytesRead > clearLimit)
        {
            clearLimit = bytesRead;
        }

        HashCore(buffer, 0, bytesRead);
    }

    CryptographicOperations.ZeroMemory(buffer.AsSpan(0, clearLimit));
    ArrayPool<byte>.Shared.Return(buffer, clearArray: false);
    return CaptureHashCodeAndReinitialize();
}
But you want to do both at the same time. Basically, in the while loop you would not read directly from the stream, but from some kind of list/queue that contains the chunks, while another thread or an async task reads from the stream and fills that list/queue.
That approach has a little overhead from synchronizing the two threads, and it only saves the time that overlaps with the HashCore method computing a chunk.
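A rough sketch of that reader/queue/hasher layout follows. It rests on two assumptions that are not from the thread: the queue is a bounded BlockingCollection, and the hashing step uses the public HashAlgorithm.TransformBlock and TransformFinalBlock methods in place of the non-public HashCore. PipelinedHash, ComputeSha256, chunkSize and maxQueuedChunks are illustrative names. Whether this beats the plain ComputeHash call depends on the disk and CPU, so measure before adopting it.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Security.Cryptography;
using System.Threading.Tasks;

public static class PipelinedHash
{
    // Hypothetical helper: one task reads chunks, the calling thread hashes them.
    public static byte[] ComputeSha256(Stream input, int chunkSize = 1024 * 1024, int maxQueuedChunks = 4)
    {
        // Bounded queue: the reader blocks once maxQueuedChunks chunks are waiting.
        using (var queue = new BlockingCollection<ArraySegment<byte>>(maxQueuedChunks))
        using (var sha256 = SHA256.Create())
        {
            // Producer: fills the queue from the stream on a thread-pool thread.
            Task reader = Task.Run(() =>
            {
                try
                {
                    while (true)
                    {
                        var buffer = new byte[chunkSize];
                        int read = input.Read(buffer, 0, buffer.Length);
                        if (read == 0) break;
                        queue.Add(new ArraySegment<byte>(buffer, 0, read));
                    }
                }
                finally
                {
                    // Always signal completion so the consumer loop can end.
                    queue.CompleteAdding();
                }
            });

            // Consumer: hashes chunks in arrival order (the default queue is FIFO).
            foreach (ArraySegment<byte> chunk in queue.GetConsumingEnumerable())
            {
                sha256.TransformBlock(chunk.Array, chunk.Offset, chunk.Count, null, 0);
            }

            reader.Wait(); // Surfaces any read error as an AggregateException.
            sha256.TransformFinalBlock(new byte[0], 0, 0);
            return sha256.Hash;
        }
    }
}
```

The sketch allocates a fresh array per chunk to stay simple; with 1 MB chunks and a queue of 4, roughly 5 to 6 MB are alive at any time.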
Surihia 16mo ago
I am getting a lot of errors with this code. Does it work in a .NET Framework project?
cap5lut 16mo ago
That's the .NET 7 version of SHA256.ComputeHash, but you would not want to use that code anyway, as it synchronously reads a chunk from the stream and then computes the hash.
Not sure what is available on .NET Framework 4.6.
The main issue is still speeding up the hashing plus the file reading. You would want something like this to do both in parallel:
public async Task<byte[]> ComputeHashAsync(Stream inputStream)
{
    byte[] result = new byte[32];
    Task hashingTask = Task.CompletedTask;

    while (true)
    {
        var buffer = new byte[4096];
        var read = await inputStream.ReadAsync(buffer, 0, buffer.Length);
        if (read == 0) break;
        hashingTask = hashingTask.ContinueWith(t => {
            DoHashing(buffer, 0, read, result);
        });
    }
    await hashingTask;
    return result;
}
IIRC ArrayPool isn't available in .NET Framework, so I simply create a new byte array for each chunk.
DoHashing() would be like the HashCore() method, except that I pass the result array as a parameter as well.
I also omitted the cancellation token handling to show only the important parts.
Note also that this totally disregards memory consumption. If you want to limit it to, e.g., only 5 chunks in memory at the same time, you could use a SemaphoreSlim for that.
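A bounded-memory variant of the same idea can be sketched with two alternating buffers instead of a SemaphoreSlim: while one buffer is being hashed, the next read fills the other. This swaps in a different technique from the one suggested above, and it assumes the public TransformBlock and TransformFinalBlock methods stand in for the DoHashing placeholder. OverlappedHash and ComputeSha256Async are illustrative names.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Threading.Tasks;

public static class OverlappedHash
{
    // Hypothetical helper: overlaps reading chunk N+1 with hashing chunk N.
    public static async Task<byte[]> ComputeSha256Async(Stream input, int chunkSize = 1024 * 1024)
    {
        using (var sha256 = SHA256.Create())
        {
            // Only two buffers ever exist, so memory use stays fixed.
            var current = new byte[chunkSize];
            var next = new byte[chunkSize];

            int read = await input.ReadAsync(current, 0, current.Length).ConfigureAwait(false);
            while (read > 0)
            {
                // Start reading the next chunk before hashing the current one.
                Task<int> pendingRead = input.ReadAsync(next, 0, next.Length);

                // Hash the chunk already in memory while that read is in flight.
                sha256.TransformBlock(current, 0, read, null, 0);

                read = await pendingRead.ConfigureAwait(false);

                // Swap roles: the buffer just filled becomes the one to hash.
                byte[] tmp = current;
                current = next;
                next = tmp;
            }

            sha256.TransformFinalBlock(current, 0, 0);
            return sha256.Hash;
        }
    }
}
```

For a file, the overlap is most likely to help when the FileStream is opened for asynchronous I/O, for example with the constructor overload that takes a buffer size and useAsync: true. On a stream whose ReadAsync completes synchronously, the result is still correct but nothing runs in parallel.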
Surihia 16mo ago
Is DoHashing supposed to be a method?
cap5lut 16mo ago
One you would have to implement ;p
It's just a placeholder to show where the hashing belongs.
Surihia 16mo ago
Oh, so I would have to put this code in the DoHashing method then?
using (SHA256 mySHA256 = SHA256.Create())
{
    var hashBuffer = mySHA256.ComputeHash(fs);
    hashString = BitConverter.ToString(hashBuffer).Replace("-", "");
    Console.WriteLine(hashString);
}
cap5lut 16mo ago
No, you would have to use HashCore(), but because that is not public you would have to implement the hashing algorithm yourself as well.
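HashCore() is indeed not public, but HashAlgorithm also exposes the public TransformBlock and TransformFinalBlock methods, which feed data into the same running hash one chunk at a time, so a hand-written SHA256 may not be needed for chunked hashing. A minimal sketch, with ChunkedHash and ComputeSha256InChunks as illustrative names and the 1 MB default chunk size as an arbitrary assumption:

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class ChunkedHash
{
    // Hypothetical helper: hashes a stream chunk by chunk with the built-in SHA256.
    public static byte[] ComputeSha256InChunks(Stream input, int chunkSize = 1024 * 1024)
    {
        using (var sha256 = SHA256.Create())
        {
            var buffer = new byte[chunkSize];
            int read;
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Add this chunk to the running hash. Passing null as the
                // output buffer skips the optional copy of the input.
                sha256.TransformBlock(buffer, 0, read, null, 0);
            }

            // Finish with an empty final block, then read the digest.
            sha256.TransformFinalBlock(buffer, 0, 0);
            return sha256.Hash;
        }
    }
}
```

On its own this loop does the same work as ComputeHash(Stream) with a larger read size; its value is that the read and the hash step are now separate calls that can be moved onto different threads or tasks.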
Surihia 16mo ago
Sorry, I don't know how to write my own hashing algorithm. Is there a documentation page I can refer to in order to learn more about this?
cap5lut 16mo ago
This is just untested example code to show how to set things up to read the file contents and do the hashing in parallel. The details you would have to work out yourself.
I don't know how to implement it either. Just search online; I guess there are tons of example implementations.
Surihia 16mo ago
So when you say implement a hashing algorithm, do you mean making something new or using the existing SHA256 algorithm?
cap5lut 16mo ago
Maybe there is also just a package out there in the wild that has a faster implementation than System.Security.Cryptography.SHA256.
Yeah, just implement SHA256 yourself, so that you can hash chunk by chunk.
Surihia 16mo ago
I would love to do that, but I am short on time right now. I guess I will stick with the slow speed. I will try looking for a package that has chunk-by-chunk hashing implemented.
reflectronic 16mo ago
If you are using .NET Framework, then you are already giving up considerable performance.
Have you tried using SHA256.Create("System.Security.Cryptography.SHA256Cng") instead of SHA256.Create()?
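Whether a different implementation helps is easiest to settle by timing it. The following is a small sketch of a timing helper; HashTiming and TimeHash are illustrative names, and the factory parameter is an assumption made so that any way of creating the algorithm can be plugged in.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Security.Cryptography;

public static class HashTiming
{
    // Hypothetical helper: hashes the file once with the algorithm produced
    // by `create` and returns the elapsed wall-clock time.
    public static TimeSpan TimeHash(string path, Func<HashAlgorithm> create)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        using (HashAlgorithm algorithm = create())
        {
            var stopwatch = Stopwatch.StartNew();
            algorithm.ComputeHash(fs);
            stopwatch.Stop();
            return stopwatch.Elapsed;
        }
    }
}
```

On .NET Framework, this could be called once with `() => SHA256.Create()` and once with `() => SHA256.Create("System.Security.Cryptography.SHA256Cng")`. Running each variant a few times on the same file is advisable, since the first pass also pays for reading the file from disk while later passes may be served from the operating system's file cache.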