Modular · 7mo ago
benny

SHA256

Are there any implementations of the SHA256 algorithm in Mojo yet? I've seen work on other hashing algorithms, and obviously the builtin, but has anyone implemented this one?
28 Replies
brainiacoutcast · 7mo ago
i may as well go for it as a learning project over the weekend, did it in another language recently so it's still fairly fresh in my mind (someone more familiar probably should too though)
benny · 7mo ago
if you get a working model started and dm me, i would love to optimize it / improve it for mojo. i just don't have starting experience with that algorithm, so i'm hesitant to make my own implementation
Maxim · 7mo ago
I did md5 and could try SHA256, but I don't have that much time on my hands and need to finish up other projects. So it's better if @brainiacoutcast takes a stab at it. @brainiacoutcast let me know if I can help though.
benny · 7mo ago
didn’t see your repo until now, might also just make a pr using that
getting better · 7mo ago
What's wrong with a simple c ffi for it?
Maxim · 7mo ago
External dependency and complexity which comes with it.
brainiacoutcast · 7mo ago
well, let's see what i can pull outta my behind in 2 hours. it's a pretty simple algo to write if you already know the language you're using
for future reference: on one core of a mid-range laptop you should be able to chew through 250-350 megabytes per second with an efficient, straightforward, cross-platform implementation, but some chips can go way faster with dedicated instructions for some steps
here's what i managed. it should be decent performance-wise, but there's plenty left on the table (particularly if you want to do some feature detection for sha instructions, lol). unfortunately, it's rigorously untested, but i'm through fighting with wsl
overall this wasn't too tough to write even on day 1, albeit a little boilerplatey. i'll come back for more mojo when the windows sdk is ready, maybe there will be some more convenience methods by then too
edit: removed broken code
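Since the original code was removed from the thread, here is a minimal reference sketch of the "straightforward cross-platform" version of the algorithm for anyone porting it. It is written in Python rather than Mojo so that it can be checked against `hashlib`; it is not the code that was posted here, and names such as `sha256_reference`, `icbrt`, and `first_primes` are made up for this illustration. The constants are derived with integer arithmetic from the first 64 primes, as the standard defines them, instead of being typed in by hand.

```python
import math
import struct

MASK = 0xFFFFFFFF  # keep every intermediate value within 32 bits


def first_primes(count):
    """Return the first `count` primes by trial division (fine for 64)."""
    primes = []
    n = 2
    while len(primes) < count:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes


def icbrt(n):
    """Exact integer cube root (floor) by binary search."""
    lo, hi = 0, 1 << ((n.bit_length() + 2) // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


PRIMES = first_primes(64)
# Round constants: first 32 fractional bits of the cube roots of the primes.
K = [icbrt(p << 96) & MASK for p in PRIMES]
# Initial state: first 32 fractional bits of the square roots of the first 8 primes.
H0 = [math.isqrt(p << 64) & MASK for p in PRIMES[:8]]


def rotr(x, n):
    """Rotate a 32-bit value right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK


def sha256_reference(data: bytes) -> bytes:
    # Padding: 0x80, zeros up to 56 mod 64, then the bit length as big-endian u64.
    padded = data + b"\x80"
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    padded += struct.pack(">Q", len(data) * 8)

    state = list(H0)
    for off in range(0, len(padded), 64):
        # Message schedule: 16 big-endian words expanded to 64.
        w = list(struct.unpack(">16I", padded[off:off + 64]))
        for i in range(16, 64):
            s0 = rotr(w[i - 15], 7) ^ rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
            s1 = rotr(w[i - 2], 17) ^ rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
            w.append((w[i - 16] + s0 + w[i - 7] + s1) & MASK)

        a, b, c, d, e, f, g, h = state
        for i in range(64):
            big_s1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
            ch = (e & f) ^ (~e & g)
            t1 = (h + big_s1 + ch + K[i] + w[i]) & MASK
            big_s0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
            maj = (a & b) ^ (a & c) ^ (b & c)
            t2 = (big_s0 + maj) & MASK
            h, g, f, e, d, c, b, a = g, f, e, (d + t1) & MASK, c, b, a, (t1 + t2) & MASK

        state = [(x + y) & MASK for x, y in zip(state, (a, b, c, d, e, f, g, h))]

    return struct.pack(">8I", *state)


if __name__ == "__main__":
    import hashlib

    # Lengths around the 55/56/64-byte padding boundaries are the usual bug magnets.
    for n in (0, 1, 55, 56, 63, 64, 65, 1000):
        msg = b"a" * n
        assert sha256_reference(msg) == hashlib.sha256(msg).digest()
    print("all lengths match hashlib")
```

A port in another language can be checked the same way: hash messages whose lengths straddle the 55, 56, and 64 byte boundaries and compare the digests with a known-good implementation.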
benny · 7mo ago
i’ll give it a look when i’m home and try to improve, thanks
Maxim · 7mo ago
I will have a look as well, thanks @brainiacoutcast
brainiacoutcast · 7mo ago
emphasis on rigorously untested, can't wait to see what you all come up with though
benny · 7mo ago
maxim will definitely have more to contribute algorithm-wise on this one than me, but i'll lyk once i've tried
brainiacoutcast · 7mo ago
worked out all the bugs as far as i can tell, added a crude benchmark to main - performs pretty well for having to copy everything
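For comparing numbers across machines, a crude throughput benchmark along these lines might look roughly like the sketch below. It is a Python illustration, not the benchmark from the post; `throughput_mb_per_s` is a made-up helper name, and `hashlib.sha256` merely stands in for whichever implementation is being measured.

```python
import hashlib
import time


def throughput_mb_per_s(hash_fn, size_bytes=16 * 1024 * 1024, repeats=5):
    """Hash a zero-filled buffer several times and report the best run in MB/s."""
    data = bytes(size_bytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        hash_fn(data)
        best = min(best, time.perf_counter() - start)
    # Guard against a zero reading from a coarse timer.
    return size_bytes / max(best, 1e-9) / 1e6


if __name__ == "__main__":
    rate = throughput_mb_per_s(lambda buf: hashlib.sha256(buf).digest())
    print(f"{rate:.0f} megabytes per second")
```

Taking the best of several runs reduces noise from other processes; the absolute figure still depends on the CPU, clock speed, and whether dedicated SHA instructions are available.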
brainiacoutcast · 7mo ago
granular loop unrolling is very nice to have
Michael K · 7mo ago
very nice. I saw your note about copying bytes to DynamicVector. I think you want to do this:
var bytes = DynamicVector[UInt8](byte_view.dynamic_size + 1024)
var bytes_ptr = DTypePointer[DType.uint8, 0](bytes.data.value)
let one_bit: UInt8 = 0b1000_0000
memcpy[DType.uint8](bytes_ptr, byte_view.data, byte_view.dynamic_size)
bytes.size = byte_view.dynamic_size
This has an address_space inconsistency so you also need to make sure that the argument byte_view specifies the generic address_space:
fn sha256(byte_view: Buffer[_, DType.uint8, 0]) -> InlinedFixedVector[UInt8, 32]:
On my machine this went from 359 to 411 megabytes per second. There is also an @unroll missing before the for dword_i in range(16): loop; adding it gets a pretty good speedup on my machine, 411 -> 456.
brainiacoutcast · 7mo ago
that's nuts, what cpu and clock speed are you running with?
re: copying, ideally i'd like to have no dynamicvector inside the sha function at all. i just wanted to read the buffer as-is for most of it and then use an InlinedFixedVector to manage the tail, but it's nice to see that it went a little faster from calling a builtin
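The copy-free idea described here can be sketched as follows: whole 64-byte blocks are read straight from the input, and only the leftover bytes plus the padding go into a small fixed-size scratch buffer of at most two blocks. This is a Python illustration of the padding layout only, not the thread's Mojo code; `padded_tail` is a hypothetical helper name and the 128-byte `bytearray` plays the role of the fixed-size vector.

```python
import struct

BLOCK = 64  # SHA-256 block size in bytes


def padded_tail(message: bytes) -> bytes:
    """Return the final one or two padded blocks for `message`.

    Everything before the tail consists of whole blocks that can be hashed
    directly from the caller's buffer without copying.
    """
    full = len(message) - len(message) % BLOCK  # bytes covered by whole blocks
    tail_len = len(message) - full              # 0..63 leftover bytes

    buf = bytearray(2 * BLOCK)                  # fixed-size scratch, zero-filled
    buf[:tail_len] = message[full:]
    buf[tail_len] = 0x80                        # the mandatory single 1 bit

    # The 8-byte length field needs the tail to end at or before byte 55;
    # otherwise the padding spills into a second block.
    used = BLOCK if tail_len < 56 else 2 * BLOCK
    buf[used - 8:used] = struct.pack(">Q", len(message) * 8)
    return bytes(buf[:used])
```

For example, a 3-byte message yields a single 64-byte block, while a 56-byte message needs 128 bytes because the length field no longer fits after the `0x80` marker.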