Basalt: ML Framework
GitHub: basalt-org/basalt - A Machine Learning framework from scratch in Pure Mojo 🔥
by @benny, @Stijn, @NKspartan, and @fnands
Well, mostly the first three with some small contributions by me 🙃
Making steady progress towards another release; we are always looking for more people interested in helping move the project along :)
Great. Working on implementing some activation functions (draft PR open). Are there any I should prioritize?
Also, is there currently a place to write the tests for the backwards passes of the activation functions?
Awesome @Yosi Frost
I don't think we have any priorities for specific activation functions; whatever you can think of works. Tests for activation functions should go in /tests/mojo/test_activations.mojo
Great! Is tests/test_activations.mojo just for forwards or does it also include tests for the backwards pass?
it should also include backwards tests
Ok. I must have missed them. Will write those tests as well. Thank you!
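For context, a backward test for an activation usually just checks the analytic gradient against a finite-difference estimate. Below is a minimal Mojo sketch of that idea only; it is not Basalt's test harness, and relu / relu_backward here are stand-in scalar implementations rather than Basalt's Tensor-based ops.

```mojo
from testing import assert_almost_equal

# Stand-in scalar implementations, for illustration only.
fn relu(x: Float64) -> Float64:
    if x > 0:
        return x
    return 0.0

fn relu_backward(x: Float64, upstream: Float64) -> Float64:
    if x > 0:
        return upstream
    return 0.0

fn test_relu_backward() raises:
    alias eps = 1e-5
    var xs = List[Float64](-2.0, -0.5, 0.5, 3.0)
    for i in range(len(xs)):
        var x = xs[i]
        # Finite-difference estimate of d relu(x) / dx.
        var numeric = (relu(x + eps) - relu(x - eps)) / (2 * eps)
        assert_almost_equal(relu_backward(x, 1.0), numeric, atol=1e-6)

fn main() raises:
    test_relu_backward()
    print("relu backward ok")
```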
I'm interested in dataloading and made a comment in https://github.com/basalt-org/basalt/issues/90#issuecomment-2127550998. I agree with what was posted on one of the other channels about not licking the cookie, though. I'm curious what frameworks people have used for dataloading / what they like / don't like.
To me, I think Mojo still needs Iterable / Gettable traits to even make transforms/pipes possible to prototype.
You could easily make both of those traits now with current Mojo; I'm not sure I fully understand the question
I think there are still features needed (?). I saw this thread: https://discord.com/channels/1087530497313357884/1224434323193594059/1238338296699158598 which gave me the impression it's not possible yet
while that is correct, because of how Basalt works right now, dtype is accessible for any module, so you don’t need a generic trait to return a Scalar[dtype]
but i’m not sure if this would change for your use case
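To illustrate the point being made here: a trait with a fixed element type is already expressible in current Mojo; what isn't yet possible is parameterizing the trait itself on a dtype (i.e. returning a generic Scalar[dtype]), which is the gap Basalt's module-level dtype alias covers. A sketch under those assumptions, with names that are illustrative rather than Basalt's API:

```mojo
# Illustrative only, not Basalt's API. The element type is fixed to
# Float32 because Mojo traits cannot yet be parameterized on a dtype.
trait Gettable:
    fn __getitem__(self, idx: Int) -> Float32:
        ...

    fn size(self) -> Int:
        ...


struct ConstantSource(Gettable):
    var value: Float32
    var length: Int

    fn __init__(inout self, value: Float32, length: Int):
        self.value = value
        self.length = length

    fn __getitem__(self, idx: Int) -> Float32:
        return self.value

    fn size(self) -> Int:
        return self.length


fn total[T: Gettable](source: T) -> Float32:
    # Generic over any conforming source; this works because the trait
    # fixes the element type up front.
    var acc: Float32 = 0
    for i in range(source.size()):
        acc += source[i]
    return acc


fn main():
    var src = ConstantSource(2.5, 4)
    print(total(src))  # 10.0
```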
Out of curiosity, is the source code for basalt based on any other frameworks or whitepapers or is it just from first principles? I noticed it has vague similarities to tinygrad but not enough to be recognizable as a port
There are definitely some influences from other frameworks, but it's also very much tailored to whatever Mojo allows. You should recognize stuff from PyTorch, tinygrad, and MLX; those are probably the ones that are most looked at
Giving a talk at today's community meeting if anyone is interested; the video will be posted after :)
About that comparison: is that PyTorch running on Python, or PyTorch running on Mojo?
I mean, if Basalt is 2x slower than PyTorch while running on a language that's at least 5-10 times faster, I'd avoid it for now.
(I know there is a lot of improvement coming, but still)
I assume PyTorch on Python, but that shouldn't dissuade you.
The vast majority of heavy lifting in PyTorch is offloaded to faster languages. It has had far longer to make optimizations, was originally made by Facebook, and has way more contributors.
Basalt is heavily disadvantaged in this comparison.
PyTorch on Python. If you want a stable framework, use PyTorch; nobody is offended. I can promise you, however, that in 6-12 months your model will run at almost exactly the same speed as it does today. I cannot say the same for Basalt
Isn't core PyTorch written in C and Fortran, and optimised for more than a decade now?
No
PyTorch is a first-class binding to libtorch, which is a C++ library.
And it is somewhat optimised, but bloated, and the code quality isn't great...
That's why there are some other solutions that outperform it
@benny , I am really impressed with the performance comparison between Basalt and PyTorch. Congratulations on the achievement!
I'm curious about how you guys managed to reach such impressive performance levels. I noticed that Basalt includes some advanced tensor routines. Are these routines, such as the matmul, as performant as PyTorch's torch.matmul?
I'm asking because I just had the sort of hilarious insight that the simple vectorized matmul routine I wrote for my KAN experiments is around 100 times slower than torch.matmul.
🤯
Hey Martin, thanks 😊
It was kind of a joint effort among all the contributors, but it took probably 100 hours and a bunch of failed attempts. We used a mix of old and new research papers, plus some novel ideas, to try to block it more efficiently.
That being said, there is rumored to be a ~3000-line kernel used in the MAX engine from the Modular team that's even faster (closed source, and unrealistic for us to implement at the moment)
If you want more details you're welcome to DM me and I can explain some of the nuances, but most of the info can be found with a couple of Google Scholar searches
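For anyone curious what the "blocking" mentioned above refers to: the matmul loops are tiled so a small block of A and B stays hot in cache while it is reused, instead of streaming the full matrices on every pass. Below is a bare-bones Mojo sketch of the idea only, with no vectorization or parallelism and nothing resembling Basalt's actual kernel; the flat row-major List storage, the tile size, and the function names are illustrative assumptions.

```mojo
alias TILE = 64  # illustrative tile size; tuned per CPU in practice

# C (M x N) += A (M x K) @ B (K x N), all stored row-major in flat lists.
# For brevity this sketch assumes M, N, and K are multiples of TILE.
fn matmul_blocked(
    inout C: List[Float32],
    A: List[Float32],
    B: List[Float32],
    M: Int, N: Int, K: Int,
):
    for i0 in range(0, M, TILE):
        for k0 in range(0, K, TILE):
            for j0 in range(0, N, TILE):
                # Work on one TILE x TILE block at a time so the
                # reused values of A and B stay in cache.
                for i in range(i0, i0 + TILE):
                    for k in range(k0, k0 + TILE):
                        var a = A[i * K + k]
                        for j in range(j0, j0 + TILE):
                            C[i * N + j] = C[i * N + j] + a * B[k * N + j]

fn main():
    alias M = 128
    alias N = 128
    alias K = 128
    var A = List[Float32](capacity=M * K)
    var B = List[Float32](capacity=K * N)
    var C = List[Float32](capacity=M * N)
    for i in range(M * K):
        A.append(1.0)
    for i in range(K * N):
        B.append(1.0)
    for i in range(M * N):
        C.append(0.0)
    matmul_blocked(C, A, B, M, N, K)
    print(C[0])  # 128.0
```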