llm.mojo: GPT2 fine-tuning and inference in a single Mojo file

GitHub - dorjeduck/llm.mojo: port of Andrej Karpathy's llm.c to Mojo
Port of Andrej Karpathy's llm.c to Mojo. Contribute to dorjeduck/llm.mojo development by creating an account on GitHub.
15 Replies
Jack Clayton (OP) 10mo ago
by @Martin Dudek
Martin Dudek 3mo ago
Just updated it to Mojo 24.6. A bit of a ride, as it was on 24.4, but mostly straightforward DTypePointer -> UnsafePointer transitions plus adding or changing various imports. I have no further plans for this project, but it's nice to have it at least updated to the latest Mojo version.
Robert 3mo ago
Really cool project, Martin. A friend of mine is actually working on the original llm.c project with that Karpathy guy. Cool stuff.
Robert 3mo ago
So, uh, did anyone take notice of @Martin Dudek's llm.mojo fork of the main llm.c project?
[attachment: no description]
Martin Dudek 3mo ago
Thanks @Robert. This is a 6-month-old project, and it's actually mentioned on the Mojo language intro page, https://www.modular.com/mojo, next to much cooler projects like Endia, Basalt and LightbugHTTP. Well, the blessing of the name Andrej Karpathy got it there, I guess 😂
Mojo 🔥: Programming language for all of AI
Mojo combines the usability of Python with the performance of C, unlocking unparalleled programmability of AI hardware and extensibility of AI models.
Martin Dudek 3mo ago
Porting from C to Mojo is actually, at least for me, much easier than porting from Python. When Karpathy published llm.c I had time to make this port, and early on he kindly added it to the notable ports section on the llm.c GitHub page. He seems to be a really nice guy and I am a big fan of the educational stuff he puts on YouTube, so it was a pleasure for me to do this project.
Robert 3mo ago
I don't doubt it. My internet bro (lol) is 1 of the 3 main developers on the llm.c repo. So you don't plan on continuing this project? Can we fork the project and port it to something else? There is a ton of stuff going on in the Modular stack. Hope I can build or contribute to something in the stack half as cool.
Martin Dudek 3mo ago
There are plenty of ports, did you see https://github.com/karpathy/llm.c?tab=readme-ov-file#notable-forks ? Sure, feel free to fork it and do what you want; it is there for the community to play around with. Following Andrej, I published it under the MIT license, which should also formally give you all the freedom you want. After the first implementation I did not really touch it much anymore, except to make sure it runs with new stable Mojo versions. I am sure the code could be refined, but for me it's basically a proof-of-concept project. If you want to port to another language, I would highly recommend just going with the original C version. If you are a Rust guy, that port looks very solid to me and it's fast too. I haven't looked at any of the other ports.
GitHub - karpathy/llm.c: LLM training in simple, raw C/CUDA
LLM training in simple, raw C/CUDA. Contribute to karpathy/llm.c development by creating an account on GitHub.
Robert 3mo ago
@Martin Dudek I'm pretty keen on continuing your project for Mojo. I don't see much interest, but I could work on it on the side.
Martin Dudek 3mo ago
Great. Curious what you make out of it. I don't rule out that I might pick it up again at some point - after all, it's my Mojo one-hit wonder 😀 - but as of now I don't really see any interesting way to improve it without diverging significantly from the idea of the original llm.c. Please drop me a line when you publish something.
Robert 3mo ago
I am really on the ML/DL side of the space, which, from what I understand, is kind of what the llm.c project is about. But I will take a look at what you built out with the llm.mojo project and see how it goes. It is the holidays, so maybe some time in the new year.
Robert 3mo ago
@Martin Dudek Hey Martin, so I've been bored over the holidays. Check it out.
[attachment: no description]
Martin Dudek 3mo ago
I checked it out and came to the conclusion that you didn't read https://docs.modular.com/magic/ 😉 It's 'magic run ...', not 'mojo run ...'. Or run 'magic shell' first, and then you are in an 'env' in which you can run the 'mojo' command.
Get started with Magic | Modular Docs
Magic is a package manager and virtual environment manager for any language,
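A minimal sketch of the two workflows Martin describes, assuming you are in the project directory and that Magic is installed. The file name train_gpt2.mojo is an assumption for illustration, not something confirmed in this thread; substitute whichever Mojo file you want to run.

```shell
# Sketch only: "train_gpt2.mojo" is an assumed file name, not confirmed in this thread.
MOJO_FILE="train_gpt2.mojo"

# Option 1: run a single command inside the project environment.
RUN_CMD="magic run mojo $MOJO_FILE"

# Option 2: open a shell in the environment first, then call mojo directly:
#   magic shell
#   mojo train_gpt2.mojo

if command -v magic >/dev/null 2>&1; then
    # Magic is available, so run the file inside the environment.
    $RUN_CMD
else
    # Magic is not on PATH; only report what would have been run.
    echo "magic not found; would run: $RUN_CMD"
fi
```

Calling mojo directly only works after 'magic shell' has put you inside the environment; outside of it, prefix the command with 'magic run' as in option 1.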
Robert 3mo ago
@Martin Dudek Yo, can I ping you? I understand you may be busy because of the holidays. On a side note, the model does work; I just didn't take any screenshots. I was pretty much just benchmarking it at the moment. I ran the test script as well. Just had questions.
Martin Dudek 3mo ago
Sure, feel free to ask questions; I will reply whenever I find time. I probably won't feel like digging into all the details of this 6-month-old project, but I can help you with more fundamental questions about the Mojo aspect of it if you get stuck. As for understanding llm.c, you said you have a buddy who is deeply involved in it, so you'd better ask him.
