Mojo in LLMs?
Hello, guys! I am currently investigating the capabilities of Mojo, especially in the context of pretraining large-scale models. Does anyone know if Mojo supports this kind of extensive pretraining? I am also looking for any recent research or case studies on using Mojo to pretrain large models. Benchmarks or comparisons with other frameworks would be particularly helpful. If you've had experience with Mojo in this area, or know of resources that might point me in the right direction, I would greatly appreciate your insights. 👀
It doesn't even support recursion yet 😅, so no, Mojo is nowhere near the point where you could do large-scale LLM pretraining. AFAIK the only stack that can realistically do that today is Python backed by C++.
I have to write a 5-6 page paper for college on something related to transformers, and my idea was to compare an implementation of one in Python and Mojo.
Is this possible by now?
I thought about using a mini LLM and testing it locally on my MacBook.
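The Python half of that comparison could be as small as a single-head attention kernel plus a timing loop, which you could then port to Mojo and time the same way. Here's a rough sketch of what I mean (the function name, sizes, and timing setup are just my own illustration, nothing Mojo-specific):

```python
# Minimal sketch: a pure-NumPy scaled dot-product attention to time as the
# Python baseline, then port the same function to Mojo and compare.
import time
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Single-head attention: softmax(QK^T / sqrt(d)) V."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)                   # (seq, seq) similarity scores
    scores -= scores.max(axis=-1, keepdims=True)    # subtract row max for stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the key axis
    return weights @ v                              # (seq, d) attention output

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    seq_len, d_model = 256, 64                      # toy sizes that fit on a laptop
    q, k, v = (rng.standard_normal((seq_len, d_model)) for _ in range(3))

    # Warm up once, then average a handful of runs.
    scaled_dot_product_attention(q, k, v)
    n_runs = 100
    start = time.perf_counter()
    for _ in range(n_runs):
        scaled_dot_product_attention(q, k, v)
    elapsed = (time.perf_counter() - start) / n_runs
    print(f"attention ({seq_len}x{d_model}): {elapsed * 1e3:.3f} ms per call")
```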