Nicolay

What advantages does Mojo bring for large models (Whisper, LLMs)?

I would love to get a feel for how to best use Mojo. I have a bunch of large models deployed and in training at the moment. What benefits would I get from migrating pre-trained models to Mojo? What are the benefits for pre-training, fine-tuning, PEFT, and deployment? And if you want to take full advantage of the AI engine, how would you go about migrating the code (mostly implemented with Hugging Face)? Also feel free to link to any really interesting examples you have seen 🙂 Thanks a lot!
4 Replies
Heyitsmeguys
Heyitsmeguys•15mo ago
In terms of examples, there's llama2.mojo (https://github.com/tairov/llama2.mojo) which performs the fastest inference on CPU, even faster than llama2.c
Ryulord
Ryulord•15mo ago
Mojo is still in a very early state. You can't use GPUs yet, and there isn't much of an ecosystem aside from what you can pull in from Python. Part of the reason there's no ecosystem yet is that major language features like traits are still missing, so it'll be a while before doing things in native Mojo makes sense. If you call out to HF, you'll just get the same performance you'd get in Python, because at that point you're just running Python code.
Nicolay
NicolayOP•15mo ago
@Ryulord Thanks for the response. At the moment I'm just playing around with Mojo and thinking about future use cases running things on edge devices, so I'd prefer being able to run on the CPU. How realistic do you think it would be to train PEFT models on the CPU?
Ryulord
Ryulord•14mo ago
Training PEFT models would definitely be doable, but it could be quite slow depending on which model you have in mind. You could try a PEFT training run with Hugging Face on CPU to get a general idea of how bad it is.
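As a rough baseline, a CPU-only LoRA run with the Hugging Face transformers, datasets, and peft libraries could look something like the sketch below. The model name, dataset, and hyperparameters are placeholders chosen so the run finishes quickly, not recommendations; the point is just to watch the logged steps/sec and judge whether CPU-only PEFT is tolerable for your models.

```python
# Minimal sketch of a CPU-only LoRA (PEFT) training run with Hugging Face.
# Model, dataset, and hyperparameters are placeholders -- swap in whatever
# matches the edge workload you actually care about.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)
from peft import LoraConfig, get_peft_model

model_name = "gpt2"  # small placeholder checkpoint so the CPU run finishes quickly
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)  # stays on CPU by default

# Wrap the base model with LoRA adapters so only a small set of weights is trained.
lora_config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                         target_modules=["c_attn"],  # GPT-2's attention projection
                         task_type="CAUSAL_LM")
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# Tiny slice of a public dataset -- just enough to time a few optimizer steps.
dataset = load_dataset("wikitext", "wikitext-2-raw-v1", split="train[:1%]")
dataset = dataset.filter(lambda ex: len(ex["text"].strip()) > 0)
tokenized = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=128),
    batched=True, remove_columns=dataset.column_names)

args = TrainingArguments(
    output_dir="peft-cpu-test",
    per_device_train_batch_size=1,
    max_steps=20,    # a handful of steps is enough to estimate throughput
    logging_steps=5,
    use_cpu=True,    # force CPU even if a GPU is present (older versions: no_cuda=True)
)
trainer = Trainer(model=model, args=args, train_dataset=tokenized,
                  data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False))
trainer.train()  # the logged steps/sec gives you the ballpark you're asking about
```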