RunPod · 4mo ago
Yasmin

Llama

Hello! For those who have tried: how much GPU memory is needed for inference only, and for fine-tuning, of Llama 70B? How about inference with the 400B version (for knowledge distillation)? Is the quality difference worth it? Thanks!
Solution
Kanoi · 4mo ago
Only for inference:
[image attachment]
Kanoi · 4mo ago
And then to fine-tune the model you need roughly 4x the memory. Hope that helps!
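A rough back-of-the-envelope sketch of that rule of thumb (assuming fp16 weights, so 2 bytes per parameter, and the ~4x multiplier above for full fine-tuning to cover gradients and optimizer states; actual usage also depends on activations, KV cache, and batch size):

```python
def estimate_vram_gb(num_params_billions: float,
                     bytes_per_param: int = 2,
                     fine_tune: bool = False) -> float:
    """Rough VRAM estimate in GB for loading a model.

    1B params at 2 bytes/param (fp16) ~= 2 GB just for weights.
    Full fine-tuning roughly quadruples this (weights + gradients
    + Adam optimizer states).
    """
    gb = num_params_billions * bytes_per_param
    if fine_tune:
        gb *= 4
    return gb

print(estimate_vram_gb(70))                    # 70B fp16 inference: ~140 GB
print(estimate_vram_gb(70, fine_tune=True))    # full fine-tune: ~560 GB
print(estimate_vram_gb(405))                   # Llama 3.1 405B fp16: ~810 GB
```

By this estimate, 70B inference in fp16 needs ~2x 80 GB GPUs (or one with 4-bit quantization), while the 405B model needs a multi-node or heavily quantized setup.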
Yasmin (OP) · 4mo ago
This is so helpful. Many thanks!