RunPod · 2mo ago
Yasmin

Llama

Hello! For those who tried, how much GPU is needed for inference only, and for fine-tuning of Llama 70B? How about the inference of the 400B version (for knowledge distillation)? Is the quality difference worth it? Thanks!
3 Replies
Solution
Kanoi · 2mo ago
Only for inference:
[attached image, no description available]
Kanoi · 2mo ago
And then to fine-tune the model, you need 4x the memory needed for inference. Hope this helps.
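A rough way to turn that 4x rule of thumb into numbers, as a minimal sketch. It assumes 16-bit weights (2 bytes per parameter) and counts the weights only, not the KV cache or activations. The function names and the round 400B parameter count are illustrative assumptions, and real usage will vary with precision, batch size, context length, and fine-tuning method.

```python
def weights_gb(n_params: float, bytes_per_param: float = 2.0) -> float:
    """Memory for the model weights alone, in GB (1 GB = 1e9 bytes).

    bytes_per_param=2.0 assumes fp16/bf16 weights (an assumption, not
    something stated in the thread).
    """
    return n_params * bytes_per_param / 1e9


def finetune_gb(n_params: float, bytes_per_param: float = 2.0,
                multiplier: float = 4.0) -> float:
    """Fine-tuning estimate using the 4x rule of thumb from the reply above."""
    return multiplier * weights_gb(n_params, bytes_per_param)


if __name__ == "__main__":
    # Parameter counts are the round figures used in the question.
    for name, n_params in [("70B", 70e9), ("400B", 400e9)]:
        print(f"{name}: inference ~{weights_gb(n_params):.0f} GB, "
              f"fine-tuning ~{finetune_gb(n_params):.0f} GB")
```

Divide the result by the memory of a single GPU to get a lower bound on how many cards are needed. Quantized weights (for example 4-bit, about 0.5 bytes per parameter) can be modeled by changing `bytes_per_param`.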
Yasmin (OP) · 2mo ago
This is so helpful. Many thanks!