Can't run a 70B Llama 3.1 model on 2x A100 80 GB GPUs.
Hey, so I tried running the 70B Llama model on 2 GPUs/worker, but it keeps getting stuck at the same place every time. If I instead switch to the 8B model on 1 GPU/worker with a 48 GB GPU, it works easily. The issue only comes up with the 70B-parameter model on 2 GPUs/worker.
Maybe 70B needs 192 GB or smth like that
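For a rough sanity check (a back-of-envelope sketch, not official numbers), the weights alone at fp16 are already around 140 GB before any KV cache or activation overhead:

```python
# Back-of-envelope VRAM estimate for Llama 3.1 70B.
# These are assumptions for illustration, not official requirements.
params = 70.6e9          # ~70.6B parameters
bytes_per_param = 2      # fp16 / bf16

weights_gb = params * bytes_per_param / 1e9
print(f"weights alone: ~{weights_gb:.0f} GB")  # ~141 GB

# KV cache and activations add more on top, so 2x 80 GB = 160 GB
# is workable but tight, and 192 GB leaves more headroom.
```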
RunPod Blog
Run Larger LLMs on RunPod Serverless Than Ever Before - Llama-3 70B...
Up until now, RunPod has only supported using a single GPU in Serverless, with the exception of using two 48GB cards (which honestly didn't help, given the overhead involved in multi-GPU setups for LLMs). You were effectively limited to what you could fit in 80GB, so you would essentially be…
This blog post says that 2x 80 GB are enough
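If it helps, here's a minimal sketch of what loading the model across 2 GPUs looks like with vLLM's tensor parallelism. This assumes vLLM is the backend (the actual serverless worker may differ), and the model name and settings are illustrative:

```python
from vllm import LLM, SamplingParams

# Shard the 70B weights across 2 GPUs via tensor parallelism.
# A sketch under the assumption of a vLLM backend, not a confirmed setup.
llm = LLM(
    model="meta-llama/Meta-Llama-3.1-70B-Instruct",
    tensor_parallel_size=2,   # split weights across both 80 GB cards
    dtype="bfloat16",
)

out = llm.generate(["Hello"], SamplingParams(max_tokens=16))
print(out[0].outputs[0].text)
```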
yeah I'm not sure about the minimum requirements, let me check
alright, also how much network volume do you think I need for this?
maybe around 150 GB~
alright thanks
let me know about the requirements
can you try other GPUs, 4x?
alr lemme try that
4090?
4x 48 GB
srry*
ok
np
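For what it's worth, 4x 48 GB comes to 192 GB total, which lines up with the rough estimate above. In a setup like the vLLM sketch earlier, that would just mean `tensor_parallel_size=4` (still assuming that backend).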
It got stuck here again
It always gets stuck at the same place
What do you think could be the problem @nerdylive
It went a bit further now
and now it just shifted to a different worker
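One debugging idea, in case the hang is during the multi-GPU startup: stalls while loading a tensor-parallel model are often NCCL communication issues, and NCCL can be told to log what each rank is doing (this has to be set before the model loads):

```python
import os

# NCCL_DEBUG=INFO is a standard NCCL env var; it makes each rank print
# its communication setup, which can reveal where a multi-GPU load stalls.
# Must be set before the engine/model is initialized.
os.environ["NCCL_DEBUG"] = "INFO"
```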