RunPod · 11mo ago
octopus

Plans to support 400B models like Llama 3?

Is RunPod thinking about how they will support very large LLMs, like the 400B Llama model that is expected to release later this year?
8 Replies
Jason · 11mo ago
If it's supported by vLLM, then sure, it will work. Or if there is code that can run it, then yes, that will also work.
digigoblin · 11mo ago
I doubt 2 x 80GB are sufficient to load a 400B model.
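For context, a minimal back-of-the-envelope sketch of the weight memory, assuming FP16 (2 bytes per parameter) and ignoring KV cache and activations:

```python
# Rough VRAM estimate for model weights only (ignores KV cache, activations,
# and framework overhead). Illustrative numbers, not RunPod specifics.

def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed just to hold the weights, in GB."""
    return num_params * bytes_per_param / 1e9

params = 400e9                           # a 400B-parameter model
fp16_gb = weight_memory_gb(params, 2)    # 2 bytes per parameter at FP16/BF16
available_gb = 2 * 80                    # two 80 GB GPUs

print(f"FP16 weights: ~{fp16_gb:.0f} GB vs {available_gb} GB available")
# FP16 weights: ~800 GB vs 160 GB available -> does not fit
```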
Jason · 11mo ago
I'm also not sure what GPU limits RunPod will allow in the future.
digigoblin · 11mo ago
I see that for the 48GB tier, you can have up to 10 GPUs per worker, which is cool.
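For scale, a minimal sketch of the pooled VRAM on that tier, assuming 10 GPUs at 48 GB each:

```python
# Pooled VRAM for a single worker on the 48 GB tier (illustrative arithmetic).
gpus_per_worker = 10
vram_per_gpu_gb = 48
total_gb = gpus_per_worker * vram_per_gpu_gb
print(f"Total VRAM per worker: {total_gb} GB")
# 480 GB total, still short of the ~800 GB needed for 400B FP16 weights
```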
Jason · 11mo ago
Great. That should work in serverless too, then.
digigoblin · 11mo ago
I am referring to serverless.
Jason · 11mo ago
Yep, I see.
Alpay Ariyak · 11mo ago
We're pretty far from the 400B release AFAIK; limits will likely be different then.