RunPod2mo ago
octopus

Plans to support 400B models like llama 3?

Is RunPod thinking about how it will support very large LLMs like the 400B Llama model that is expected to release later this year?
8 Replies
nerdylive
nerdylive2mo ago
If it's supported by vLLM then sure, it will work. Or if there is code that can run it, then yes, that will also work.
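(Not from the thread, just an illustration of what "supported by vLLM" usually means in practice: a minimal sketch of loading a large model sharded across a worker's GPUs with tensor parallelism. The model name and GPU count are placeholders, not RunPod settings.)
```python
# Minimal vLLM sketch: shard a large model across multiple GPUs with tensor parallelism.
# Model name and tensor_parallel_size are illustrative placeholders only.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Meta-Llama-3-70B-Instruct",  # placeholder; swap in the 400B checkpoint when it exists
    tensor_parallel_size=8,                        # split the weights across 8 GPUs on the worker
)

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["What does it take to serve a 400B model?"], params)
print(outputs[0].outputs[0].text)
```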
digigoblin
digigoblin2mo ago
I doubt 2 x 80GB are sufficient to load a 400B model.
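(Rough back-of-envelope math behind that doubt, my numbers rather than the thread's: 400B parameters at 16-bit precision are about 800 GB of weights alone, before KV cache and runtime overhead.)
```python
# Back-of-envelope GPU memory estimate for serving a dense 400B-parameter model.
# Assumptions (illustrative, not measured): fp16/bf16 weights, ~20% extra for KV cache and overhead.
params_billion = 400
bytes_per_param = 2            # fp16 / bf16
overhead_factor = 1.2          # rough allowance for KV cache, activations, CUDA context

weights_gb = params_billion * 1e9 * bytes_per_param / 1e9   # ~800 GB of weights
total_gb = weights_gb * overhead_factor                     # ~960 GB in practice

gpus_80gb = -(-total_gb // 80)   # ceiling division: number of 80 GB GPUs needed
print(f"~{weights_gb:.0f} GB weights, ~{total_gb:.0f} GB total, "
      f"roughly {int(gpus_80gb)} x 80 GB GPUs")
```
So 2 x 80 GB is far short without aggressive quantization; something on the order of a dozen 80 GB GPUs would be needed for fp16 serving.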
nerdylive
nerdylive2mo ago
I'm also not sure how high RunPod will set the GPU limits in the future.
digigoblin
digigoblin2mo ago
I see that for the 48GB tier, you can have up to 10 GPUs per worker, which is cool.
nerdylive
nerdylive2mo ago
Great, that should work in serverless too then.
digigoblin
digigoblin2mo ago
I am referring to serverless.
nerdylive
nerdylive2mo ago
Yep, I see.
Alpay Ariyak
Alpay Ariyak2mo ago
We're pretty far from the 400B release afaik; the limits will likely be different by then.