RunPod (7mo ago)
octopus

Plans to support 400B models like llama 3?

Is RunPod thinking about how they will support very large LLMs like the 400B Llama model that is expected to release later this year?
8 Replies
nerdylive (7mo ago)
If it's supported by vLLM then sure, it will work. Or if there is code that can run it, then yes, that will also work.
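For context, a minimal sketch of what "supported by vLLM" looks like in practice: loading a model sharded across several GPUs on one worker via tensor parallelism. The model name and GPU count here are placeholders, not a confirmed recipe for a 400B deployment or for RunPod specifically.

```python
# Hypothetical multi-GPU vLLM load; model and tensor_parallel_size are placeholders.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Meta-Llama-3-70B-Instruct",  # a 400B checkpoint would go here once released
    tensor_parallel_size=8,                        # shard weights across 8 GPUs on the worker
    dtype="bfloat16",
)

params = SamplingParams(max_tokens=64, temperature=0.7)
outputs = llm.generate(["Hello, which GPUs am I running on?"], params)
print(outputs[0].outputs[0].text)
```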
digigoblin (7mo ago)
I doubt 2 x 80GB is sufficient to load a 400B model.
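Back-of-the-envelope math behind that doubt (a rough sketch, not an official sizing guide): at 16-bit precision the weights alone need about 2 bytes per parameter, before any KV cache or activation overhead.

```python
# Why 2 x 80GB is unlikely to hold a 400B model: weights-only estimate.
params_count = 400e9            # 400B parameters
bytes_per_param = 2             # fp16 / bf16
weights_gb = params_count * bytes_per_param / 1e9
print(f"~{weights_gb:.0f} GB just for weights")    # ~800 GB
print(f"2 x 80 GB = {2 * 80} GB available")        # 160 GB, far short
# Even 4-bit quantization (~0.5 bytes/param) would still need roughly 200 GB.
```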
nerdylive (7mo ago)
I'm also not sure how high RunPod will set the GPU limits in the future.
digigoblin (7mo ago)
I see that for the 48GB tier, you can have up to 10 GPUs per worker, which is cool.
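Continuing the same rough arithmetic for that tier (a sketch under the same assumptions, not official capacity guidance): 10 x 48 GB still falls short of fp16 weights for 400B, though a quantized checkpoint could plausibly fit.

```python
# Does a 400B model fit on a 10 x 48GB worker?
total_gb = 10 * 48                          # 480 GB per worker
fp16_weights_gb = 400e9 * 2 / 1e9           # ~800 GB -> does not fit in fp16
int4_weights_gb = 400e9 * 0.5 / 1e9         # ~200 GB -> could fit if 4-bit quantized
print(total_gb, fp16_weights_gb, int4_weights_gb)
```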
nerdylive (7mo ago)
Great, that should work in serverless too then.
digigoblin (7mo ago)
I am referring to serverless.
nerdylive (7mo ago)
Yep, I see.
Alpay Ariyak (7mo ago)
We're pretty far from the 400B release afaik, limits will likely be different by then.