RunPod · 11mo ago
octopus

Plans to support 400B models like Llama 3?

Is RunPod thinking about how they will support very large LLMs, like the 400B Llama model that is expected to release later this year?
8 Replies
Jason · 11mo ago
If it's supported by vLLM, then sure, it will work. Or if there is code that can run it, then yes, that will also work.
digigoblin · 11mo ago
I doubt 2 x 80GB are sufficient to load a 400B model.
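For context, a minimal back-of-the-envelope sketch of the weight memory, assuming FP16 (2 bytes per parameter) and ignoring KV cache and activations:

```python
# Rough VRAM estimate for model weights only (ignores KV cache, activations,
# and framework overhead). Illustrative numbers, not RunPod specifics.

def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed just to hold the weights, in GB."""
    return num_params * bytes_per_param / 1e9

params = 400e9                           # a 400B-parameter model
fp16_gb = weight_memory_gb(params, 2)    # 2 bytes per parameter at FP16/BF16
available_gb = 2 * 80                    # two 80 GB GPUs

print(f"FP16 weights: ~{fp16_gb:.0f} GB vs {available_gb} GB available")
# FP16 weights: ~800 GB vs 160 GB available -> does not fit
```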
Jason · 11mo ago
I'm also not sure what GPU limits RunPod will allow in the future.
digigoblin · 11mo ago
I see that for the 48GB tier, you can have up to 10 GPUs per worker, which is cool.
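For scale, a minimal sketch of the pooled VRAM on that tier, assuming 10 GPUs at 48 GB each:

```python
# Pooled VRAM for a single worker on the 48 GB tier (illustrative arithmetic).
gpus_per_worker = 10
vram_per_gpu_gb = 48
total_gb = gpus_per_worker * vram_per_gpu_gb
print(f"Total VRAM per worker: {total_gb} GB")
# 480 GB total, still short of the ~800 GB needed for 400B FP16 weights
```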
Jason · 11mo ago
Great. That should work in serverless too, then.
digigoblin · 11mo ago
I am referring to serverless.
Jason · 11mo ago
Yep, I see.
Alpay Ariyak · 11mo ago
We're pretty far from the 400B release AFAIK; limits will likely be different then.