H100 NVL
If I've understood the docs correctly, the H100 NVL is not available on serverless. Are there any plans to bring it to serverless? The extra 14GB of VRAM over the other GPUs is pretty useful for 70(ish)B-parameter LLMs.
4 Replies
You can try 4×48GB GPUs.
I'm specifically interested in an 8-bit quant of Qwen2.5 72B, which uses about 77GB of VRAM, leaving very little headroom on a single 80GB GPU.
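For rough sizing, weight memory scales as parameters × bits ÷ 8, plus some overhead for the KV cache and activations. A minimal back-of-the-envelope sketch (the flat 5GB overhead allowance is my assumption for illustration, not a measured figure):

```python
# Back-of-the-envelope VRAM estimate for a quantized LLM.
# Weights alone for a 72B model at 8 bits/param are ~72GB;
# the 5GB overhead allowance (KV cache, activations) is an
# illustrative assumption, not a measured value.

def estimate_vram_gb(n_params_b: float, bits_per_param: int,
                     overhead_gb: float = 5.0) -> float:
    """Estimate VRAM in GB: weight bytes plus a flat overhead allowance."""
    weight_gb = n_params_b * bits_per_param / 8  # 1B params at 8 bits ≈ 1GB
    return weight_gb + overhead_gb

print(estimate_vram_gb(72, 8))  # ~77GB for an 8-bit 72B model
print(estimate_vram_gb(72, 4))  # a 4-bit quant roughly halves the weight footprint
```

This lines up with the 77GB figure above and shows why an 80GB card is tight: real headroom also depends on context length, since the KV cache grows with it.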
I estimated 2× RTX 6000 Ada to be the cheapest option, but I can see that the 48GB PRO tier lists three GPUs: L40, L40S, and RTX 6000 Ada. Is there any way to pick which one to use, or is the allocation just random?
Yes, you can.
Ah OK, I didn't really expect that option to be there.
Thanks a lot!