RunPod · 3mo ago
baldo

H100 NVL

If I've understood the docs correctly, H100 NVL is not available on serverless. Are there any plans to bring it to serverless? The extra 14GB of VRAM over the other GPUs is pretty useful for 70(ish)B parameter LLMs.
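For context, a back-of-envelope sketch of the VRAM arithmetic behind that claim (illustrative only: real deployments also need room for quantization scales, activations, and the KV cache):

```python
# Rough weight-memory estimate for an 8-bit ~70B-parameter model.
params_billion = 70
bytes_per_param = 1  # 8-bit quantization ~= 1 byte per parameter
weights_gb = params_billion * bytes_per_param  # ~70 GB just for weights

# H100 NVL (94GB) vs. a standard 80GB card: the extra 14GB roughly
# doubles what's left over for KV cache and runtime overhead.
for card, vram_gb in [("80GB GPU", 80), ("H100 NVL 94GB", 94)]:
    print(f"{card}: ~{vram_gb - weights_gb} GB headroom")
```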
4 Replies
yhlong00000 · 3mo ago
you can try 4×48GB (four 48GB GPUs)
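That gives 192GB of VRAM in total. A minimal sketch of what that looks like with vLLM's tensor parallelism (the model name and quantization setting below are illustrative, not from the thread):

```python
from vllm import LLM

# Shard the model across 4 GPUs: each card holds ~1/4 of the weights,
# so a ~72GB 8-bit model needs only ~18GB of weights per 48GB card.
llm = LLM(
    model="Qwen/Qwen2.5-72B-Instruct",
    quantization="fp8",      # ~1 byte/param -> ~72 GB of weights total
    tensor_parallel_size=4,  # split across four GPUs
)

outputs = llm.generate("Hello")
print(outputs[0].outputs[0].text)
```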
baldo (OP) · 3mo ago
I'm specifically interested in an 8-bit quant of Qwen2.5 72B, which uses 77GB of VRAM, leaving very little overhead on a single 80GB GPU. I estimated 2x RTX 6000 Ada to be the cheapest option, but I can see that the 48GB PRO tier lists 3 GPUs: L40, L40S, RTX 6000 Ada. Is there any way to pick which one to use, or is the allocation just random?
yhlong00000 · 3mo ago
Yes, you can.
[screenshot attachment]
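For readers following along: the GPU-type restriction lives in the endpoint configuration. A hedged sketch of the equivalent via the runpod-python SDK (the create_endpoint helper exists in the SDK, but the gpu_ids value below is an assumption from memory; check the current docs for the exact pool and GPU identifiers):

```python
import runpod

runpod.api_key = "YOUR_API_KEY"

# Hypothetical values: the template id is a placeholder, and the
# gpu_ids pool name is assumed; verify against RunPod's GPU id list.
endpoint = runpod.create_endpoint(
    name="qwen2.5-72b-int8",
    template_id="your-template-id",
    gpu_ids="ADA_48_PRO",  # assumed id for the 48GB PRO tier
    workers_max=1,
)
print(endpoint)
```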
baldo
baldoOP3mo ago
Ah ok, didn't really expect it to be there. Thanks a lot!