RunPod · 3mo ago
artbred

GGUF vllm

It seems that the newest version of vLLM supports GGUF models. Has anyone figured out how to make this work in RunPod serverless? It looks like you need to set some custom ENV vars, or maybe someone knows a way to convert GGUF back to safetensors?
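For reference, loading a GGUF checkpoint directly with vLLM's offline API looks roughly like this. This is an untested sketch assuming a recent vLLM build with GGUF support; the repo and file names are placeholders, not anything confirmed in this thread. GGUF files don't ship a full HF tokenizer, so the tokenizer is pointed at the original repo:

```python
# Sketch: run a local GGUF file with vLLM's offline LLM API.
# Model file and tokenizer repo below are placeholders.
from vllm import LLM, SamplingParams

llm = LLM(
    model="./qwen2-7b-instruct-q4_k_m.gguf",  # local GGUF file (placeholder)
    tokenizer="Qwen/Qwen2-7B-Instruct",       # original HF repo for the tokenizer
)

out = llm.generate(["Hello, world"], SamplingParams(max_tokens=32))
print(out[0].outputs[0].text)
```

For the "convert GGUF back to safetensors" route, recent transformers versions can dequantize a GGUF file and re-save it as standard safetensors, which vLLM can then load like any other HF checkpoint. Another hedged sketch, assuming a transformers version with `gguf_file` support and the `gguf` package installed; repo and file names are again placeholders:

```python
# Sketch: dequantize a GGUF checkpoint to standard safetensors via transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "Qwen/Qwen2-7B-Instruct-GGUF"          # placeholder GGUF repo
gguf_file = "qwen2-7b-instruct-q4_k_m.gguf"   # placeholder file name

tok = AutoTokenizer.from_pretrained(repo, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(repo, gguf_file=gguf_file)

# save_pretrained writes regular safetensors shards to the output directory
model.save_pretrained("./dequantized-model")
tok.save_pretrained("./dequantized-model")
```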
1 Reply
nerdylive · 3mo ago
Have you resolved this yet? Try looking in the quick deploy settings or the vLLM documentation, then match that against the quick deploy settings.
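If the quick-deploy (worker-vllm) route works at all for GGUF, the endpoint environment would presumably look something like the sketch below. The variable names follow the runpod/worker-vllm README, but whether the worker actually passes a GGUF repo through correctly is an assumption to verify against the vLLM docs and the template settings; values are placeholders:

```python
# Hypothetical RunPod serverless endpoint env vars for a GGUF model,
# written as a Python dict purely for illustration. GGUF support in the
# worker image is an assumption, not confirmed in this thread.
endpoint_env = {
    "MODEL_NAME": "Qwen/Qwen2-7B-Instruct-GGUF",  # placeholder GGUF repo
    "TOKENIZER": "Qwen/Qwen2-7B-Instruct",        # original HF repo for the tokenizer
    "QUANTIZATION": "gguf",                       # forwarded to vLLM's quantization setting
    "MAX_MODEL_LEN": "8192",                      # placeholder context length
}
```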