Bj9000
RunPod
Created by Bj9000 on 1/27/2025 in #⚡|serverless
Serverless quants
Hi, how do you specify a particular GGUF quant file from an HF repo when configuring a vLLM serverless endpoint? It only seems to let you specify the model at the repo level.
4 replies