Created by StandingFuture on 10/25/2024 in #⚡|serverless
Does vLLM support quantized models?
I'm trying to figure out how to deploy this, but I didn't see an option for selecting which quantization I want to run: https://huggingface.co/bartowski/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored-GGUF Thanks!
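For reference, here's roughly what I'd expect if I were calling vLLM directly rather than through the serverless template. This is just a minimal sketch of my understanding: the local .gguf path/filename and the base tokenizer repo below are my guesses, and GGUF loading in vLLM is experimental as far as I know, so please correct me if the real setup differs.

```python
from vllm import LLM, SamplingParams

# For pre-quantized AWQ/GPTQ repos, vLLM takes the HF repo id directly, and the
# scheme can be forced with LLM(..., quantization="awq") or "gptq".
#
# For a GGUF repo like the one linked above, my understanding is you pick ONE
# quant level by downloading that single .gguf file and pointing `model` at it,
# plus supplying the base model's tokenizer (paths/repos here are assumptions):
llm = LLM(
    model="/models/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored-Q4_K_M.gguf",  # guessed filename
    tokenizer="meta-llama/Llama-3.1-8B-Instruct",  # assumed base tokenizer
)

# Quick smoke test to confirm the quantized weights actually load and generate.
outputs = llm.generate(["Hello!"], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)
```

What I can't tell is how to express the equivalent choice (which quant file / which quantization scheme) in the serverless endpoint configuration.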