StandingFuture
RunPod
Created by StandingFuture on 10/25/2024 in #⚡|serverless
Does vLLM support quantized models?
I tried pointing the download directory at the quantized model, but the model page says "Using llama.cpp release b3496 for quantization," and I don't see that listed as a quantization method option on RunPod.
2 replies
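For context, vLLM can load weight-quantized checkpoints such as AWQ or GPTQ through its `quantization` option, whereas llama.cpp quantization produces GGUF files, which is a different format. Below is a minimal sketch of loading an AWQ checkpoint directly with vLLM; the model name is a hypothetical placeholder, and the exact set of accepted `quantization` values depends on the vLLM version.

```python
# Minimal sketch (not from this thread): loading a weight-quantized
# checkpoint with vLLM. The repo name below is hypothetical; substitute
# any AWQ- or GPTQ-quantized model. Note that llama.cpp quantization
# produces GGUF files, which is a separate format from AWQ/GPTQ.
from vllm import LLM, SamplingParams

# "awq" selects vLLM's AWQ kernels; other values such as "gptq" are
# also accepted, depending on the vLLM version installed.
llm = LLM(
    model="some-org/Llama-3-8B-Instruct-AWQ",  # hypothetical quantized repo
    quantization="awq",
)

params = SamplingParams(temperature=0.7, max_tokens=64)
outputs = llm.generate(["What is quantization?"], params)
print(outputs[0].outputs[0].text)
```

On the RunPod serverless vLLM worker, these settings are typically passed as environment variables (for example, a quantization setting that maps to vLLM's `quantization` argument), though the exact variable names depend on the worker version.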