•Created by Bj9000 on 1/27/2025 in #⚡|serverless
Serverless quants
I have the same question. Now that vLLM supports quantized models, I'm wondering whether there's a way to specify the quantization method through an environment variable. Also, I'm not sure what format to use for the tokenizer path: is it a full filesystem path, or just the top-level HF repo id of the original model?
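For reference, a minimal sketch of how a serverless worker might map such environment variables into vLLM engine arguments. The variable names (`MODEL_NAME`, `QUANTIZATION`, `TOKENIZER`) are assumptions, not confirmed names; check the README of the worker image you deploy for the exact ones it supports. vLLM itself accepts a `quantization` argument (e.g. `"awq"`, `"gptq"`) and a `tokenizer` argument that takes either an HF repo id or a local path:

```python
import os

# Hedged sketch: mapping env vars into vLLM engine args.
# The env var names below are assumptions for illustration --
# verify them against your worker image's documentation.
engine_args = {
    # HF repo id of the (possibly quantized) model to serve
    "model": os.environ.get("MODEL_NAME", "facebook/opt-125m"),
    # vLLM's `quantization` arg selects the method, e.g. "awq" or "gptq";
    # None lets vLLM auto-detect or run unquantized
    "quantization": os.environ.get("QUANTIZATION"),
    # `tokenizer` accepts an HF repo id (e.g. the original model's repo)
    # or a local path; when unset, vLLM defaults to the model path
    "tokenizer": os.environ.get("TOKENIZER"),
}
```

These keys would then be passed to vLLM's engine (e.g. `AsyncEngineArgs(**engine_args)`), so the tokenizer value can be just the top-level HF repo, the same way you'd pass it to vLLM directly.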