RunPod
Created by Armyk on 5/30/2024 in #⚡|serverless
GGUF in serverless vLLM
How do I run a GGUF quantized model?
I need to run this LLM: https://huggingface.co/mradermacher/OpenBioLLM-Llama3-70B-GGUF
What parameters should I specify?
Thank you
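For context, a minimal sketch of how a GGUF model might be pointed at a RunPod serverless vLLM endpoint. This assumes the worker reads `MODEL_NAME` and `QUANTIZATION` environment variables and that the vLLM build in the worker image has (experimental) GGUF loading; the repo ID comes from the linked model card, while the variable names and values are assumptions to check against the worker's README, not a confirmed recipe:

```shell
# Hypothetical environment for a RunPod serverless vLLM worker.
# Variable names are assumed -- verify against the worker's documentation.
# Note: vLLM's GGUF support is experimental; if the worker's vLLM build
# lacks it, an AWQ or GPTQ quantization of the same model is the usual fallback.
export MODEL_NAME="mradermacher/OpenBioLLM-Llama3-70B-GGUF"
export QUANTIZATION="gguf"     # assumed value; vLLM also accepts e.g. awq, gptq
export MAX_MODEL_LEN="8192"    # Llama 3 context window; lower it to save VRAM
```

A 70B model, even quantized, will not fit on a single small GPU, so the endpoint's GPU type and count also need to be sized accordingly.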