Created by Armyk on 5/30/2024 in #⚡|serverless
GGUF in serverless vLLM
How do I run a GGUF-quantized model? I need to run this LLM: https://huggingface.co/mradermacher/OpenBioLLM-Llama3-70B-GGUF What parameters should I specify? Thank you
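One possible approach (a sketch, not a confirmed RunPod recipe): vLLM has experimental GGUF support, where you point it at a single local `.gguf` file and supply the original model's tokenizer, since GGUF files don't ship the HF tokenizer config. The specific quant filename and the `aaditya/Llama3-OpenBioLLM-70B` tokenizer repo below are assumptions; pick the actual quant file you download from the linked repo.

```shell
# Sketch: download one quant file (multi-GB) and serve it with vLLM.
# vLLM's GGUF loader wants a single local file, not the whole HF repo.
huggingface-cli download mradermacher/OpenBioLLM-Llama3-70B-GGUF \
  OpenBioLLM-Llama3-70B.Q4_K_M.gguf --local-dir ./models   # assumed filename

vllm serve ./models/OpenBioLLM-Llama3-70B.Q4_K_M.gguf \
  --tokenizer aaditya/Llama3-OpenBioLLM-70B \
  --quantization gguf \
  --max-model-len 8192
```

Note that a 70B model even at Q4 needs roughly 40+ GB of VRAM, so on serverless you would likely need an 80 GB GPU tier; if the prebuilt serverless vLLM worker doesn't expose these flags, a custom worker image may be required.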
58 replies