guestavius
RunPod
•Created by Armyk on 5/30/2024 in #⚡|serverless
GGUF in serverless vLLM
I ended up using KoboldCpp's runpod template for gguf, lol. And sharing with some people to spend less time idling. (I'm being an idiot, yes.)
58 replies
https://huggingface.co/gghfez/WizardLM-2-8x22B-Beige
This one caught my interest. No idea if it's good though.
58 replies
[redacted] I'm hoping not to need 360GB of VRAM to run an 8x22B.
Edit: Oh wait, that just means I can point serverless at a name/model-AWQ-or-GPTQ repository.
58 replies
I'm asking what that dropdown menu does.
58 replies