SvenBrnn
RunPod
Created by Bj9000 on 1/27/2025 in #⚡|serverless
Serverless quants
I was also searching for this last week. I ended up giving up, as there is an open ticket about it on the GitHub of the vLLM worker: https://github.com/runpod-workers/worker-vllm/issues/98. It doesn't seem to be on their task list anytime soon, so I ended up building my own Ollama-based runner.
4 replies
RunPod
Created by Mohamed Nagy on 1/26/2025 in #⚡|serverless
How to respond to the requests at https://api.runpod.ai/v2/<YOUR ENDPOINT ID>/openai/v1
9 replies