openai/v1 and open-webui
Hey Team,
Looking at your docs, and at the question "How to respond to the requests at https://api.runpod.ai/v2/<YOUR ENDPOINT ID>/openai/v1", I've run into a weird gotcha. When I do a GET, it gives me an error.
Most applications (like open-webui) that use the OpenAI spec expect this to be a GET (see the OpenAI docs: https://platform.openai.com/docs/api-reference/models), and the RunPod docs imply that it is: https://github.com/runpod-workers/worker-vllm/tree/main#modifying-your-openai-codebase-to-use-your-deployed-vllm-worker. Am I missing something? How is this supposed to work? Thanks, Paul
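For context, this is roughly the GET that open-webui-style clients issue against the models route; a minimal sketch, with `<YOUR ENDPOINT ID>` and the API key as placeholders:

```python
import requests

# Placeholder endpoint ID and API key -- substitute your own values.
url = "https://api.runpod.ai/v2/<YOUR ENDPOINT ID>/openai/v1/models"
headers = {"Authorization": "Bearer <RUNPOD_API_KEY>"}

resp = requests.get(url, headers=headers)
print(resp.status_code, resp.text)
```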
https://meyer-laurent.com/deploying-deepseek-r1-on-runpod-serverless-and-use-it-in-pycharm#2-understanding-vllm
This is the answer: apparently it won't load the models without write permissions. Seems a bit silly, but hopefully this helps someone else.
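For anyone landing here later, a minimal sketch of the working setup, assuming the write-permission gotcha refers to the RunPod API key (i.e. the key must be created with read & write access, not read-only):

```python
from openai import OpenAI

# Placeholders -- use your own endpoint ID and a RunPod API key
# created with read & write permission (a read-only key is the gotcha).
client = OpenAI(
    base_url="https://api.runpod.ai/v2/<YOUR ENDPOINT ID>/openai/v1",
    api_key="<RUNPOD_API_KEY>",
)

# The OpenAI client issues GET /models under the hood, the same
# request open-webui makes on startup.
for model in client.models.list().data:
    print(model.id)
```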