Serverless custom routes
Hi there. I'd like to implement my own streaming custom routes like the vllm worker ( https://github.com/runpod-workers/worker-vllm ). This worker supports routes like https://api.runpod.ai/v2/<YOUR ENDPOINT ID>/openai/v1. How is this done? When I look in the source code, that worker gets special keys passed to it in the rp handler, like job_input.openai_route. Where does this key come from?
Thanks. Jon.
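For reference, this is roughly what I mean. The openai_route field name comes from the worker's job input; the companion openai_input field and the rest of the flow are my rough sketch of how such a handler could branch, not the actual worker-vllm implementation:

import runpod

async def handler(job):
    job_input = job["input"]
    # "openai_route" shows up in the input when the /openai/* path is hit;
    # "openai_input" is assumed here to carry the raw OpenAI-style request body.
    openai_route = job_input.get("openai_route")   # e.g. "/v1/chat/completions"
    if openai_route:
        openai_request = job_input.get("openai_input", {})
        # ... translate to the engine's request format and stream tokens back ...
        yield {"choices": [{"delta": {"content": "..."}}]}  # illustrative chunk shape only
    else:
        # normal /run and /runsync payloads land here
        yield {"output": "standard job result"}

runpod.serverless.start({"handler": handler, "return_aggregate_stream": True})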
You can't add custom routes in serverless. You only use /run, /runsync, /status, /cancel, etc. Nothing custom.
How does the vllm worker do it then?
It doesn't; you are probably misunderstanding something. Or else RunPod specifically exposed the /openai routes just for the vllm worker, but you can't add your own.
I think that must be the case. It's hardcoded to support those routes, as they are definitely there.
From the docs, and from testing, this works fine:
curl https://api.runpod.ai/v2/<YOUR ENDPOINT ID>/openai/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <YOUR OPENAI API KEY>" \
-d '{
"model": "<YOUR DEPLOYED MODEL REPO/NAME>",
"messages": [
{
"role": "user",
"content": "Why is RunPod the best platform?"
}
],
"temperature": 0,
"max_tokens": 100
}'
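For anyone using the OpenAI Python SDK instead of curl, the equivalent would be roughly the following, with base_url pointed at the endpoint's /openai/v1 path (endpoint ID, model name, and key are placeholders):

from openai import OpenAI

# Point the OpenAI client at the RunPod endpoint's OpenAI-compatible path.
client = OpenAI(
    api_key="<YOUR RUNPOD API KEY>",
    base_url="https://api.runpod.ai/v2/<YOUR ENDPOINT ID>/openai/v1",
)

response = client.chat.completions.create(
    model="<YOUR DEPLOYED MODEL REPO/NAME>",
    messages=[{"role": "user", "content": "Why is RunPod the best platform?"}],
    temperature=0,
    max_tokens=100,
)
print(response.choices[0].message.content)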
Yeah, then that's added by RunPod specifically for vllm. You can't add your own though.
Serverless typically just consists of a handler function.
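i.e. something like this minimal sketch, where whatever JSON you POST to /run or /runsync arrives under "input" and the return value becomes the output; there's no routing layer of your own (placeholder logic, not a real worker):

import runpod

def handler(job):
    # job["input"] is the "input" object from your /run or /runsync request body.
    prompt = job["input"].get("prompt", "")
    return {"echo": prompt}

runpod.serverless.start({"handler": handler})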
Yes, we added the OpenAI custom route just for OpenAI compatibility for vLLM and the in-progress TensorRT and text embedding workers.
Would be nice to somehow proxy routes through to our endpoints though so that we don't have to use hacks like this.
Agreed, will bring that up
agreed