How do I respond to the requests at https://api.runpod.ai/v2/<YOUR ENDPOINT ID>/openai/v1?
The OpenAI input is in the job input; I extracted it and processed the request. When I send the response with yield or return, is it received?
Could you take a look at these?
[https://github.com/mohamednaji7/runpod-workers-scripts/blob/main/empty_test/test%20copy%203.py]
[https://github.com/mohamednaji7/runpod-workers-scripts/blob/main/empty_test/handler.py]
When I made a request, I got:
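Here is roughly what my handler does (a simplified sketch; the "openai_route" / "openai_input" key names follow the vLLM worker's convention, so they may need checking against what your endpoint actually receives):

```python
import runpod

def handler(job):
    # Simplified sketch: the key names ("openai_route", "openai_input")
    # follow the vLLM worker's convention and may need verifying against
    # what the endpoint actually sends.
    job_input = job["input"]
    route = job_input.get("openai_route")         # e.g. "/v1/chat/completions"
    openai_input = job_input.get("openai_input")  # raw OpenAI request body

    if route == "/v1/chat/completions":
        # Dummy reply in the OpenAI chat-completion shape; a real worker
        # would run the model here (Unsloth in my case).
        yield {
            "id": "chatcmpl-dummy",
            "object": "chat.completion",
            "model": openai_input.get("model", "dummy"),
            "choices": [
                {
                    "index": 0,
                    "message": {"role": "assistant", "content": "hello"},
                    "finish_reason": "stop",
                }
            ],
        }
    else:
        yield {"error": f"unhandled route: {route}"}

runpod.serverless.start({
    "handler": handler,
    # Aggregate yielded chunks so the non-streaming endpoint also gets a result.
    "return_aggregate_stream": True,
})
```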
Did you use an API key? If yes, what kind of permission did you select?
Oh wait... you have to use a real model, I think.
Just test it with a real HF model that is supported by vLLM.
I am doing a dummy test because I am building my own worker; I want to use Unsloth instead of vLLM and Transformers.
I think it needs to route /v1/chat/completions, what do you think?
I made a bunch of failed trials.
I think the /openai/v1 endpoint is always exposed, like in the vLLM docs; you could try copying the input handler of the vLLM worker from RunPod and work from there first.
In real endpoints*. I'm not sure if it does the same locally.
Just have a look at https://github.com/SvenBrnn/runpod-worker-ollama/tree/master/wrapper/src
I built this last week by trying things out on a CPU instance, and figured out how the input comes in when it arrives via the OpenAI endpoint:
https://github.com/SvenBrnn/runpod-worker-ollama/blob/master/test_inputs/openai_completion.json
https://github.com/SvenBrnn/runpod-worker-ollama/blob/master/test_inputs/openai_get_models.json
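If memory serves, an OpenAI chat-completion request gets wrapped roughly like this in the job input (an illustrative sketch, not a copy of the files; the linked JSON files are the authoritative shape):

```python
# Illustrative sketch of how an OpenAI chat-completion request arrives
# wrapped in the job input; see the linked test_inputs JSON files for
# the authoritative shape.
test_job = {
    "input": {
        "openai_route": "/v1/chat/completions",
        "openai_input": {
            "model": "my-model",
            "messages": [{"role": "user", "content": "Hello!"}],
            "stream": False,
        },
    }
}
```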
Does it work for you?
"a CPU instance"
Is it on RunPod cloud?
Yes
Nice, I will test it in the next version
I am running this on a GPU instance now; however, I used a CPU instance and prints to figure out what gets sent for different OpenAI commands. Together with the code of the vLLM worker, that was enough to fully integrate my own worker with the OpenAI endpoint.
Just a way to save some money; testing can be a bit expensive, especially when you don't really want to run anything but only want to find out how stuff works internally.
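The probe was nothing fancy, roughly this (a sketch, not my exact code):

```python
import json
import runpod

def handler(job):
    # Probe handler for a cheap CPU instance: log whatever the OpenAI
    # endpoint sends so the input shape can be read from the worker logs.
    print(json.dumps(job, indent=2, default=str))
    yield {"ok": True}

runpod.serverless.start({"handler": handler, "return_aggregate_stream": True})
```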
@Mohamed Nagy you can have a look at my two JSON files to see how the /openai/xxx requests are sent to the input; I figured it was a good way to just add these to my repo for later testing 😂
Nice. Testing is very difficult using RunPod; for your future tests you could do as I do: I run the worker from a remote repo and change the remote repo's code.
Nice, I will test my instance with OpenAI client requests.
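Something like this with the standard OpenAI client, pointed at the RunPod endpoint (the endpoint ID, API key, and model name are placeholders):

```python
from openai import OpenAI

# Standard OpenAI client pointed at the RunPod endpoint; authenticate
# with the RunPod API key. Fill in the placeholders.
client = OpenAI(
    base_url="https://api.runpod.ai/v2/<YOUR_ENDPOINT_ID>/openai/v1",
    api_key="<YOUR_RUNPOD_API_KEY>",
)

resp = client.chat.completions.create(
    model="my-model",  # placeholder model name
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
```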