How do I respond to the requests at https://api.runpod.ai/v2/<YOUR ENDPOINT ID>/openai/v1?
The OpenAI input is in the job input; I extracted it and processed the request. When I send the response with yield or return, is it received?
Could you take a look at these?
[https://github.com/mohamednaji7/runpod-workers-scripts/blob/main/empty_test/test%20copy%203.py]
[https://github.com/mohamednaji7/runpod-workers-scripts/blob/main/empty_test/handler.py]
When I made a request, I got:
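Here is roughly what my handler does (a simplified sketch; the "openai_route" / "openai_input" key names follow the vLLM worker's convention, so they may need checking against what your endpoint actually receives):

```python
import runpod

def handler(job):
    # Simplified sketch: the key names ("openai_route", "openai_input")
    # follow the vLLM worker's convention and may need verifying against
    # what the endpoint actually sends.
    job_input = job["input"]
    route = job_input.get("openai_route")         # e.g. "/v1/chat/completions"
    openai_input = job_input.get("openai_input")  # raw OpenAI request body

    if route == "/v1/chat/completions":
        # Dummy reply in the OpenAI chat-completion shape; a real worker
        # would run the model here (Unsloth in my case).
        yield {
            "id": "chatcmpl-dummy",
            "object": "chat.completion",
            "model": openai_input.get("model", "dummy"),
            "choices": [
                {
                    "index": 0,
                    "message": {"role": "assistant", "content": "hello"},
                    "finish_reason": "stop",
                }
            ],
        }
    else:
        yield {"error": f"unhandled route: {route}"}

runpod.serverless.start({
    "handler": handler,
    # Aggregate yielded chunks so the non-streaming endpoint also gets a result.
    "return_aggregate_stream": True,
})
```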
Did you use an API key? If yes, what kind of permission did you select?
Oh wait... you have to use a real model, I think.
Just test it with a real HF model that is supported by vLLM.
I am doing a dummy test because I am building my own worker; I want to use Unsloth instead of vLLM and Transformers.
I think it needs to route /v1/chat/completions, what do you think?
I made a bunch of failed trials.
I think the /openai/v1 endpoint is always exposed, like in the vLLM docs; you could try copying the input handler of the vLLM worker from RunPod and work from there first.
In real endpoints*. I'm not sure if it does the same locally.
Just have a look at https://github.com/SvenBrnn/runpod-worker-ollama/tree/master/wrapper/src
I built this last week by trying things out on a CPU instance, and figured out how the input comes in when it arrives via the OpenAI endpoint:
https://github.com/SvenBrnn/runpod-worker-ollama/blob/master/test_inputs/openai_completion.json
https://github.com/SvenBrnn/runpod-worker-ollama/blob/master/test_inputs/openai_get_models.json
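If memory serves, an OpenAI chat-completion request gets wrapped roughly like this in the job input (an illustrative sketch, not a copy of the files; the linked JSON files are the authoritative shape):

```python
# Illustrative sketch of how an OpenAI chat-completion request arrives
# wrapped in the job input; see the linked test_inputs JSON files for
# the authoritative shape.
test_job = {
    "input": {
        "openai_route": "/v1/chat/completions",
        "openai_input": {
            "model": "my-model",
            "messages": [{"role": "user", "content": "Hello!"}],
            "stream": False,
        },
    }
}
```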
Does it work for you?
"a CPU instance"
Is it on RunPod cloud?
Yes
Nice, I will test it in the next version
I am running this on a GPU instance now; however, I used a CPU instance and prints to figure out what gets sent for different OpenAI commands. Together with the code of the vLLM worker, that was enough to fully integrate my own worker with the OpenAI endpoint.
Just a way to save some money; testing can be a bit expensive, especially when you don't really want to run anything but only want to find out how stuff works internally.
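The probe was nothing fancy, roughly this (a sketch, not my exact code):

```python
import json
import runpod

def handler(job):
    # Probe handler for a cheap CPU instance: log whatever the OpenAI
    # endpoint sends so the input shape can be read from the worker logs.
    print(json.dumps(job, indent=2, default=str))
    yield {"ok": True}

runpod.serverless.start({"handler": handler, "return_aggregate_stream": True})
```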
@Mohamed Nagy you can have a look at my two JSON files to see how the /openai/xxx requests are sent to the input; I figured it was a good way to just add these to my repo for later testing 😂
Nice. Testing is very difficult using RunPod; for your future tests you could do as I do: I run the worker from a remote repo and change the remote repo's code.
Nice, I will test my instance with OpenAI client requests.
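Something like this with the standard OpenAI client, pointed at the RunPod endpoint (the endpoint ID, API key, and model name are placeholders):

```python
from openai import OpenAI

# Standard OpenAI client pointed at the RunPod endpoint; authenticate
# with the RunPod API key. Fill in the placeholders.
client = OpenAI(
    base_url="https://api.runpod.ai/v2/<YOUR_ENDPOINT_ID>/openai/v1",
    api_key="<YOUR_RUNPOD_API_KEY>",
)

resp = client.chat.completions.create(
    model="my-model",  # placeholder model name
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
```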