jonny9f
RunPod
Created by jonny9f on 3/11/2024 in #⚡|serverless
Serverless custom routes
agreed
16 replies
RunPod
Created by jonny9f on 3/11/2024 in #⚡|serverless
Serverless custom routes
From the docs, and testing, this works fine:

curl https://api.runpod.ai/v2/<YOUR ENDPOINT ID>/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <YOUR OPENAI API KEY>" \
  -d '{
    "model": "<YOUR DEPLOYED MODEL REPO/NAME>",
    "messages": [
      {
        "role": "user",
        "content": "Why is RunPod the best platform?"
      }
    ],
    "temperature": 0,
    "max_tokens": 100
  }'
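For anyone calling that route from code rather than curl, here is a minimal Python sketch of the same request, using only the standard library. The function name `build_chat_request` and its parameters are hypothetical, made up for illustration; the URL, headers, and body fields are copied from the curl command. It only builds the request and prints it, it does not send anything.

```python
import json
import urllib.request


def build_chat_request(endpoint_id, api_key, model, prompt):
    """Build (but do not send) the POST request from the curl command.

    Hypothetical helper: the name and signature are illustrative only.
    """
    url = f"https://api.runpod.ai/v2/{endpoint_id}/openai/v1/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0,
        "max_tokens": 100,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholders kept exactly as in the curl command; fill in your own values.
    req = build_chat_request(
        "<YOUR ENDPOINT ID>",
        "<YOUR OPENAI API KEY>",
        "<YOUR DEPLOYED MODEL REPO/NAME>",
        "Why is RunPod the best platform?",
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send it: urllib.request.urlopen(req)
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without an endpoint or key.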
16 replies
RunPod
Created by jonny9f on 3/11/2024 in #⚡|serverless
Serverless custom routes
As they are there for sure.
16 replies
RunPod
Created by jonny9f on 3/11/2024 in #⚡|serverless
Serverless custom routes
I think that must be the case. It's hardcoded to support those routes.
16 replies
RunPod
Created by jonny9f on 3/11/2024 in #⚡|serverless
Serverless custom routes
How does the vllm worker do it then?
16 replies
RunPod
Created by jonny9f on 12/24/2023 in #⚡|serverless
Execution time much longer than delay time + actual time
looking good
14 replies
RunPod
Created by jonny9f on 12/24/2023 in #⚡|serverless
Execution time much longer than delay time + actual time
thanks, just back from family holidays. I will test it tomorrow.
14 replies
RunPod
Created by jonny9f on 12/24/2023 in #⚡|serverless
Execution time much longer than delay time + actual time
Great, thanks. Enjoy the holidays!
14 replies
RunPod
Created by jonny9f on 12/24/2023 in #⚡|serverless
Execution time much longer than delay time + actual time
$ time curl -X POST "https://api.runpod.ai/v2/wrn9f44a9bgjl0/runsync" -H 'Content-Type: application/json' -H 'Authorization: Bearer xxx' -d '{"input": {"prompt": "test"}}'
{"delayTime":69778,"executionTime":1050,"id":"sync-386c0bf5-91b4-4b41-b1e5-1853a3b91698-e1","output":{"image":"","runtime":0.0000016689300537109375},"status":"COMPLETED"}
real 1m11.355s
user 0m0.053s
sys 0m0.000s

$ time curl -X POST "https://api.runpod.ai/v2/wrn9f44a9bgjl0/runsync" -H 'Content-Type: application/json' -H 'Authorization: Bearer xxx' -d '{"input": {"prompt": "test"}}'
{"delayTime":808,"executionTime":1046,"id":"sync-c4f93ca7-1efd-4055-a80e-7564f0cd92fc-e1","output":{"image":"","runtime":0.0000019073486328125},"status":"COMPLETED"}
real 0m2.037s
user 0m0.049s
sys 0m0.000s

$ time curl -X POST "https://api.runpod.ai/v2/wrn9f44a9bgjl0/runsync" -H 'Content-Type: application/json' -H 'Authorization: Bearer xxx' -d '{"input": {"prompt": "test"}}'
{"delayTime":99,"executionTime":1049,"id":"sync-1be960ae-3c94-4bfc-9161-c134b1d29646-e1","output":{"image":"","runtime":0.0000019073486328125},"status":"COMPLETED"}
real 0m1.338s
user 0m0.049s
sys 0m0.000s

$ time curl -X POST "https://api.runpod.ai/v2/wrn9f44a9bgjl0/runsync" -H 'Content-Type: application/json' -H 'Authorization: Bearer xxx' -d '{"input": {"prompt": "test"}}'
{"delayTime":181,"executionTime":1057,"id":"sync-47fcb8aa-db92-4a3b-9478-9954161b041a-e1","output":{"image":"","runtime":0.0000019073486328125},"status":"COMPLETED"}
real 0m1.445s
user 0m0.048s
sys 0m0.000s
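To make the numbers in those four runs easier to compare, here is a small Python sketch that works out how much of each wall-clock time is not covered by `delayTime` + `executionTime`. It assumes both fields are in milliseconds, which is an assumption on my part but fits the first run (69778 + 1050 is close to the 1m11s wall time). The names `RUNS`, `wall_ms`, and `unaccounted_ms` are made up for illustration; the figures are copied from the output above.

```python
# Figures copied from the four runsync calls above.
# wall_ms is the `real` value reported by `time`, converted to milliseconds.
# Assumption: delayTime and executionTime are reported in milliseconds.
RUNS = [
    {"wall_ms": 71355, "delayTime": 69778, "executionTime": 1050},
    {"wall_ms": 2037, "delayTime": 808, "executionTime": 1046},
    {"wall_ms": 1338, "delayTime": 99, "executionTime": 1049},
    {"wall_ms": 1445, "delayTime": 181, "executionTime": 1057},
]


def unaccounted_ms(run):
    """Wall-clock time not explained by delayTime + executionTime."""
    return run["wall_ms"] - (run["delayTime"] + run["executionTime"])


def summarize(runs):
    """Return the unaccounted time, in milliseconds, for each run in order."""
    return [unaccounted_ms(run) for run in runs]


if __name__ == "__main__":
    for index, (run, gap) in enumerate(zip(RUNS, summarize(RUNS)), start=1):
        print(
            f"run {index}: wall={run['wall_ms']} ms, "
            f"delay={run['delayTime']} ms, "
            f"execution={run['executionTime']} ms, "
            f"unaccounted={gap} ms"
        )
```

On these figures the first call is dominated by `delayTime`, while `executionTime` stays at roughly 1.05 s on every call even though the handler's own `runtime` value is tiny.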
14 replies
RunPod
Created by jonny9f on 12/24/2023 in #⚡|serverless
Execution time much longer than delay time + actual time
Thanks for the tip. I set it to one active worker, and it looks about the same (see below).
14 replies