jonny9f
RunPod
•Created by jonny9f on 3/11/2024 in #⚡|serverless
Serverless custom routes
From the docs and from my own testing, this works fine:
curl https://api.runpod.ai/v2/<YOUR ENDPOINT ID>/openai/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <YOUR OPENAI API KEY>" \
-d '{
"model": "<YOUR DEPLOYED MODEL REPO/NAME>",
"messages": [
{
"role": "user",
"content": "Why is RunPod the best platform?"
}
],
"temperature": 0,
"max_tokens": 100
}'
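For anyone calling this route from code rather than the shell, here is a minimal Python sketch of the same request using only the standard library. The function name `build_chat_request` and the placeholder endpoint ID, key, and model name are hypothetical; only the URL pattern, headers, and JSON body come from the curl command above. The request is built but not sent.

```python
import json
import urllib.request


def build_chat_request(endpoint_id, api_key, model, prompt):
    """Build (but do not send) the chat-completions POST from the curl example."""
    # Same URL pattern as the curl command: the OpenAI-style path follows the endpoint ID.
    url = f"https://api.runpod.ai/v2/{endpoint_id}/openai/v1/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0,
        "max_tokens": 100,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Placeholder values (assumptions); substitute your own endpoint ID, key, and model.
request = build_chat_request(
    "my-endpoint-id",
    "my-api-key",
    "my-org/my-model",
    "Why is RunPod the best platform?",
)
print(request.full_url)
# To actually send it: urllib.request.urlopen(request)
```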
16 replies
RunPod
•Created by jonny9f on 3/11/2024 in #⚡|serverless
Serverless custom routes
As they are there for sure.
16 replies
RunPod
•Created by jonny9f on 3/11/2024 in #⚡|serverless
Serverless custom routes
I think that must be the case. It's hardcoded to support those routes.
16 replies
RunPod
•Created by jonny9f on 3/11/2024 in #⚡|serverless
Serverless custom routes
How does the vLLM worker do it, then?
16 replies
RunPod
•Created by jonny9f on 12/24/2023 in #⚡|serverless
Execution time much longer than delay time + actual time
looking good
14 replies
RunPod
•Created by jonny9f on 12/24/2023 in #⚡|serverless
Execution time much longer than delay time + actual time
thanks, just back from family holidays. I will test it tomorrow.
14 replies
RunPod
•Created by jonny9f on 12/24/2023 in #⚡|serverless
Execution time much longer than delay time + actual time
Great, thanks. Enjoy the holidays!
14 replies
RunPod
•Created by jonny9f on 12/24/2023 in #⚡|serverless
Execution time much longer than delay time + actual time
$ time curl -X POST "https://api.runpod.ai/v2/wrn9f44a9bgjl0/runsync" -H 'Content-Type: application/json' -H 'Authorization: Bearer xxx' -d '{"input": {"prompt": "test"}}'
{"delayTime":69778,"executionTime":1050,"id":"sync-386c0bf5-91b4-4b41-b1e5-1853a3b91698-e1","output":{"image":"","runtime":0.0000016689300537109375},"status":"COMPLETED"}
real 1m11.355s
user 0m0.053s
sys 0m0.000s
$ time curl -X POST "https://api.runpod.ai/v2/wrn9f44a9bgjl0/runsync" -H 'Content-Type: application/json' -H 'Authorization: Bearer xxx' -d '{"input": {"prompt": "test"}}'
{"delayTime":808,"executionTime":1046,"id":"sync-c4f93ca7-1efd-4055-a80e-7564f0cd92fc-e1","output":{"image":"","runtime":0.0000019073486328125},"status":"COMPLETED"}
real 0m2.037s
user 0m0.049s
sys 0m0.000s
$ time curl -X POST "https://api.runpod.ai/v2/wrn9f44a9bgjl0/runsync" -H 'Content-Type: application/json' -H 'Authorization: Bearer xxx' -d '{"input": {"prompt": "test"}}'
{"delayTime":99,"executionTime":1049,"id":"sync-1be960ae-3c94-4bfc-9161-c134b1d29646-e1","output":{"image":"","runtime":0.0000019073486328125},"status":"COMPLETED"}
real 0m1.338s
user 0m0.049s
sys 0m0.000s
$ time curl -X POST "https://api.runpod.ai/v2/wrn9f44a9bgjl0/runsync" -H 'Content-Type: application/json' -H 'Authorization: Bearer xxx' -d '{"input": {"prompt": "test"}}'
{"delayTime":181,"executionTime":1057,"id":"sync-47fcb8aa-db92-4a3b-9478-9954161b041a-e1","output":{"image":"","runtime":0.0000019073486328125},"status":"COMPLETED"}
real 0m1.445s
user 0m0.048s
sys 0m0.000s
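To make the comparison in these four runs easier to read, here is a small Python sketch that subtracts `delayTime + executionTime` from the wall-clock `real` time of each call. It assumes both fields are reported in milliseconds, which is consistent with the `time` output but is not confirmed in this thread; the names `RUNS`, `unaccounted_ms`, and `OVERHEADS` are made up for the illustration.

```python
# Figures copied from the four runsync calls above.
# Assumption: delayTime and executionTime are in milliseconds.
# real_s is the "real" wall-clock time from `time`, in seconds.
RUNS = [
    {"delayTime": 69778, "executionTime": 1050, "real_s": 71.355},
    {"delayTime": 808, "executionTime": 1046, "real_s": 2.037},
    {"delayTime": 99, "executionTime": 1049, "real_s": 1.338},
    {"delayTime": 181, "executionTime": 1057, "real_s": 1.445},
]


def unaccounted_ms(run):
    """Wall-clock time not covered by delayTime + executionTime, in ms."""
    reported_ms = run["delayTime"] + run["executionTime"]
    return round(run["real_s"] * 1000 - reported_ms)


OVERHEADS = [unaccounted_ms(run) for run in RUNS]

for run, extra in zip(RUNS, OVERHEADS):
    reported_ms = run["delayTime"] + run["executionTime"]
    print(f"reported {reported_ms} ms, wall clock {run['real_s']} s, gap {extra} ms")
```

Under that assumption the gap is about 0.5 s on the first (cold) call and roughly 0.2 s on the three warm calls, so nearly all of the 71 s on the first call is accounted for by `delayTime`.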
14 replies
RunPod
•Created by jonny9f on 12/24/2023 in #⚡|serverless
Execution time much longer than delay time + actual time
Thanks for the tip. I set it to one active worker, and it looks about the same (see below).
14 replies