Execution time much longer than delay time + actual time
Hello, I am running some tests with runpod and I can't seem to get the total execution time < 1 second.
I made a dummy handler that just returns immediately. On the first call the delay time is over 2 seconds, as expected, because the container is not hot. The delay then drops to 100 ms or so. But the round-trip execution time is still over 1 second. What is the extra overhead here?
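For readers who want to reproduce this, a dummy handler of the kind described might look like the sketch below. It is an assumption, not the poster's actual code: the output keys `image` and `runtime` are taken from the responses in this thread, and using `time.time()` to measure the runtime is a guess.

```python
import time


def handler(job):
    """Dummy handler: does no real work and returns immediately."""
    start = time.time()
    # The request payload arrives under job["input"]; it is ignored here.
    _ = job.get("input", {})
    return {"image": "", "runtime": time.time() - start}


# With the RunPod Python SDK installed, the worker is started with:
#   import runpod
#   runpod.serverless.start({"handler": handler})
```

Calling `handler({"input": {"prompt": "test"}})` locally returns a dict whose `runtime` is a few microseconds at most, which is what makes a reported execution time of about one second stand out.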
I've called the endpoint from two different machines on different networks and get the same results.
An example run is below.
Many thanks
Jon.
time curl -X POST "https://api.runpod.ai/v2/wrn9f44a9bgjl0/runsync" -H 'Content-Type: application/json' -H 'Authorization: Bearer xxx' -d '{"input": {"prompt": "test"}}'
{"delayTime":2052,"executionTime":1051,"id":"sync-684095d9-aaa9-4b55-96ea-a6e86e7f2f32-e1","output":{"image":"","runtime":0},"status":"COMPLETED"}
real 0m3.279s
user 0m0.050s
sys 0m0.000s
time curl -X POST "https://api.runpod.ai/v2/wrn9f44a9bgjl0/runsync" -H 'Content-Type: application/json' -H 'Authorization: Bearer xxx' -d '{"input": {"prompt": "test"}}'
{"delayTime":100,"executionTime":1048,"id":"sync-a0c6793a-c811-4172-b5f0-1f321e72b33a-e1","output":{"image":"","runtime":0},"status":"COMPLETED"}
real 0m1.326s
user 0m0.039s
sys 0m0.011s
time curl -X POST "https://api.runpod.ai/v2/wrn9f44a9bgjl0/runsync" -H 'Content-Type: application/json' -H 'Authorization: Bearer xxx' -d '{"input": {"prompt": "test"}}'
{"delayTime":100,"executionTime":1052,"id":"sync-f55598f6-09bc-4e40-b4b5-72bea6b86e99-e1","output":{"image":"","runtime":0},"status":"COMPLETED"}
real 0m1.327s
user 0m0.042s
sys 0m0.007s
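A quick way to see where the time goes is to subtract the reported `delayTime` and `executionTime` from the wall-clock `real` time of each run above. The sketch below does only that arithmetic on the three runs shown; the variable and function names are made up for illustration.

```python
# (real wall-clock ms from `time`, delayTime ms, executionTime ms)
# copied from the three runs above.
runs = [
    (3279, 2052, 1051),
    (1326, 100, 1048),
    (1327, 100, 1052),
]


def unaccounted_ms(real_ms, delay_ms, execution_ms):
    """Wall-clock time not covered by the two server-reported figures."""
    return real_ms - delay_ms - execution_ms


overheads = [unaccounted_ms(*run) for run in runs]
print(overheads)  # [176, 178, 175]
```

Only about 175 ms per call is left over for network and API overhead, so the extra second sits inside the reported `executionTime` itself rather than in the round trip.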
8 Replies
Yes, I have seen this as well. I timed how long my handler takes to run, and the reported execution time was usually greater by 1+ second.
The last time I brought this up, I was advised to set at least 1 active worker and send multiple requests to test and reproduce it. I have yet to do that, but maybe you can try it?
Thanks for the tip. I set it to one active worker, and it looks about the same (see below).
$ time curl -X POST "https://api.runpod.ai/v2/wrn9f44a9bgjl0/runsync" -H 'Content-Type: application/json' -H 'Authorization: Bearer xxx' -d '{"input": {"prompt": "test"}}'
{"delayTime":69778,"executionTime":1050,"id":"sync-386c0bf5-91b4-4b41-b1e5-1853a3b91698-e1","output":{"image":"","runtime":0.0000016689300537109375},"status":"COMPLETED"}
real 1m11.355s
user 0m0.053s
sys 0m0.000s
$ time curl -X POST "https://api.runpod.ai/v2/wrn9f44a9bgjl0/runsync" -H 'Content-Type: application/json' -H 'Authorization: Bearer xxx' -d '{"input": {"prompt": "test"}}'
{"delayTime":808,"executionTime":1046,"id":"sync-c4f93ca7-1efd-4055-a80e-7564f0cd92fc-e1","output":{"image":"","runtime":0.0000019073486328125},"status":"COMPLETED"}
real 0m2.037s
user 0m0.049s
sys 0m0.000s
$ time curl -X POST "https://api.runpod.ai/v2/wrn9f44a9bgjl0/runsync" -H 'Content-Type: application/json' -H 'Authorization: Bearer xxx' -d '{"input": {"prompt": "test"}}'
{"delayTime":99,"executionTime":1049,"id":"sync-1be960ae-3c94-4bfc-9161-c134b1d29646-e1","output":{"image":"","runtime":0.0000019073486328125},"status":"COMPLETED"}
real 0m1.338s
user 0m0.049s
sys 0m0.000s
$ time curl -X POST "https://api.runpod.ai/v2/wrn9f44a9bgjl0/runsync" -H 'Content-Type: application/json' -H 'Authorization: Bearer xxx' -d '{"input": {"prompt": "test"}}'
{"delayTime":181,"executionTime":1057,"id":"sync-47fcb8aa-db92-4a3b-9478-9954161b041a-e1","output":{"image":"","runtime":0.0000019073486328125},"status":"COMPLETED"}
real 0m1.445s
user 0m0.048s
sys 0m0.000s
thanks, I plan to look more into this once the holidays are over
i just did a quick test and I do see much better results, so I will have to dig into what's different with yours
Great thanks. Enjoy the holidays !
@jonny9f this is fixed in the latest sdk release
it's fixed
Thx. Will test it after holidays.
Just glanced at the code, was it really just because of sleep(1)? If so, that's hilarious!
I also see a rust binary. Is there a repo for it as well? I couldn't find it. And does the new version support concurrency_handler? The code seems to check and use config.max_concurrency instead.
It would be great if you could update the docs to reflect all these changes and features.
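To illustrate why a fixed sleep would produce exactly this symptom, here is a hypothetical sketch. It is not the SDK's code, and the thread does not confirm the details of the fix; `run_job` and `pause_s` are invented names. It only shows how a pause inside the timed section of a worker loop inflates the measured time of an otherwise instant handler.

```python
import time


def run_job(handler, job, pause_s=1.0):
    """Hypothetical worker step: run the handler, then pause before reporting."""
    start = time.perf_counter()
    output = handler(job)
    time.sleep(pause_s)  # fixed pause that lands inside the measured interval
    elapsed_ms = (time.perf_counter() - start) * 1000
    return output, elapsed_ms


def instant_handler(job):
    # Stand-in for a handler that returns immediately.
    return {"image": ""}


if __name__ == "__main__":
    _, with_pause = run_job(instant_handler, {"input": {}}, pause_s=1.0)
    _, without_pause = run_job(instant_handler, {"input": {}}, pause_s=0.0)
    print(f"with pause: {with_pause:.0f} ms, without: {without_pause:.2f} ms")
```

With `pause_s=1.0` the measured time comes out slightly above 1000 ms regardless of how fast the handler is, which matches the roughly 1050 ms execution times reported earlier in the thread.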
rust isn't introduced yet, it's mostly hidden behind a flag, we plan to expose it in the near future
thanks, just back from family holidays. I will test it tomorrow.
looking good