jonny9f
RunPod
Created by jonny9f on 3/15/2024 in #⚡|serverless
Inconsistent delay time with generator worker
I am getting very inconsistent delay times when running serverless with a generator handler. A delay on cold start is expected, but once a worker has started I still see delay times ranging from 1 to 30 seconds. I would expect the delay time to be low and consistent once the worker is running. I have max workers set to 2 and scale on the number of requests in the queue. What's going on here?
2 replies
RunPod
Created by jonny9f on 3/11/2024 in #⚡|serverless
Serverless custom routes
Hi there. I'd like to implement my own streaming custom routes like the vllm worker ( https://github.com/runpod-workers/worker-vllm ). This worker supports routes like https://api.runpod.ai/v2/<YOUR ENDPOINT ID>/openai/v1. How is this done? When I look in the source code, that worker gets special keys passed to it in the rp handler, like job_input.openai_route. Where does this key come from?
Thanks. Jon.
16 replies
RunPod
Created by jonny9f on 12/24/2023 in #⚡|serverless
Execution time much longer than delay time + actual time
Hello, I am running some tests with runpod and I can't seem to get the total execution time under 1 second. I made a dummy handler that just returns immediately. The first time, the delay time is over 2 seconds, as expected, since the container is not hot. The delay then drops to 100ms or so. But the round trip execution time is still over 1 second. What is the extra overhead here? I've called the endpoint from two different machines on different networks and get the same results. An example run is below.

Many thanks
Jon.

time curl -X POST "https://api.runpod.ai/v2/wrn9f44a9bgjl0/runsync" -H 'Content-Type: application/json' -H 'Authorization: Bearer xxx' -d '{"input": {"prompt": "test"}}'
{"delayTime":2052,"executionTime":1051,"id":"sync-684095d9-aaa9-4b55-96ea-a6e86e7f2f32-e1","output":{"image":"","runtime":0},"status":"COMPLETED"}
real 0m3.279s
user 0m0.050s
sys 0m0.000s

time curl -X POST "https://api.runpod.ai/v2/wrn9f44a9bgjl0/runsync" -H 'Content-Type: application/json' -H 'Authorization: Bearer xxx' -d '{"input": {"prompt": "test"}}'
{"delayTime":100,"executionTime":1048,"id":"sync-a0c6793a-c811-4172-b5f0-1f321e72b33a-e1","output":{"image":"","runtime":0},"status":"COMPLETED"}
real 0m1.326s
user 0m0.039s
sys 0m0.011s

time curl -X POST "https://api.runpod.ai/v2/wrn9f44a9bgjl0/runsync" -H 'Content-Type: application/json' -H 'Authorization: Bearer xxx' -d '{"input": {"prompt": "test"}}'
{"delayTime":100,"executionTime":1052,"id":"sync-f55598f6-09bc-4e40-b4b5-72bea6b86e99-e1","output":{"image":"","runtime":0},"status":"COMPLETED"}
real 0m1.327s
user 0m0.042s
sys 0m0.007s
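To separate network overhead from the time reported by the endpoint, the three runs above can be broken down with a few lines of Python. This is only a rough sketch: it assumes delayTime and executionTime are in milliseconds (which matches the 2052 value on the cold run), and the helper name unaccounted_ms is made up for illustration.

```python
# Figures copied from the three runs quoted above.
# Assumption: delayTime and executionTime are reported in milliseconds.
runs = [
    {"real_s": 3.279, "delayTime": 2052, "executionTime": 1051},
    {"real_s": 1.326, "delayTime": 100, "executionTime": 1048},
    {"real_s": 1.327, "delayTime": 100, "executionTime": 1052},
]


def unaccounted_ms(run):
    """Wall-clock time (ms) not covered by delayTime + executionTime."""
    wall_ms = round(run["real_s"] * 1000)
    return wall_ms - (run["delayTime"] + run["executionTime"])


overheads = [unaccounted_ms(run) for run in runs]

for run, extra in zip(runs, overheads):
    print(
        f"real={run['real_s']}s "
        f"delay={run['delayTime']}ms "
        f"exec={run['executionTime']}ms "
        f"unaccounted={extra}ms"
    )
```

On these numbers only about 175 to 178 ms per call falls outside the reported times, so the roughly 1 second in question sits inside the reported executionTime rather than in the network round trip.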
14 replies