RunPod•14mo ago

Extremely slow Delay Time

We are using two serverless endpoints on RunPod, and the "Delay Time" (which I assume measures end-to-end time) varies drastically between them. Both use the same hardware (the A5000 option), but one has sub-second delay times while the other ranges from ~50 seconds up to 180 seconds. On the slow endpoint, the worst cold start time is reported as 13s and the execution time is ~2s, which doesn't add up to the delay time: there are ~50 seconds unaccounted for. The other endpoint, on the same hardware, shows no such drastic delay.

Solution:

Delay time is NOT end-to-end time. It is the cold start time plus the time your request spends in the queue before a worker picks it up. Delay time can be dramatically impacted if all of your workers are throttled.
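To make the arithmetic concrete, here is a minimal sketch that takes the definition above at face value (delay = cold start + queue wait) and backs out the queue wait from the numbers in the question. The function name `queue_wait_seconds` is made up for illustration and is not part of any RunPod API; the inputs are the figures quoted in the original post.

```python
def queue_wait_seconds(delay_s: float, cold_start_s: float) -> float:
    """Estimate time spent waiting in the queue.

    Assumes delay time = cold start time + queue wait, as described
    in the solution above. Clamped at zero because a reported
    worst-case cold start can exceed the delay of a warm request.
    """
    return max(0.0, delay_s - cold_start_s)


# Figures from the question: 13s worst-case cold start on the slow
# endpoint, with delay times between ~50s and ~180s.
low = queue_wait_seconds(50.0, 13.0)
high = queue_wait_seconds(180.0, 13.0)
print(f"Estimated queue wait: {low:.0f}s to {high:.0f}s")
```

Under that assumption, most of the slow endpoint's delay is time spent waiting for a worker rather than cold start, which would be consistent with throttled workers or too few workers for the traffic.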

5 Replies

wmuteOP•14mo ago

My question would be: how is the delay time measured? Is our bad timing due to throttling, or do we not have enough workers to handle our traffic?

Solution

ashleyk•14mo ago

You can improve slow delay time by not using Network Storage on your endpoint, or by selecting a GPU tier that doesn't have low availability.

ssssteven•14mo ago

I found the term "delay" very confusing. Shouldn't it just mean the time spent before the handler is called?

ashleyk•14mo ago

Not sure what's confusing about it, that's exactly what delay means.
