Could be a few different reasons:
* Not enough workers to handle the number of concurrent requests, so requests sit in the queue
* Cold start time (more common)
I'm guessing I can't control the cold start time?
I don't think workers are an issue
You can do things like enabling FlashBoot, increasing the idle timeout, adding active workers, etc. to improve cold start times.
FlashBoot is the only one that's free, though.
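For context on why idle timeout and active workers help: most of a cold start is heavy one-time initialization (loading model weights, etc.). A warm worker keeps that state in memory, so only the per-request work runs. Here's a toy sketch of that serverless handler pattern — the `load_model`/`handler` names are made up for illustration, and on RunPod the real handler would be registered via the `runpod` SDK rather than called directly:

```python
import time

def load_model():
    """Stand-in for loading model weights; this is the slow part of a cold start."""
    time.sleep(0.1)  # pretend this takes seconds-to-minutes in reality
    return {"name": "toy-model"}

# Runs once at import time: paid on every cold start,
# but skipped entirely by a warm (or active) worker.
MODEL = load_model()

def handler(job):
    """Per-request work only; reuses the already-loaded MODEL."""
    prompt = job["input"]["prompt"]
    return {"model": MODEL["name"], "output": prompt.upper()}
```

The design point is just that anything hoisted to module scope is amortized across every request a warm worker serves, which is exactly what idle timeout and active workers buy you.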
How does flashboot work?
Endpoint configurations | RunPod Documentation
Configure your Endpoint settings to optimize performance and cost, including GPU selection, worker count, idle timeout, and advanced options like data centers, network volumes, and scaling strategies.
Basically the TL;DR from asking them is that it's a caching mechanism, so the more max workers you have and the more requests you get, the better the cache.
If you have an active worker it's supposedly even faster, but I don't think that's necessary; I've heard from people using it in prod that FlashBoot is still quite fast even with a min worker count of 0.
Yeah, I don't have min/active workers and FlashBoot works well for me pretty often, but it doesn't work so well when I don't have a constant flow of requests.
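That pattern (steady traffic keeps the cache warm, sparse traffic doesn't) can be sketched with a toy model. The fixed "warm TTL" here is an assumption for illustration, not RunPod's actual caching mechanism:

```python
def warm_hit_rate(arrival_gaps, warm_ttl):
    """Toy model: a worker stays warm for warm_ttl seconds after each
    request. Returns the fraction of requests that arrive while the
    worker is still warm (i.e. that skip the cold start)."""
    hits = sum(1 for gap in arrival_gaps if gap <= warm_ttl)
    return hits / len(arrival_gaps)

steady = [5] * 20              # a request every 5 seconds
sparse = [5] * 5 + [600] * 15  # a short burst, then long idle gaps

warm_hit_rate(steady, warm_ttl=60)  # -> 1.0  (every request hits warm)
warm_hit_rate(sparse, warm_ttl=60)  # -> 0.25 (most requests cold-start)
```

That lines up with the experience above: with a constant flow of requests almost everything lands on a warm worker, while long idle gaps mean most requests pay the cold start again.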