RunPod · 7mo ago
JohnDoe

Mixed Delay Times

Hey, what could be the reason for these delay times?
7 Replies
digigoblin · 7mo ago
Could be various reasons:
* Not enough workers to handle the number of concurrent requests, so requests sit in the queue
* Cold start time (more common)
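To make the first reason concrete, here is a minimal sketch of how a fixed worker pool turns a burst of requests into mixed delay times. It is a toy model, not RunPod's scheduler: the function name `simulate_delays`, the fixed run time, and the always-warm workers are all assumptions for illustration.

```python
import heapq

def simulate_delays(arrivals, run_time, workers):
    """Toy model (hypothetical): queue delay per request for a fixed pool
    of always-warm workers, each job taking run_time seconds."""
    # Each heap entry is the time at which one worker becomes free.
    free_at = [0.0] * workers
    heapq.heapify(free_at)
    delays = []
    for t in sorted(arrivals):
        earliest = heapq.heappop(free_at)      # next worker to free up
        start = max(t, earliest)               # wait if nobody is free yet
        delays.append(start - t)               # time spent sitting in the queue
        heapq.heappush(free_at, start + run_time)
    return delays

# Four requests arrive at once, each takes 10 s, only 2 workers available:
# the first two start immediately, the other two wait for a free worker.
print(simulate_delays([0, 0, 0, 0], run_time=10, workers=2))
```

With enough workers for the burst (here, 4), every delay in this model drops to zero, which is one way to tell queueing apart from cold starts.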
JohnDoe (OP) · 7mo ago
I'm guessing I can't control the cold start time? I don't think workers are an issue
digigoblin · 7mo ago
You can do things like enabling FlashBoot, increasing the idle timeout, adding active workers, etc. to improve cold start times. FlashBoot is the only one that's free, though.
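As a rough illustration of why a longer idle timeout helps, the sketch below models a single worker that goes cold after sitting idle too long. Everything here is an assumption for illustration: the function name `delays_with_idle_timeout`, the numbers, and the single-worker setup. It does not model FlashBoot or RunPod's actual behavior.

```python
def delays_with_idle_timeout(arrivals, run_time, cold_start, idle_timeout):
    """Toy model (hypothetical): one worker that shuts down after
    idle_timeout seconds without work and pays cold_start to come back."""
    delays = []
    free_at = None  # when the worker finished its last job; None = never started
    for t in sorted(arrivals):
        if free_at is None or t - free_at > idle_timeout:
            # Worker is cold: pay the cold start before the job can run.
            start = t + cold_start
        else:
            # Worker is still warm: start as soon as it is free.
            start = max(t, free_at)
        delays.append(start - t)
        free_at = start + run_time
    return delays

arrivals = [0, 10, 100]  # seconds; the third request comes after a long gap
print(delays_with_idle_timeout(arrivals, run_time=2, cold_start=8, idle_timeout=5))
print(delays_with_idle_timeout(arrivals, run_time=2, cold_start=8, idle_timeout=120))
```

In this model the short timeout pays the cold start again after the long gap, while the long timeout only pays it once. The trade-off is that the worker stays up (and billed) while it waits.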
JohnDoe (OP) · 7mo ago
How does FlashBoot work?
digigoblin · 7mo ago
Endpoint configurations | RunPod Documentation
Configure your Endpoint settings to optimize performance and cost, including GPU selection, worker count, idle timeout, and advanced options like data centers, network volumes, and scaling strategies.
justin · 7mo ago
Basically, the TL;DR from asking them is that it's a caching mechanism, so the more max workers you have and the more requests you get, the better the cache. If you have an active worker it's supposedly even faster, but I don't think that's necessary, because I've heard from people using it in prod that FlashBoot is still quite fast normally, even with min workers set to 0.
digigoblin · 7mo ago
Yeah, I don't have min/active workers and FlashBoot works well for me pretty often, but it doesn't work so well when I don't have a constant flow of requests.