RunPod · 10mo ago
jax

Delay Time is too long

Delay Time is too long. What is the current average Delay Time?
12 Replies
Finley · 10mo ago
Hi @jax - long delay times like that are usually from the worker needing to download a model during its cold start. If you're using a custom model you'll want to have it saved to a network volume to avoid this
ashleyk · 10mo ago
Could also be that all the workers are throttled.
jax (OP) · 10mo ago
@Finley @ashleyk Yes, my models are all downloaded during the Docker build, so it must be that all the workers are throttled.
ashleyk · 10mo ago
You should be able to see this by looking at your workers.
jax (OP) · 10mo ago
Yes, I can see that. I'm just asking: is that delay expected? It's a bad experience for users to wait that long when they use my service.
ashleyk · 10mo ago
It's expected if all your workers are throttled, or if you're doing downloads inside your workers, etc. But we have zero visibility into that, so without screenshots only you will know.
jax (OP) · 10mo ago
@Finley @ashleyk The worker has been throttled pretty badly lately.
ashleyk · 10mo ago
Do you only have one worker for your endpoint?
jax (OP) · 10mo ago
Yes. With only one worker the problem is very serious; after changing to 2, the situation is a little better.
ashleyk · 10mo ago
Well that's exactly why you shouldn't use 1 worker. When you create an endpoint, it defaults to 3 for a reason but then people change it and waste everyone's time by complaining. SIMPLY DON'T CHANGE IT!!!!!
jax (OP) · 10mo ago
Since I'm testing several Docker images and my allocation isn't enough to give each one 3 workers, I lowered it to 1. If setting it to 1 causes serious throttling issues, I think that should be stated in the documentation.
ashleyk · 10mo ago
It's pretty obvious when your worker becomes throttled. They are shared between customers. Don't ever set it to 1.
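For anyone who wants a number for their own endpoint rather than a platform-wide average, here is a minimal sketch that summarizes delay times from job records you have already collected. It assumes each record carries a `delayTime` field in milliseconds, as job status responses appear to; the function name `summarize_delay`, the `threshold_ms` parameter, and the sample data are all hypothetical.

```python
from statistics import mean


def summarize_delay(jobs, threshold_ms=5000):
    """Summarize queue delay across job records.

    Assumes each record is a dict with a "delayTime" value in
    milliseconds (an assumption about the job status payload).
    Records without that field are skipped.
    """
    delays = [job["delayTime"] for job in jobs if "delayTime" in job]
    if not delays:
        return None
    return {
        "count": len(delays),
        "avg_ms": mean(delays),
        "max_ms": max(delays),
        # Jobs that waited longer than the chosen threshold
        "slow": sum(1 for d in delays if d > threshold_ms),
    }


# Hypothetical sample records, not real measurements
sample_jobs = [
    {"id": "job-1", "delayTime": 800, "executionTime": 2100},
    {"id": "job-2", "delayTime": 45000, "executionTime": 2000},
    {"id": "job-3", "delayTime": 1200, "executionTime": 1900},
]

summary = summarize_delay(sample_jobs)
print(summary)
```

A single very slow job next to otherwise fast ones, as in the sample, is the pattern you would expect when the only available worker was throttled for that request; consistently high delays point more toward cold-start downloads.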