Delay Time is too long
Delay Time is too long. What is the current average Delay Time?
12 Replies
Hi @jax - long delay times like that usually come from the worker needing to download a model during its cold start. If you're using a custom model, you'll want to have it saved to a network volume to avoid this.
Could also be that all the workers are throttled.
@Finley @ashleyk Yes, my models are all downloaded during the Docker build, so it must be that all the workers are throttled.
You should be able to see this by looking at your workers.
Yes, I can see that. I'm just asking: is that timing expected? It's a bad experience to make users wait a long time when they are using my service.
It's expected if all your workers are throttled or you are doing downloads inside your workers, etc., but we have zero visibility into that, so without screenshots, only you will know.
@Finley @ashleyk The worker has been throttled pretty badly lately.
Do you only have one worker for your endpoint?
Yes. With only one worker the problem is very serious; after changing to 2 the situation is a little better.
Well that's exactly why you shouldn't use 1 worker. When you create an endpoint, it defaults to 3 for a reason but then people change it and waste everyone's time by complaining. SIMPLY DON'T CHANGE IT!!!!!
I'm testing several Docker images, and at 3 workers each my allocation isn't enough to run them all, so I lowered it to 1. If 1 causes serious throttling issues, I think that should be stated in the documentation.
It's pretty obvious when your worker becomes throttled. Workers are shared between customers. Don't ever set it to 1.
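For anyone who wants to put a number on the delay discussed in this thread, here is a minimal sketch that averages delay times over a batch of job records. It assumes you have already collected the records yourself; the `delay_ms` field name, the `summarize_delays` helper, and the sample values are all hypothetical and not taken from any platform API.

```python
from statistics import mean


def summarize_delays(jobs, key="delay_ms"):
    """Return (average, worst) delay in seconds for a list of job records.

    `key` is a hypothetical field name holding the delay in milliseconds;
    records without that field are skipped.
    """
    delays = [job[key] / 1000 for job in jobs if key in job]
    if not delays:
        raise ValueError("no delay values found")
    return mean(delays), max(delays)


# Hypothetical sample records, not real measurements.
sample_jobs = [{"delay_ms": 500}, {"delay_ms": 1500}, {"delay_ms": 4000}]

avg, worst = summarize_delays(sample_jobs)
print(f"average delay: {avg:.1f}s, worst: {worst:.1f}s")
```

Comparing the average and worst case before and after raising the worker count gives a rough picture of whether throttling is what is driving the wait.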