Huge sudden delay times in serverless
I'm using a WebUI Forge serverless template for my endpoint, with a network volume attached, and sometimes the results are very inconsistent. For example, in the last two results you can see the same worker handled both requests, but one has a delay time of 3s and the other 80.39s. The second request was submitted 4-5 seconds after the first, so there was no long gap between them either.
I know the Forge/Automatic1111 templates usually take time to load, but up until this Monday or Tuesday the delay time was only about 10-20 seconds, and now I'm seeing 80-90 second delays. I didn't make any changes to my code either. Does anyone know the reason for this?
7 Replies
Delay time is the time a request spends sitting in the queue.
Yeah, but if there are no requests in the queue, or if the previous request completed in 8 seconds, isn't 80 seconds too much?
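To make the queue-time vs. run-time split concrete, here is a minimal sketch that separates the two for a pair of jobs. It assumes the job status JSON carries `delayTime` and `executionTime` fields in milliseconds; the payloads below are hypothetical, with the delays taken from the numbers in the question and the 8-second execution time used purely for illustration.

```python
# Hypothetical job status payloads. The field names and the millisecond
# units are assumptions about the status JSON, not confirmed in this thread.
def summarize(job):
    """Convert queue delay and execution time to seconds."""
    delay_s = job.get("delayTime", 0) / 1000.0
    exec_s = job.get("executionTime", 0) / 1000.0
    return {"id": job.get("id"), "delay_s": delay_s, "exec_s": exec_s}

jobs = [
    {"id": "job-a", "status": "COMPLETED", "delayTime": 3000, "executionTime": 8000},
    {"id": "job-b", "status": "COMPLETED", "delayTime": 80390, "executionTime": 8000},
]

for job in jobs:
    s = summarize(job)
    print(f"{s['id']}: waited {s['delay_s']:.2f}s, ran {s['exec_s']:.2f}s")
```

Logging both numbers per request makes it easier to tell whether a slow response came from waiting or from the job itself.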
That depends on how many workers you have available; try setting at least 2.
If you're using only 1 worker, requests will sometimes have to wait for the GPU to become available.
RunPod generously loads 3 extra workers with the container image even if you've only set 2.
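As a rough illustration of the worker-count point, here is a toy queue model, not how RunPod actually schedules jobs. It assumes every worker is already warm and ignores cold starts and model loading; `queue_delays` is a hypothetical helper written for this example.

```python
import heapq

def queue_delays(arrivals, durations, workers):
    """Return how long each request waits for a free worker (toy model)."""
    free_at = [0.0] * workers  # time at which each worker becomes idle
    heapq.heapify(free_at)
    delays = []
    for arrival, duration in zip(arrivals, durations):
        earliest = heapq.heappop(free_at)  # soonest-available worker
        start = max(arrival, earliest)
        delays.append(start - arrival)
        heapq.heappush(free_at, start + duration)
    return delays

# Two 8-second requests submitted 4 seconds apart, as in the question.
print(queue_delays([0.0, 4.0], [8.0, 8.0], workers=1))
print(queue_delays([0.0, 4.0], [8.0, 8.0], workers=2))
```

Under these assumptions, a single worker makes the second request wait only until the first one finishes, and a second worker removes that wait entirely.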
I actually have 3 workers
Hm... curious, what RunPod SDK version are you using for the container?
I had 1.6.2, then switched to 1.7.2, and then even to an older version, 1.5.3. The issue remains the same.
The problem is that this never happened last week or up until the middle of this week. There were occasional delays from time to time, but they were very rare.
I think at this point the best option is to try using a pod, but that is way too costly for me.
OK, apparently 1.6.2 gives somewhat better results than the other two, so I'll be sticking with that version.
Try 1.5.3, I got even better results by downgrading further.
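When comparing SDK versions like this, it can help to confirm which version the container really has installed. A minimal sketch, assuming the SDK is installed under the package name `runpod`; `sdk_version` is a hypothetical helper, and the lookup uses only the standard library.

```python
from importlib.metadata import PackageNotFoundError, version

def sdk_version(package="runpod"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None

# Print once at worker startup so every log shows which SDK build served it.
print(f"runpod SDK version: {sdk_version() or 'not installed'}")
```

Pinning the version in the image's requirements (for example `runpod==1.6.2`) keeps a rebuild from silently pulling a different release.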