Not using cached worker
I've been running into this problem for several days now. I have an endpoint that runs a Forge WebUI worker with a network volume attached. As you know, Forge takes some time to start before it can generate anything, so when I send a request to a fresh worker there's a startup delay before it generates images.
But recently I've run into an issue where a worker is already running with WebUI Forge started and ready to accept requests, yet when I submit a new request it starts a completely new worker, which results in huge delay times. My question is: why isn't it using the already available worker that has Forge loaded?
And no, the requests weren't submitted one right after the other, so there is no reason to start a new worker
13 Replies
For anyone who needs clarification on the image: the logs are from the worker highlighted in grey, which already has Forge loaded. But as you can see, a completely new worker is being started, ignoring the worker that has Forge loaded
Is it like a 1 second difference, or did it load a long time ago?
Maybe FlashBoot doesn't work every time and keep the worker "warm" for long if your requests are sporadic, or it's your endpoint scaling type that starts a new worker
This is my endpoint scaling type, should I change anything?
A long time. For example, if I submit a request at 5:00 PM it loads Forge and runs it, and then if I submit again at about 5:03 PM, instead of running the request on the worker that already has Forge loaded, it runs it on a new worker
Yeah, maybe try Request Count, because with Queue Delay, if a job sits in the queue for more than 4 seconds while another one is still processing, it'll start a new worker
But is the worker ready to receive requests, or is it busy processing?
Yes, the worker that has Forge loaded is still available and has no requests running. I can check from the logs, and the logs say it has Forge loaded
Oh, then it has been unloaded
Or maybe RunPod's system just works that way sometimes
Try setting idle timeout to 3 minutes and 30 seconds
Or 4 minutes
If you want to make sure that happens
I'll try that. I wanted to keep costs down, but I don't want unavailable workers and long startup times either
Also, is there a way to cancel a job if the delay is too high?
Currently I cancel these manually, but it would be nice if there were an automated way to do it
Unfortunately, Enable Execution Timeout doesn't work for delay times, only execution times
Maybe from your client? Add some logic there
If you poll the status and it's still in the queue after x seconds, just cancel the job
Or just keep requests coming in
Thanks, I just looked at the documentation; I think I'll be able to do that
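For reference, here is a minimal sketch of that client-side cancel logic against the RunPod serverless REST API (/run, /status/{id}, /cancel/{id}). The endpoint ID, API key, and the 30-second queue tolerance are placeholders, not values from this thread:

```python
import time
import requests

# Placeholder values, replace with your own endpoint ID and API key.
ENDPOINT_ID = "your-endpoint-id"
API_KEY = "your-runpod-api-key"
BASE_URL = f"https://api.runpod.ai/v2/{ENDPOINT_ID}"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

MAX_QUEUE_DELAY = 30  # seconds we tolerate a job sitting in the queue


def submit_and_watch(payload):
    """Submit a job with /run, then cancel it if it stays IN_QUEUE too long."""
    job = requests.post(f"{BASE_URL}/run", json={"input": payload}, headers=HEADERS).json()
    job_id = job["id"]
    submitted_at = time.time()

    while True:
        status = requests.get(f"{BASE_URL}/status/{job_id}", headers=HEADERS).json()

        if status["status"] in ("COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"):
            return status

        # Still queued past our tolerance: cancel it from the client side.
        if status["status"] == "IN_QUEUE" and time.time() - submitted_at > MAX_QUEUE_DELAY:
            requests.post(f"{BASE_URL}/cancel/{job_id}", headers=HEADERS)
            return None

        time.sleep(2)
```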
Btw, for image generation tasks which is best: run or run_sync?
Yup
Run I guess
Always run
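To illustrate the difference (same placeholder endpoint ID and API key as above, plus a made-up prompt): /runsync is a single blocking call that holds the HTTP connection open until the job finishes, while /run returns a job ID immediately so you can poll /status yourself, which is also what makes the queue-delay cancel logic sketched earlier possible. A rough sketch:

```python
import requests

# Placeholder values, same assumptions as the earlier sketch.
ENDPOINT_ID = "your-endpoint-id"
API_KEY = "your-runpod-api-key"
BASE_URL = f"https://api.runpod.ai/v2/{ENDPOINT_ID}"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}
payload = {"input": {"prompt": "a mountain lake at sunrise"}}  # hypothetical input

# /runsync: blocks until the job completes and returns the output directly.
# Simple, but the request stays open through any cold start / Forge load time.
result = requests.post(f"{BASE_URL}/runsync", json=payload, headers=HEADERS).json()

# /run: returns right away with a job ID; poll /status/{id} for the result.
# Better suited to longer image generation jobs and client-side cancellation.
job = requests.post(f"{BASE_URL}/run", json=payload, headers=HEADERS).json()
print(job["id"])
```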