RunPod, 3mo ago
NGTK

Not using cached worker

I've been running into this problem for several days now. I have an endpoint that runs a Forge WebUI worker with a network volume attached. As you know, Forge takes some time to start, and only then generates the image. So generally, when I send a request to a worker, there is a startup delay before it generates images. But recently I've run into an issue where a worker is already running with Forge started and ready to accept requests, yet when I submit a new request, a completely new worker starts, which results in huge delay times. My question is: why isn't it using the already available worker that has Forge loaded? And no, the requests weren't submitted one after the other, so there is no reason to start a new worker.
[Image attachment, no description]
13 Replies
NGTK (OP), 3mo ago
For anyone who needs clarification of the image: the logs are from the worker highlighted in grey. It already has Forge loaded. But as you can see, a completely new worker is running, ignoring the worker that has Forge loaded.
nerdylive, 3mo ago
Is it like a 1 second difference, or was it loaded a long time ago? Maybe FlashBoot doesn't work every time and doesn't keep it "warm" for long if your requests are sporadic, or it's your endpoint scaling type that starts a new worker.
NGTK (OP), 3mo ago
This is my endpoint scaling type, should I change anything?
[Image attachment, no description]
NGTK (OP), 3mo ago
A long time ago. For example, if I submit a request at 5:00 PM, it loads Forge and runs it. Then if I submit again at about 5:03 PM, instead of running the request on the worker that has Forge loaded, it runs it on a new worker.
nerdylive, 3mo ago
Yeah, maybe try Request Count, because if a job sits in the queue for more than 4 sec while another is still processing, it'll start a new worker. But is the worker ready to receive requests, or is it still processing one?
NGTK (OP), 3mo ago
Yes, the worker that has Forge loaded is still available and has no requests running. I can check it from the logs, and the logs say it has Forge loaded.
nerdylive, 3mo ago
Oh, then it has unloaded. Or maybe RunPod's system is just that way sometimes. Try setting the idle timeout to 3 minutes and 30 seconds, or 4 minutes, if you want to make sure the loaded worker gets reused.
NGTK (OP), 3mo ago
I'll try that. I wanted to keep costs down, but I don't want unavailable workers and long startup times either.
NGTK (OP), 3mo ago
Also, is there a way to cancel a job if the delay is too high?
[Image attachment, no description]
NGTK (OP), 3mo ago
Currently I cancelled these manually, but it would be nice if there were an automated way to do it. Unfortunately, Enable Execution Timeout doesn't work for delay times, only execution times.
nerdylive, 3mo ago
Maybe from your client? Add logic for that: if you poll the status and the job is still in the queue after x seconds, just cancel it. Or just keep requests coming in.
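A rough sketch of that client-side logic, in case it helps: poll the job status and cancel if it is still queued after a cutoff. The function names (`wait_or_cancel`, `make_runpod_callbacks`) are made up for illustration, and the `/status/{job_id}` and `/cancel/{job_id}` routes and the `IN_QUEUE` status string are assumptions based on the RunPod serverless docs, so check them against the current API reference before relying on this.

```python
import json
import time
import urllib.request

# Assumed base URL for serverless endpoints; verify against the RunPod docs.
BASE = "https://api.runpod.ai/v2"


def _call(method, url, api_key):
    """Send an authenticated request and return the decoded JSON body."""
    req = urllib.request.Request(
        url, method=method, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def make_runpod_callbacks(endpoint_id, job_id, api_key):
    """Build get_status/cancel callables for one job (routes are assumptions)."""
    def get_status():
        return _call("GET", f"{BASE}/{endpoint_id}/status/{job_id}", api_key)["status"]

    def cancel():
        _call("POST", f"{BASE}/{endpoint_id}/cancel/{job_id}", api_key)

    return get_status, cancel


def wait_or_cancel(get_status, cancel, max_queue_seconds=30.0,
                   poll_interval=2.0, clock=time.monotonic, sleep=time.sleep):
    """Poll until the job leaves the queue; cancel it if it waits too long.

    Returns the first non-queued status seen, or "CANCELLED" if the cutoff
    was reached and cancel() was called.
    """
    start = clock()
    while True:
        status = get_status()
        if status != "IN_QUEUE":
            return status  # a worker picked it up, or it already finished
        if clock() - start >= max_queue_seconds:
            cancel()
            return "CANCELLED"
        sleep(poll_interval)
```

Usage would be something like `get_status, cancel = make_runpod_callbacks(endpoint_id, job_id, api_key)` followed by `wait_or_cancel(get_status, cancel, max_queue_seconds=20)`. Once it returns `IN_PROGRESS`, keep polling as usual for the result. The cutoff only covers queue time, which is the part Execution Timeout does not handle.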
NGTK (OP), 3mo ago
Thanks, I just looked at the documentation and I think I'll be able to do that. By the way, for image generation tasks, which is best: run or run_sync?
nerdylive, 3mo ago
Yup. Run, I guess. Always run.
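For context on the difference, here is a small sketch of how the two calls might look at the HTTP level. It assumes the asynchronous route is `/run` (returns a job id right away, which you then poll) and the synchronous one is `/runsync` (holds the connection open until the result is ready). `build_request` is a hypothetical helper, and the input fields are placeholders rather than anything Forge-specific.

```python
import json
import urllib.request

# Assumed base URL for serverless endpoints; verify against the RunPod docs.
BASE = "https://api.runpod.ai/v2"


def build_request(endpoint_id, api_key, job_input, sync=False):
    """Build (but do not send) a job submission request.

    sync=False targets the assumed async /run route; sync=True targets /runsync.
    """
    path = "runsync" if sync else "run"
    body = json.dumps({"input": job_input}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE}/{endpoint_id}/{path}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Example: an async submission; sending it would be urllib.request.urlopen(req).
req = build_request("my-endpoint-id", "my-api-key", {"prompt": "a cat"})
```

With a slow-starting worker like Forge, the async route means the client never has to hold a connection open through a cold start, and it leaves room to cancel a job that sits in the queue too long.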
