Is the RunPod UI accurate when it says all workers are throttled?
To be honest, I can't tell whether what I'm seeing is correct. I have two endpoints, both with a max of 3 workers, and the UI says every GPU is throttled. I can't test right now, but why would it fall into this state, and is it accurate?
Worker Ids:
ugv9p9kcxlmu1c
5snyuonk8vkisq
Hopefully it clears up later when I can test it, but it makes me wonder: if I send a request while it says this, will the GPUs be unthrottled? Is this expected behavior that can occur?
Is it that sending a request pushes my GPUs higher in priority, and they get throttled when not in use?
Just trying to understand this so I don't start sending requests one day and find all my GPUs are throttled.
4 Replies
When all GPUs are throttled, your request will sit in the queue and a worker will start as soon as one becomes available.
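For what it's worth, here is a minimal sketch of what that looks like from the client side, assuming the runpod Python SDK; the API key, endpoint ID, and payload are placeholders:

```python
import runpod

runpod.api_key = "YOUR_RUNPOD_API_KEY"  # placeholder

# Placeholder endpoint ID for illustration
endpoint = runpod.Endpoint("YOUR_ENDPOINT_ID")

# Submit a job; if every worker is currently throttled, the job just waits in the queue
job = endpoint.run({"input": {"prompt": "hello"}})
print(job.status())  # typically IN_QUEUE until a worker becomes available

# Block until a worker picks the job up and finishes it (or the timeout is hit)
print(job.output(timeout=120))
```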
I see, interesting. If I keep a min of one worker, is that the best way to counter it in prod?
yes
Thank u! Perfect
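As an illustration of the min-worker suggestion above, here is a sketch using the runpod Python SDK's create_endpoint helper; the workers_min/workers_max parameter names are an assumption, and the same setting can also just be changed in the endpoint's settings in the console:

```python
import runpod

runpod.api_key = "YOUR_RUNPOD_API_KEY"  # placeholder

# Assumption: create_endpoint accepts workers_min/workers_max. Keeping
# workers_min at 1 means one worker stays active instead of scaling to zero,
# so requests don't sit waiting for a throttled or cold worker.
new_endpoint = runpod.create_endpoint(
    name="my-endpoint",          # hypothetical endpoint name
    template_id="TEMPLATE_ID",   # placeholder template ID
    workers_min=1,
    workers_max=3,
)
print(new_endpoint)
```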