RunPod•13mo ago

All workers throttled even though it shows medium availability?

When we created a serverless endpoint, we noticed that none of our requests were being processed. When we looked inside the endpoint, we saw that all the workers were throttled. However, the availability status for these machines still shows them as available. How can we solve this?

8 Replies

DenizOP•13mo ago

It has been waiting like that for more than 30 minutes.

justin•13mo ago

@Deniz This can happen if a company eats up all the GPUs during a huge spike in demand. It's something RunPod is working on. By the way, I'm just a community member, lol; I had asked about this too. The way I handle it: if I'm in this situation, I set a minimum worker count and use the /health endpoint to see whether I was able to steal back a GPU. I haven't written code to do this dynamically yet, but RunPod has a GraphQL API where you can update your endpoints live.
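A minimal sketch of the /health check described above, not an official RunPod client. It assumes the health route is `https://api.runpod.ai/v2/<endpoint_id>/health` with a Bearer API key, and that the response carries a `workers` object with counts such as `idle`, `ready`, `running`, `initializing`, and `throttled`. Those field names and the helper names `fetch_health` and `all_throttled` are assumptions, so check them against the current RunPod docs before relying on them.

```python
import json
import urllib.request


def fetch_health(endpoint_id: str, api_key: str, timeout: float = 10.0) -> dict:
    """Fetch the endpoint's health payload (URL shape is an assumption)."""
    url = f"https://api.runpod.ai/v2/{endpoint_id}/health"
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def all_throttled(health: dict) -> bool:
    """Return True when at least one worker is throttled and none are usable.

    The worker state names below are assumed; missing keys count as zero.
    """
    workers = health.get("workers", {})
    throttled = workers.get("throttled", 0)
    usable = sum(
        workers.get(state, 0)
        for state in ("idle", "ready", "running", "initializing")
    )
    return throttled > 0 and usable == 0
```

One could call `all_throttled(fetch_health(endpoint_id, api_key))` on a timer and, when it returns True, raise the minimum worker count or switch GPU tier by hand until capacity returns.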

flash-singh•13mo ago

have you limited this to a DC / network volume?

DenizOP•13mo ago

@flash-singh this endpoint is not limited to any network volume

flash-singh•13mo ago

I see that it's better now. This is something we have to get better at: moving workers when they all get throttled at once.

DenizOP•13mo ago

This is a very common situation right now!

DenizOP•13mo ago

Do we have any workaround for this throttling issue?

ashleyk•13mo ago

The 16GB GPU tier has low availability; it's probably better to switch to the 24GB tier.
