RunPod · 11mo ago
blistick

What does "throttled" mean?

My endpoint dashboard sometimes shows "1 Throttled" worker, and 0 other workers, except for queued ones. What does the "throttled" status mean, and how do I prevent the condition?
6 Replies
Solution
justin · 11mo ago
From my understanding, and this is by no means official: "Throttled" means that other services are using the GPU. I recommend setting at least 2 max workers (RunPod will then allocate 5 workers on your endpoint). Any of those workers can potentially pick up jobs, but the number working at the same time never exceeds the max you chose. There is no way to prevent throttling unless you set a minimum number of workers to always be active. In my experience, throttling can also happen when there are issues with RunPod itself, but that is rarer. You can poll the /health endpoint to check that your endpoint has idle or active workers ready.
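As a rough illustration of the /health check suggested above, here is a minimal Python sketch. The URL pattern, the Bearer-token header, and the response field names (`workers.idle`, `workers.running`, `workers.throttled`, `jobs.inQueue`) are assumptions based on how the serverless health response is commonly described, so verify them against the current RunPod documentation. The helper names `fetch_health` and `summarize_workers` are hypothetical.

```python
import json
import urllib.request

# Assumed URL pattern for a serverless endpoint's health route; check the docs.
HEALTH_URL = "https://api.runpod.ai/v2/{endpoint_id}/health"


def fetch_health(endpoint_id: str, api_key: str, timeout: float = 10.0) -> dict:
    """Fetch the raw health payload for one endpoint (performs a network call)."""
    request = urllib.request.Request(
        HEALTH_URL.format(endpoint_id=endpoint_id),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def summarize_workers(health: dict) -> dict:
    """Reduce a health payload to the numbers discussed in this thread.

    Field names are assumptions; missing fields are treated as 0.
    """
    workers = health.get("workers", {})
    jobs = health.get("jobs", {})
    usable = workers.get("idle", 0) + workers.get("running", 0)
    queued = jobs.get("inQueue", 0)
    return {
        "usable": usable,
        "throttled": workers.get("throttled", 0),
        "queued": queued,
        # "stalled": jobs are waiting but no idle or running worker exists,
        # which matches the "1 Throttled, 0 other workers" situation.
        "stalled": usable == 0 and queued > 0,
    }
```

Calling `summarize_workers(fetch_health(endpoint_id, api_key))` on a schedule and alerting when `stalled` is true would at least surface the condition, although it does not prevent throttling.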
justin · 11mo ago
I was once in a state where all of my workers were throttled, and I was very confused. This is quite rare, but it happens. Here is where I asked about it at the time, confused about why everything was throttled: https://discord.com/channels/912829806415085598/1187367253201657918/1187367253201657918
ashleyk · 11mo ago
If you are using the RO region with network storage, capacity has been greatly reduced since yesterday. All my 24GB workers in the RO region have been constantly throttled since then.
blistick (OP) · 11mo ago
@justin @ashleyk Thank you both very much for this advice. To summarize, it seems I should (a) have at least 2 max workers, and (b) enable as many regions as possible for my endpoint. @justin, I followed your previous advice about improving worker startup time by NOT using a network drive (which really helped, btw), but I forgot to edit my endpoint to allow more regions. Constant throttling is rather scary from a production standpoint.
justin · 11mo ago
It is. Unfortunately, I don't know what to do either 🥲. Official guidance would definitely be good. I think you can set a minimum worker count to guarantee availability, but that costs us for uptime even at a 40% discount. I know there is an update GraphQL endpoint, but I've never found out what happens if I set a minimum of 1 worker while everything is throttled.
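To make the cost concern concrete, here is a back-of-the-envelope sketch in Python. Only the 40% discount comes from this thread. The per-second rate is a made-up placeholder, and billing an always-on worker for every second of the period is an assumption, so substitute your GPU's actual price and check how active workers are billed.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def always_on_cost(rate_per_second: float, discount: float = 0.40, days: int = 30) -> float:
    """Estimated cost of keeping one worker active for `days` full days.

    `discount` is the fraction taken off the normal rate (0.40 = 40% off).
    """
    return rate_per_second * (1.0 - discount) * SECONDS_PER_DAY * days


# Hypothetical list price per second; not a real RunPod price.
HYPOTHETICAL_RATE = 0.0004

monthly = always_on_cost(HYPOTHETICAL_RATE)
print(f"One always-on worker for 30 days: {monthly:.2f}")
```

With the placeholder rate this comes to roughly 622 per 30 days, in whatever currency the rate is quoted in, which is the kind of fixed cost being weighed here against occasional throttling.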
blistick (OP) · 11mo ago
Yes, official guidance would be good. Like you, I don't want to incur the cost of an always-active worker.