Serverless GPU low capacity
I'm finding it almost impossible to use the serverless endpoints because no GPUs are available. I have a network volume in Romania, so I need GPUs in the same region. The endpoint spends ages throttled ("throttled: Waiting for GPU to become available."), and when a worker eventually comes online it goes offline again soon after, even with 'Idle timeout' set to an hour.
Is this a common state, or is it just unusually busy right now?
Does RunPod have plans to increase capacity, considering it's in such large demand?
5 Replies
RunPod has plans to increase capacity, but it takes time. I created a new network volume in a different region and a new endpoint, because all my workers in the RO region also became throttled due to low availability.
Can you copy a volume from one region to another?
No, you have to attach each volume to a pod and transfer the data from one to the other via that pod. I sync all my models to the Hugging Face Hub so that I can keep the network volumes for all my endpoints in sync.
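For what it's worth, the pod-side copy can be sketched like this. The mount paths are placeholders (I'm using two local directories to stand in for the attached volumes), not RunPod-specific paths:

```shell
# Hypothetical sketch: on a pod with both network volumes attached,
# copy the models directory from the old volume to the new one.
# In practice the mounts would be something like /workspace and /new-volume;
# here, local directories stand in for them.
OLD_VOL=./old-volume
NEW_VOL=./new-volume

# Simulate existing data on the old volume
mkdir -p "$OLD_VOL/models" "$NEW_VOL"
echo "demo weights" > "$OLD_VOL/models/model.safetensors"

# Copy the directory tree across; for large model files on a real pod,
# rsync -a --progress is handy because it can resume interrupted transfers.
cp -r "$OLD_VOL/models" "$NEW_VOL/"

ls "$NEW_VOL/models/"
```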
If I were to bake the models into an image, would that mean I wouldn't need a network volume, and therefore the workers could spin up in any region with available capacity? I can't find any definitive info on the benefits of volumes vs. baked images.
Yes, this is correct. Using network storage binds you to a region.
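A baked image can be as simple as downloading the weights at build time. This is just a sketch, not an official RunPod template: the base image tag, model URL, and handler filename are all placeholders you'd swap for your own.

```dockerfile
# Placeholder base image; use whichever CUDA base your worker needs.
FROM runpod/base:example-cuda-tag

# Bake the model weights into the image at build time,
# so no network volume (and no region binding) is needed.
RUN mkdir -p /models && \
    wget -O /models/model.safetensors \
      https://example.com/path/to/model.safetensors

# Placeholder serverless handler.
COPY handler.py /handler.py
CMD ["python3", "-u", "/handler.py"]
```

The trade-off: a baked image can run in any region, but large weights make the image bigger (slower cold pulls), and updating a model means rebuilding and redeploying the image, whereas a volume lets you swap files without a rebuild.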