RunPod•14mo ago

24GB PRO availability in RO

I switched from the 24GB tier in RO to 24GB PRO to benefit from the higher availability of the 4090s in RO, but most of my workers are becoming throttled again.

22 Replies

flash-singh•14mo ago

i would mix them, 4090s get relatively high spikes

ashleykOP•14mo ago

I've never seen that priority thing actually work, though. Even if all my workers become throttled, it doesn't initialize the 2nd choice; they just stay throttled.

flash-singh•14mo ago

what it will do is pick the gpu that is available and split between the two based on availability

J.•14mo ago

just wondering, i am trying to make a new endpoint and instantly all 5 workers are throttled before initialization, so i had to add new endpoints and hopefully i can get some unthrottled workers to just initialize. but why would it state high availability if i just immediately get throttled on initialization? what does high availability mean then?

JM•14mo ago

@justin how long are you waiting after setting up an endpoint? Initial setup does take a decent amount of time. Also, are you using 10+ max workers?

J.•14mo ago

Been a weird situation. I've been launching endpoints, but when one hits idle and I send a request, it just starts downloading again. So I've been deleting it, thinking maybe I need to wait for all my docker push iterations in the background to settle down; maybe conflicting hashes are causing redownloads. https://discord.com/channels/912829806415085598/1208257003131113502 Usually I wait about 10-20 mins right now to see if it works. I'm also trying to solve a bug where my worker works on a GPU pod but something makes it crash on serverless. And no, I'm just at 3 max workers, so it spins up 5 potential workers. I don't want to spin up 10+ max workers, because I don't have enough limits to waste workers like that. But yeah, to answer the question: usually about 10-20 mins, and I check whether it has switched from an initializing state to idle.

JM•14mo ago

@justin Use 10, I give you full permission 😊

J.•14mo ago

can i get an upgrade on worker limits at some point haha, but ok

JM•14mo ago

Personally, I like putting 10, with 1-2 active workers, for the initial setup

J.•14mo ago

i see. why does that change things? is it just to capture some good gpus to initialize?

JM•14mo ago

Then, send some requests and check whether they were processed. If so, remove the active workers

J.•14mo ago

ah got it good to know huh

JM•14mo ago

Simply my own opinion of an efficient way of checking a new endpoint. I am far from being an expert though, don't get me wrong haha. What's your endpoint ID? I can check it out
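The "send some requests, check if those processed" step above could be scripted roughly as follows. This is only a sketch under assumptions: it assumes RunPod's serverless REST API accepts a synchronous job at `https://api.runpod.ai/v2/<endpoint id>/runsync` with a bearer API key and an `{"input": ...}` body, which should be verified against the current RunPod documentation. The function names and the example payload are hypothetical and were not part of the thread.

```python
import json
import urllib.request

# Assumption: RunPod's serverless REST API exposes a synchronous run route at
# this URL pattern. Verify against the current RunPod docs before relying on it.
RUNSYNC_URL = "https://api.runpod.ai/v2/{endpoint_id}/runsync"


def build_test_request(endpoint_id: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying one test job."""
    # Assumption: the worker expects its arguments under the "input" key.
    body = json.dumps({"input": payload}).encode("utf-8")
    return urllib.request.Request(
        RUNSYNC_URL.format(endpoint_id=endpoint_id),
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_test_request(req: urllib.request.Request, timeout: float = 120.0) -> dict:
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `send_test_request(build_test_request(endpoint_id, api_key, {"prompt": "hello"}))` a few times while 1-2 active workers are set gives a quick signal on whether jobs are processed before the active workers are removed. The `{"prompt": "hello"}` payload is a placeholder; the real input shape depends on the worker's handler.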

J.•14mo ago

AH it finally works. Nah, it's all good xD I just ended up widening the GPU selection so it's not just 4090s. Before, I only had it on 24GB PRO / 4090 because it said high availability and I didn't want to run into an out-of-memory error, but what fixed it just now was extending the options.

JM•14mo ago

Well, even just 24gb pro should work

J.•14mo ago

interesting

JM•14mo ago

But use more max workers, trust me!

J.•14mo ago

ok haha i guess im just running out of workers as i deploy more 😭

JM•14mo ago

If you activate FlashBoot, it doesn't work very well with a small number of max workers

J.•14mo ago

but good to know

JM•14mo ago

It gets exponentially better with more workers. Give me your ID, I will give you more 😂

flash-singh•14mo ago

our 4090s in eu-ro come in 2x or 3x servers, so they fill up easily and cause throttling. 8x servers are better, but sadly 8x 4090 servers are not easy to come by
