Lost my GPU and forced to pay more?
Hoping someone can help ASAP. Started a pod yesterday, a cheaper GPU one (less than $0.5/hr). I exited it last night (to avoid incurring further costs, because it's not at all clear whether you pay for non-usage), started it again this morning, and saw this message:
"Start your pod without GPUs. This is useful for debugging non gpu-related problems or transferring data. If you have a volume configured, it will be retrieved and mounted. The price for this instance is $0.195/hour + disk costs."
It gives me a link to the docs, to this:
"Why do I have zero GPUs assigned to my Pod?
If you want to avoid this, using network volumes is the best choice. Read about it here.
Learn how to use them here.
Most of our machines have between 4 and 8 GPUs per physical machine. When you start a Pod, it is locked to a specific physical machine. If you keep it running (On-Demand), then that GPU cannot be taken from you. However, if you stop your Pod, it becomes available for a different user to rent. When you want to start your Pod again, your specific machine may be wholly occupied! In this case, we give you the option to spin up your Pod with zero GPUs so you can retain access to your data."
What is going on? Did I just waste a whole day setting things up for Runpod to act like a cheap airline?
I'm not happy at all. I need to finish a project, and I thought Runpod was a reliable service.
Can anyone suggest an alternative to Runpod that I could move to so I can get on with my work?
Sorry for the rant, but this is very poor business practice.
19 Replies
I don't think they can help, because Runpod can't just clear out other users' demand on that specific machine. But you can wait until it frees up, or next time use network storage and store your data in there (/workspace).
What do you think Runpod should say when you stop the pod? Should the website warn that the pod may not be resumable because of this, or do you have other ideas?
You may not be forced to pay more: with the CPU-only option that's currently available (0 GPUs), you can move your data to another pod, whether it uses network storage or not.
No worries about the rant. What kind of alternative are you looking for?
Thanks for the reply.
What I don't understand is how we are supposed to work. I've lost hours of setup from yesterday, I can't show a client the service, nor can I do any improvements.
I'm new to this service, but not new to the industry. Of all the hosting and servers I've used, this is the first time I've had access to a service removed, with no ability to get it back (no GPUs available for my pod).
I guess Runpod are the server version of cheap delivery services. They look good until you have to use them for important things.
Anyway, I guess no-one here can help me. Would be great to know if anyone can recommend an alternative service with proper practices.
You can still get your data back with CPU only, right?
And move it to another pod with a GPU, or to cloud storage.
There's even a cloud sync button on the website, so it's easy to upload data to a few cloud providers.
I'm just checking that. My issue is that, even if I can do that now, Runpod is forcing customers to pay 24/7. That makes no sense when it's sold per hour. Otherwise I'll never know day to day if it will work.
Maybe you're looking for something like AWS, which offers the same kind of services as Runpod. I'm sure they're more reliable in this respect (they have a bunch of GPUs available), but I'm not sure they have the same GPU models as here. You can "stop" instances there and still get them back any time, but it charges you the same price, or you can use EBS (the equivalent of "network storage" here) and terminate the instance completely.
Yes, you're paying only for the storage.
My issue with all this, other than the hours lost yesterday setting all this up on Runpod, is that they sell it per hour, knowing that if you don't pay 24/7 you're screwed. That's my impression anyway.
Yes, you rent cloud storage, and that's what you pay for 24/7 when you don't use the instance. But you can also delete it to not get charged for anything, just like with other cloud providers.
Yes, getting that now. So basically this is no good for production
What kind of service do you think is good for production?
Just for reference, when starting my pod I was told no GPUs of my type are available. But if I look at new pods, there it is. So Runpod are not being honest, again at least that is my impression as someone new to it
It's got to be reliable, and if hourly priced, not forcing the user to pay 24/7 or else they lose it. If that's the case, they should play fair and charge monthly (or at least make it really clear this is just a temp service for testing).
I think it's production ready; I don't get why this isn't reliable. Hourly pricing is more granular, meaning that if you only use it for 6 minutes, you're only charged for those minutes, which is better than being forced to pay for a month.
If you don't want your data to be stored, you can just delete the pod, which deletes the storage.
Here's the explanation. If it's hard to understand, let me know and maybe I can help make it simpler.
Thanks, yes, makes perfect sense. Just also makes it entirely pointless for production use. I just did not know this yesterday. I'm looking for a new service. Shame, as Runpod experience was great until this morning.
It means my low-spec GPU instance would be over $300 a month with the reduced offer of committing to the month.
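For a rough sanity check on that figure, here is the arithmetic as a small Python sketch. The $0.50/hr rate is only the upper bound mentioned in the first post ("less than $0.5/hr"), and the 30-day month and the 4-hours-a-day pattern are assumptions for illustration, not actual Runpod pricing. Storage costs are left out.

```python
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30  # assumption: a 30-day billing month


def monthly_cost(hourly_rate: float, hours_per_day: float,
                 days: int = DAYS_PER_MONTH) -> float:
    """Cost of running a pod `hours_per_day` hours a day for `days` days."""
    return round(hourly_rate * hours_per_day * days, 2)


gpu_rate = 0.50  # assumption: upper bound quoted in the original post

always_on = monthly_cost(gpu_rate, HOURS_PER_DAY)  # pod never stopped
part_time = monthly_cost(gpu_rate, 4)              # hypothetical: 4 hours a day

print(f"24/7: ${always_on:.2f} per month, 4 h/day: ${part_time:.2f} per month")
```

At that assumed rate, keeping the pod running around the clock lands in the same range as the "over $300 a month" mentioned above, while paying only for the hours actually used costs a fraction of that.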
Wait, which services did you use, and for how many hours? I'm expecting this kind of format:
secure cloud
3 hours active gpu: rtx 4090
6 hours inactive (stopped, not terminated)
storage: 120 gb
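A usage summary in that format can be turned into a rough estimate with a few lines of Python. This is only a sketch: every rate below is a hypothetical placeholder rather than a real Runpod price, and the assumption that storage is billed at different rates while the pod is running and while it is stopped should be checked against the pricing page.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    active_gpu_hours: float  # hours the pod ran with a GPU attached
    stopped_hours: float     # hours the pod was stopped, not terminated
    storage_gb: float        # disk or volume size


@dataclass
class Rates:
    # All three values are hypothetical placeholders.
    gpu_per_hour: float
    storage_per_gb_hour_running: float
    storage_per_gb_hour_stopped: float


def estimate(usage: Usage, rates: Rates) -> dict:
    """Split the bill into GPU time and storage while running or stopped."""
    gpu = usage.active_gpu_hours * rates.gpu_per_hour
    running = (usage.active_gpu_hours * usage.storage_gb
               * rates.storage_per_gb_hour_running)
    stopped = (usage.stopped_hours * usage.storage_gb
               * rates.storage_per_gb_hour_stopped)
    return {
        "gpu": round(gpu, 4),
        "storage_running": round(running, 4),
        "storage_stopped": round(stopped, 4),
        "total": round(gpu + running + stopped, 4),
    }


# The example from the format above: 3 h active, 6 h stopped, 120 GB.
result = estimate(Usage(3, 6, 120), Rates(0.70, 0.0001, 0.0002))
print(result)
```

Swapping in the real rates shows how much of the bill comes from GPU time and how much from storage that keeps billing while the pod is stopped.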
So it seems I'd need to do this on a start and stop basis, changing pods at times. I'm not sure I would want to deal with that each day, not even knowing if a GPU pod is available.
No, you don't have to start and stop, but it's the same as what you did when you stopped the pod in this case.
If you use a GPU pod without ns (network storage), you're stuck on one physical machine, as in the screenshot above, or...
If you use a GPU pod with ns, you can use the other physical machines and still access your data, so the pool is larger.
Supply across a whole datacenter is much more stable than on a single physical machine.
Thanks. Will take a look at ns this time. I might try to just stick with a CPU, but not sure it will be enough.
You can move your data to another pod: use the cloud sync feature (button on the website), or use something like rclone or rsync to move files to another pod over the internet.
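For the rsync route, a minimal Python sketch of the copy from the old (CPU-only) pod to a new one could look like this. The host, port, and user are made-up placeholders (use the SSH details shown for the destination pod), and it assumes rsync and SSH access are available on both pods and that the data lives under /workspace.

```python
import shlex
from typing import List


def build_rsync_command(host: str, port: int,
                        src: str = "/workspace/",
                        dest: str = "/workspace/",
                        user: str = "root") -> List[str]:
    """Build an rsync-over-SSH command that copies `src` to the remote pod."""
    ssh = f"ssh -p {port}"  # rsync runs over SSH on the given port
    return [
        "rsync",
        "-avz",        # archive mode, verbose, compress in transit
        "--partial",   # keep partially transferred files if interrupted
        "-e", ssh,
        src,           # trailing slash: copy the contents of the directory
        f"{user}@{host}:{dest}",
    ]


# Placeholder address and port, not a real pod.
cmd = build_rsync_command("203.0.113.10", 22022)
print(shlex.join(cmd))
```

To actually run it from the source pod, pass the list to `subprocess.run(cmd, check=True)`. Building the command as a list avoids shell quoting problems with the `-e "ssh -p ..."` argument.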
To avoid running into a situation where no GPUs are available, it’s best to use a network volume to store your data and settings.