Why are there no GPUs in the Canada data center today?
My network volume is in ca-mtl-1, and there are no GPUs available there right now.
20 Replies
Read #🚨|incidents, it's scheduled for maintenance, that's why
@haris @Finley it's been more than 4 hours since the outage started. aren't you going to declare an incident and give some updates? looking at the green status on https://uptime.runpod.io, I suspect that your monitoring has not caught this issue.
@Madiator2011 (Work) any idea about this?
@nerdylive It's because RunPod disables the DC before maintenance is about to begin, probably because people don't read and then they log unnecessary support tickets.
Oh long before
read where?
I already mentioned this elsewhere, but @fireice being an idiot and giving me a thumbs down already proves my point.
Oof
where's that info from btw
I know this from previous experience, Zeen or someone like that mentioned it.
Ooh like days before?
Yes
ic ic yeah thats probably it
No point in allowing someone to create a pod and have training that runs for days and gets interrupted
yeah correct hahah, but next time it should be posted in #🚨|incidents too when pod creation is going to be disabled
Yeah agreed, RunPod communication is ALWAYS appalling, it's about 1% better but still has a LONG way to go
just got an email response from them confirming what @digigoblin says. they are disabling the creation of new machines.
the moral: as there is no way to clone network volumes (correct me if I'm wrong), you'd better continuously make backups using https://syncthing.net/ or something like that.
Yep, I guess this is the point about the lack of communication: people need to know a few days in advance when a DC is going to be taken offline for maintenance so that they can start migrating their data to a different DC. When #🚨|incidents says it's only going to be offline for maintenance on Monday, but no new pods can be created 4 days ahead of time, then it's a problem because people can't access their data to make alternate arrangements. @haris
I mean you should always make a backup when you upload data, as the cloud is basically someone else's computer
Solution
Hey y'all, we disable the creation of new pods four days before maintenance to stop further issues (this was not something I was personally aware of until now; otherwise it would have been posted in #🚨|incidents). However, I talked with the team and you should be able to create new pods again. Let me know if you're running into any issues.
But will the maintenance still be carried out on the same schedule?
Yep, as far as I know, but I will double-check