RunPod•4w ago

Why are there no GPUs in the Canada data center today?

My network volume is in ca-mtl-1, and there are no GPUs available there now.
20 Replies
digigoblin•4w ago
Read #🚨|incidents, it's scheduled for maintenance, that's why
moez4921•4w ago
@haris @Finley it's been more than 4 hours since the outage started. aren't you going to declare an incident and give some updates? looking at the green status on https://uptime.runpod.io, I suspect that your monitoring has not caught this issue.
nerdylive•4w ago
@Madiator2011 (Work) any idea about this?
digigoblin•4w ago
@nerdylive It's because RunPod disables the DC before maintenance is about to begin, probably because people don't read and then they log unnecessary support tickets.
nerdylive•4w ago
Oh, long before? Where did you read that?
digigoblin•4w ago
I already mentioned this elsewhere, but @fireice being an idiot and giving me a thumbs down already proves my point.
nerdylive•4w ago
Oof, where's that info from btw?
digigoblin•4w ago
I know this from previous experience, Zeen or someone like that mentioned it.
nerdylive•4w ago
Ooh like days before?
digigoblin•4w ago
nerdylive•4w ago
ic ic yeah that's probably it
digigoblin•4w ago
No point in allowing someone to create a pod and have training that runs for days and gets interrupted
nerdylive•4w ago
yeah correct hahah, but next time it should be on #🚨|incidents too when it's gonna be disabled
digigoblin•4w ago
Yeah agreed, RunPod communication is ALWAYS appalling, it's about 1% better but still has a LONG way to go
moez4921•4w ago
just got an email response from them confirming what @digigoblin says. they are disabling new machine creations. the moral: as there is no way to clone network volumes (correct me if I'm wrong), you'd better continuously make backups using https://syncthing.net/ or something like that.
digigoblin•4w ago
Yep, I guess this is the point about the lack of communication: people need to know a few days in advance when a DC is going to be taken offline for maintenance so that they can start migrating their data to a different DC. When #🚨|incidents says it's only going to be offline for maintenance on Monday, but no new pods can be created 4 days ahead of time, then it's a problem because people can't access their data to make alternate arrangements. @haris
Madiator2011 (Work)•4w ago
I mean you should always make a backup when you upload data, as the cloud is basically someone else's computer
haris•4w ago
Hey y'all, we disable the creation of new pods four days before a maintenance to stop further issues (this was not something I was personally aware of until now otherwise it would have been posted in #🚨|incidents). However, I talked with the team and you should be able to create new pods again, let me know if you're running into any issues.
nerdylive•4w ago
But the maintenance will still be carried out on the same schedule?
haris•4w ago
Yep, as far as I know but I will double check