Railway•7mo ago
nickmacavoy

Outage?

Are you experiencing any issues? We're in Singapore. Just checking the public channel since private support isn't responding. Builds aren't working and 5 different environments are down. Project ID: 3c08e827-8d73-4a37-bbe9-9af9757bd354
Solution:
New reply sent from Help Station thread:
Fix implemented. Resolved.
You're seeing this because this thread has been automatically linked to the Help Station thread....
25 Replies
Percy•7mo ago
Project ID: 3c08e827-8d73-4a37-bbe9-9af9757bd354
raleng•7mo ago
We have a service down as well in Singapore.
nickmacavoy (OP)•7mo ago
Sad state of affairs on our production infrastructure
Duchess•7mo ago
New reply sent from Help Station thread:
Same here - nothing is responding at the moment
nickmacavoy (OP)•7mo ago
Ping
Brody•7mo ago
please check #🚨|incidents for updates
nickmacavoy (OP)•7mo ago
Thanks Brody! I will, now that there's one there. On an attempted redeploy I got: "no available stackers found within resource limits"
Duchess•7mo ago
New reply sent from Help Station thread:
Hi Nick, please stand by. We are investigating; an incident has been called.
New reply sent from Help Station thread:
Thanks David, adding some context where I have it in case it helps debugging.
New reply sent from Help Station thread:
We came back online ~30 mins ago. Now we're back offline as of ~4 mins ago.
New reply sent from Help Station thread:
Our apps and services are still down as well. We tried migrating to the US region, no luck.
New reply sent from Help Station thread:
Please help, I can't connect to the Postgres DB anymore.
New reply sent from Help Station thread:
Still down for us too.
New reply sent from Help Station thread:
Do you have a backup? I'm thinking of migrating the database to another provider.
New reply sent from Help Station thread:
Don't be too hasty: this should be resolved soon (given how long it took last time), though I'm not aware of your requirements. At a certain point that would have to be an option, but we won't do it just yet.
New reply sent from Help Station thread:
Starting to see our services up now...
New reply sent from Help Station thread:
Thanks partbot, trying to redeploy but no luck as yet. I'll also check in when we're up.
New reply sent from Help Station thread:
Update: Partial recovery, 50% of capacity restored. Actively working on the rest. Thanks for your patience, on-call team working as swiftly as possible to restore service.
New reply sent from Help Station thread:
Thanks David
jtechbit•7mo ago
ETA on full capacity restoration?
Duchess•7mo ago
New reply sent from Help Station thread:
Thanks David and team
jtechbit•7mo ago
Time for another update? Just a reminder that people have production infrastructure that is affected.
RenderCoder•7mo ago
I just deployed services in the Singapore region and encountered a similar issue. Unable to deploy service successfully
Duchess•7mo ago
New reply sent from Help Station thread:
Still down. I'm trying regularly to redeploy, to no avail.
jtechbit•7mo ago
The level of communication from Railway on this incident is totally unacceptable. I hope processes can be improved as a result of the post-mortem. Even just a “we are continuing to work on it” would give some confidence an on-call team is actually working on this…
nickmacavoy (OP)•7mo ago
My production systems have been down for 4 hours in this outage, and 6 hours 15 minutes in total today. So far.
Duchess•7mo ago
New reply sent from Help Station thread:
Update: The core issue has been identified and a resolution is in progress to restore service. The on-call team is working to roll it out.
nickmacavoy (OP)•7mo ago
I'm online now. Redeploying worked
Duchess•7mo ago
New reply sent from Help Station thread:
4 out of my 5 services redeployed properly. One still hasn't recovered. It might take a while longer for the fix to be rolled out.
nickmacavoy (OP)•7mo ago
Almost 11pm here, going to be a nervous night's sleep given the day of issues. Thanks for getting it resolved team. Echoing jtechbit – not enough comms given the severity
Duchess•7mo ago
New reply sent from Help Station thread:
Thanks for the feedback, acknowledged. That's on me personally for not communicating more. We've had the full on-call team on this (with several additional engineers joining) for as many hours as service has been down.
New reply sent from Help Station thread:
Full service restoration in sight.
Solution
Duchess•7mo ago
New reply sent from Help Station thread:
Fix implemented. Resolved.
jtechbit•7mo ago
Thank you for the update David! My services are now responding normally.
Duchess•7mo ago
New reply sent from Help Station thread:
We've published a full incident retro here: https://blog.railway.app/p/2024-05-04-incident-report
Railway Blog
Incident Report: May 4th, 2024
We recently experienced an outage on our platform that partially affected our Asia-Southeast compute infrastructure and caused workloads to be unreachable. When production outages occur, it is Railway’s policy to share the public details of what occurred.