Railway9mo ago
Tista

Experiencing Intermittent Network Issues

Hi Railway, our deployments are experiencing intermittent network issues when connecting to other services that are also deployed on Railway. It doesn't matter whether we call the other services through their public hostname or their private hostname: for the same request, we sometimes get a 200 and sometimes a 503. This is making our deployments highly unstable, and our monitoring has reported numerous downtimes because of it. I took screenshots of our resource usage, and it's very low.
Here are our projects:
https://railway.app/project/56943544-81c5-486d-8741-d8cca3f88ed1
https://railway.app/project/2f705f5a-06d6-45ac-871e-f5b0a7690fa7
https://railway.app/project/2119123b-28c6-4ba9-97e5-75be1b52dcc1
https://railway.app/project/c9986c4d-72c9-46ad-9186-160d3e9c1d44
Could this be related to the new TCP proxy upgrade you did recently? Any help is appreciated, thank you.
17 Replies
Percy
Percy9mo ago
Project ID: 56943544-81c5-486d-8741-d8cca3f88ed1,2f705f5a-06d6-45ac-871e-f5b0a7690fa7,2119123b-28c6-4ba9-97e5-75be1b52dcc1,c9986c4d-72c9-46ad-9186-160d3e9c1d44
Brody
Brody9mo ago
I too got random 503s calling other Railway services from within Railway earlier today.
Getting the team involved.
angelo
angelo9mo ago
Hey there @Tista - the infra team is conducting an investigation, can you provide some timestamps for us to narrow the problem down?
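One way to gather the timestamps requested here is to probe an affected endpoint repeatedly and record when each 503 occurs. The following is a minimal sketch using only the Python standard library; the function names `probe` and `summarize` are hypothetical and not part of any Railway tooling.

```python
from collections import Counter
from datetime import datetime, timezone
import urllib.error
import urllib.request


def probe(url: str, timeout: float = 5.0) -> tuple[str, int]:
    """Request `url` once and return (UTC ISO timestamp, HTTP status code).

    Connection errors and timeouts (urllib.error.URLError) are not handled
    here and will propagate to the caller.
    """
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return stamp, resp.status
    except urllib.error.HTTPError as err:
        # urllib raises 4xx/5xx responses (including 503) as HTTPError.
        return stamp, err.code


def summarize(samples: list[tuple[str, int]]) -> dict:
    """Count status codes and report the first and last 503 timestamps."""
    counts = Counter(status for _, status in samples)
    failures = [stamp for stamp, status in samples if status == 503]
    return {
        "counts": dict(counts),
        "first_503": failures[0] if failures else None,
        "last_503": failures[-1] if failures else None,
    }
```

Calling `probe` every few seconds against a service's public or private hostname and passing the collected results to `summarize` would give a first and last failure time in UTC to share with the infra team.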
Duchess
Duchess9mo ago
New reply sent from Help Station thread:
Hi I have a frontend on vercel and my backend hosted on railway is also throwing a 503. It just says application failed to respond and I do not see anything in the logs!
You're seeing this because this thread has been automatically linked to the Help Station thread.
New reply sent from Help Station thread:
Can you also provide additional information like timestamps and project-ids?
New reply sent from Help Station thread:
@angelo all my requests are 503 so the server is completely down. It was working fine about 30 minutes ago.
Tista
TistaOP9mo ago
Hey Angelo, thank you for replying. I'm on UTC+4; we've been experiencing this since 4:54 AM this morning (March 8) and it's still ongoing, we're still receiving random 503s. We'd appreciate hearing the root cause once your investigation is done, thank you.
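Since server logs are usually recorded in UTC, the local time reported above needs converting before it can be matched against them. A small sketch of that conversion, assuming the year is 2024 (taken from the log timestamp quoted later in the thread):

```python
from datetime import datetime, timedelta, timezone

# Assumption: the year is 2024, based on the log timestamp quoted later in the thread.
local_zone = timezone(timedelta(hours=4))  # UTC+4, as stated by the reporter
first_seen_local = datetime(2024, 3, 8, 4, 54, tzinfo=local_zone)

# Convert the reported local time to UTC for comparison with server logs.
first_seen_utc = first_seen_local.astimezone(timezone.utc)
print(first_seen_utc.isoformat())  # 2024-03-08T00:54:00+00:00
```

So 4:54 AM on March 8 at UTC+4 corresponds to 00:54 UTC on March 8.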
Tista
TistaOP9mo ago
Hey Railway, this is also happening in your docs site
angelo
angelo9mo ago
Yep, updating: we have found the source of the affected resources. Can you trigger a redeploy for your services? This will land your workload on a different resource.
Tista
TistaOP9mo ago
All right, will redeploy now
angelo
angelo9mo ago
Checking in.
Duchess
Duchess9mo ago
New reply sent from Help Station thread:
Redeploy failed for me:
Container failed to start
=========================
We failed to create a container for this image.
New reply sent from Help Station thread:
@angelo Second redeploy fixed it. Do we know what the issue was?
New reply sent from Help Station thread:
Going to leave the final investigation for the Infra team as they address and fix the issue, glad you are resolved for now.
Tista
TistaOP9mo ago
We've restarted deployments in all of our projects, still monitoring for 503s
Brody
Brody9mo ago
have you restarted the deployments that your services are making requests to?
Tista
TistaOP9mo ago
Yes we restarted each and every one of our services
andrzej | t2
andrzej | t29mo ago
I've got a question related to this issue: when was the underlying issue introduced? We experienced the same issues yesterday at 7 PM UTC, and after restarting the service that was unavailable, it worked fine for a couple of hours. Today we're experiencing similar issues, as described above, and again, a redeployment worked for the time being.
Brody
Brody9mo ago
according to my logs the first error appeared at 2024-03-07T07:50:01.818978082Z, aka March 7th 7:50 AM UTC
hey @Tista @andrzej | t2.world the incident has now been resolved, you can read about the reasoning here: https://discord.com/channels/713503345364697088/846875565357006878/1215585864286081034 If you are still experiencing this issue, please do another set of redeploys.
Tista
TistaOP9mo ago
Yeah, I’ve read it, thanks for the quick turnaround. I can confirm we’re no longer getting 503s.
Brody
Brody9mo ago
happy to hear that!