Random 502 Response – Nothing is crashing

Greetings Railway Team! We're using Vercel as the frontend host and Railway as the backend host. Vercel communicates with Railway via a reverse proxy. Once in a while we catch "502 Bad Gateway" errors on our frontend that affect the user experience [image 1]. On the backend, however, we see no logs corresponding to any crash whatsoever [image 2]. When I check the CPU usage, I see odd spikes [image 3], as well as spikes in network egress [image 4]. Memory usage also spikes a little, but seems OK [image 5]. We've recently made many changes on the backend side, and we also changed our DNS provider from GoDaddy to Cloudflare, with proxy disabled. Do you think it's strictly an application issue? I find that odd, since I would expect the app to crash in that case. Once again, this happens quite rarely, but it still happens. What do you think? Thank you.
Percy, 2mo ago
Project ID: N/A
dalechyn (OP), 2mo ago
3e6a2b9c-4e34-41f8-979a-b83b27f3198d
Solution
Brody, 2mo ago
Hello, yes, this is strictly an application-level issue. It looks like your application may not be able to handle every request and so leaves some unanswered, which results in a 502. Fortunately the solution is simple: add another replica or two!
dalechyn (OP), 2mo ago
Thank you, I'll look deeper into what could be causing this. Resources aren't even close to the limit, so I don't think replicas will solve the issue.
Brody, 2mo ago
you're right, resources aren't the bottleneck here. The bottleneck would be your application's ability to use the available resources. So yes, replicas aren't going to fix the root issue, but they will mitigate the 502s, since the load on any individual replica will be lower.
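To make the point about per-replica load concrete, here is a minimal sketch of a toy model. It is not Railway's actual routing behavior: the function name, the round-robin split, the per-tick capacity, and the queue limit are all assumptions made up for illustration. It shows how a short burst can leave requests unanswered on one replica even when average load is low, and how splitting the same burst across two replicas avoids that.

```python
def dropped_requests(arrivals, replicas, capacity_per_tick, queue_limit):
    """Count unanswered requests in a toy model (all parameters are hypothetical).

    arrivals[t] requests arrive at tick t and are split evenly across replicas.
    Each replica answers up to capacity_per_tick requests per tick and can hold
    at most queue_limit waiting requests; anything beyond that goes unanswered,
    which a proxy in front of the app would surface as a 502.
    """
    queues = [0] * replicas
    dropped = 0
    for incoming in arrivals:
        for r in range(replicas):
            # Even split: the first (incoming % replicas) replicas get one extra.
            share = incoming // replicas + (1 if r < incoming % replicas else 0)
            pending = queues[r] + share
            remaining = pending - min(pending, capacity_per_tick)
            if remaining > queue_limit:
                dropped += remaining - queue_limit
                remaining = queue_limit
            queues[r] = remaining
    return dropped


# Mostly idle traffic with one burst: 39 requests over 10 ticks,
# against a single-replica capacity of 100 requests in the same window.
arrivals = [1, 1, 30, 1, 1, 1, 1, 1, 1, 1]

one_replica = dropped_requests(arrivals, replicas=1, capacity_per_tick=10, queue_limit=5)
two_replicas = dropped_requests(arrivals, replicas=2, capacity_per_tick=10, queue_limit=5)
print(one_replica, two_replicas)
```

In this model the single replica is under 40% utilized on average yet still leaves part of the burst unanswered, while two replicas absorb it. The root cause (why one instance cannot keep up during the burst) is unchanged, which matches the "mitigate, not fix" framing above.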
dalechyn (OP), 2mo ago
thank you. consider this ticket closed!
Brody, 2mo ago
sounds good! happy to help
dalechyn (OP), 2mo ago
Hi, I wanted to clarify one detail. Is it theoretically possible that during vCPU scaling a process becomes unresponsive, the same way it is unresponsive when the app resumes from sleep? I looked through 502 error support requests and found that this is a known issue that hasn't been fixed: essentially, it happens when a container scales from 0 vCPU to the minimum of 0.1 vCPU. Can the same issue occur when scaling from 0.1 vCPU to 1 vCPU, for example? We only see 502 errors when scaling happens; it works fine afterwards.
Brody, 2mo ago
unless you have app sleeping enabled, nope. As long as your app is running, there is no scaling happening.
dalechyn (OP), 2mo ago
it doesn't have sleeping enabled, but I can see that vCPU usage goes to 0
Brody, 2mo ago
yep, programs can sit at 0% CPU utilization
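As a small illustration of that last point, the sketch below (a generic example, unrelated to Railway's metrics pipeline; the function name is made up) compares CPU time with wall-clock time for a process that is alive but only waiting. A server blocked waiting for the next request behaves similarly, so a usage graph touching 0 does not by itself mean the container was scaled down or put to sleep.

```python
import time


def cpu_and_wall_while_waiting(wait_seconds):
    """Return (cpu_seconds, wall_seconds) spent while the process just waits."""
    start_cpu = time.process_time()   # CPU time consumed by this process
    start_wall = time.monotonic()     # elapsed real time
    time.sleep(wait_seconds)          # stand-in for "waiting for the next request"
    return time.process_time() - start_cpu, time.monotonic() - start_wall


cpu_used, wall_elapsed = cpu_and_wall_while_waiting(0.2)
print(f"wall: {wall_elapsed:.3f}s, cpu: {cpu_used:.3f}s")
```

The process stays running for the whole interval, yet the CPU time it consumes is close to zero.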