char8
Did something related to private networking fundamentally change in the past couple of days?
We're working on a privnets v2 - a rewrite of the BPF programs underneath, more control over injection, and a bunch of other improvements to the overall runtime. It's been a slow process to roll out because it's complex and has a wide impact. It's not available to users atm - we're slowly landing it in parts. I'll ping back here once that's launched (don't want to commit to timelines until I get it out to Beta), but I'm actively on that 100% of my time.
268 replies
Did something related to private networking fundamentally change in the past couple of days?
Heya - so the issue here seems to be a very narrow race that can trigger when there's a high volume of traffic while we are initialising privnets. There was a gap between two syscalls in that init process where a packet could get the IPv6 neighbour table into a bad state.
We only spotted this yesterday 'cos we built a new chaos monkey system to stress test with frequent deploys and monitor network connectivity. The race has existed since we launched the feature, and no users had previously reported it. It's unlikely to be a regression; more probably you ran into it by accident, most likely because the containers crashed and restarted into a high volume of traffic. Redeploys etc. get a gradual traffic ramp, so it's very unlikely to trigger there.
268 replies
Some logs never load
there was this:
https://discord.com/channels/713503345364697088/1196872957746745375/1196909111950974986
129 replies
Unable to connect to newly migrated MySQL service from outside the Railway network
Hi - super apologies about that - it ended up being a very gnarly bug. I managed to clear the route caches early so stuff should've resumed about 50 mins ago. But we've rolled out a proper fix and are going to close out the incident as soon as we've done some more tests.
7 replies
Unable to establish postgres connection during deployment
@Hwoarang this should be fixed for you now - sorry for the inconvenience. There was a bug that caused us to copy deleted vols when you fork an env, and your DB was trying to attach to the old (non-existent) vol. We've now fixed the underlying issue and also deleted those orphan volumes from your environment. Re-deployed and pg is back.
17 replies
Unable to establish postgres connection during deployment
we're looking into this now.
Just a quick Q - the canvas shows the volume unmounted for the postgres and redis. Is that something you tried to do manually, or is that something that happened when the deploy failed?
17 replies
Is my service down?
you can create as many Redis nodes as you like
https://docs.railway.app/guides/redis
we can't really help with application logic
but yeah, that reads as only 49/5468871 keys have expiry set
256 replies
Is my service down?
the Redis has been up for 36 hours from what I see
If some jobs were running and others were not, that's probably something internal in your app, since Railway can't do anything to cause a partial failure inside your app like that. I'd check if your app is able to correctly recover from broken redis connections, etc... and try to add more logging to see why things fail.
In case you identify that this happens because of Redis connection failures, you should implement a retry. If you're not using private networking to talk to your Redis, consider switching; that'll give you a cheaper and more performant network path between your app and Redis.
Please see: https://docs.railway.app/reference/private-networking#caveats - private networking becomes available shortly after your app starts, so your app may need to retry if it can't connect to Redis initially. I'd try this out on a staging env before you switch over production.
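Rough sketch of what that retry could look like, assuming a Python app (adapt to whatever your stack is). `connect_with_retry` and its parameters are made-up names for illustration, not anything Railway provides, and the delays are placeholder values you'd want to tune:

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 5.0,
    retry_on: Tuple[Type[BaseException], ...] = (OSError,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds or the attempts run out.

    Waits base_delay after the first failure and doubles the wait after
    each further failure, capped at max_delay. The last failure is re-raised.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts, surface the real error
            sleep(delay)
            delay = min(delay * 2, max_delay)
    raise AssertionError("unreachable")


# Hypothetical usage with redis-py (assumes `pip install redis`; the
# hostname below is a placeholder for your service's private address):
#
#   import redis
#
#   def connect():
#       client = redis.Redis.from_url("redis://redis.railway.internal:6379")
#       client.ping()  # from_url is lazy, ping forces a real connection
#       return client
#
#   client = connect_with_retry(
#       connect, retry_on=(redis.exceptions.ConnectionError, OSError)
#   )
```

Same idea applies if the connection drops mid-run: wrap the reconnect in the helper rather than letting the job die on the first failed attempt.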
256 replies
Is my service down?
Re: PR environments,
can you delete them from https://railway.app/project/dd204693-57d8-4d8e-afd2-d01235ff028f/settings/environments
we fixed an issue on that page this morning and the envs should now be visible there
256 replies