[WORKAROUND]: Eschew Railway Proxy connection via Cloudflare Tunnel
Hello everyone, we know it's been a turbulent week for everyone on the platform. The Infrastructure team thanks you for your patience and your support as the Railway team has been fighting a number of network-related fires on the platform.
We know that we will need to rebuild your trust in using some Railway networking products, and our core product philosophy is to let you "take what you need and leave what you don't". This means allowing your use of Railway to be as modular as you need it to be, so that your business or workloads are as resilient as possible.
A number of customers and the folks from the Network team validated a workaround that will let you tunnel connections via Cloudflare in case of any issues with Railway's proxy or if you want to have greater control over your networking experience on Railway.
We have published the following Template on Railway, which you can deploy in your services and use to proxy traffic through Cloudflare.
Link here, guide is included in the template description: https://railway.app/template/cf-tunnel
59 Replies
What I'm wondering: what's the downside of bypassing railway's proxy with a solution like that? Any technical aspects we should be aware of? For example, should we expect increased latency (besides the latency added by cloudflare handling/inspecting traffic, of course)? Or is the railway proxy just there to make traffic handling easier for you folks?
I'm just wondering and trying to learn, it's not a real use case for me. So take your well-earned breaks/time off and get back to this whenever someone has time.
Increased latency. Essentially, the Railway proxy is a sort of edge proxy. It's not a CDN, but it does handle inter-region connectivity. You'd be losing that, but we want all parts of Railway to be optional in case anything happens. In this case, this is a "rip cord for emergency" move.
This will help other users to protect themselves from targeted DDoSes.
as always thx @Angelo
I wonder what the latency difference would be. I feel like cloudflare normally has really good routing.
Also, I am confused about how to set this up. I created a railway project with the default nodejs thing it can create for you, and I want to use a cloudflare tunnel to expose it. The description doesn't really make sense to me (https://discord.com/channels/713503345364697088/1202408127417421854/1202408127417421854)
It seems cloudflare tunnels actually give better latency than the railway edge proxy
ooo- would love your benchmarks of the platform
(Thanks for joining the Railway community)
I am curious if maybe cloudflare tunnels are faster for me because I am on enterprise + argo routing. I should try on a free zone, and find a good tool that can do latency checks over periods of time instead of running one-off requests to compare at that exact time.
Either way, I think going through cf tunnels will be very valuable even with more latency, since you can turn off public routing and force traffic through the cloudflare proxy for ddos protection.
I wonder if I can play around with the graphql api from cloudflare to extract the latency by region in a better graph than just the average across all of them. This probably gives cloudflare an unfair advantage, since requests start in their network already.
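The idea above of checking latency over a period of time, rather than comparing one-off requests, could be sketched roughly like this in Python. This is only an illustration, not the tool used for the benchmarks in this thread: the function names (time_request, sample_latency, summarize) are made up for the example, and any URL you pass in would be your own endpoint.

```python
import statistics
import time
import urllib.request
from typing import Callable, Dict, List


def time_request(url: str, timeout: float = 5.0) -> float:
    """Return the wall-clock time in milliseconds for one GET request."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()
    return (time.perf_counter() - start) * 1000.0


def sample_latency(probe: Callable[[], float], samples: int, interval_s: float) -> List[float]:
    """Call `probe` `samples` times, sleeping `interval_s` seconds between calls."""
    results: List[float] = []
    for i in range(samples):
        results.append(probe())
        if i < samples - 1:
            time.sleep(interval_s)
    return results


def summarize(latencies_ms: List[float]) -> Dict[str, float]:
    """Reduce a list of samples to min, median, and max."""
    return {
        "min": min(latencies_ms),
        "median": statistics.median(latencies_ms),
        "max": max(latencies_ms),
    }
```

Running sample_latency(lambda: time_request(url), samples=60, interval_s=60) once against the Railway domain and once against the tunnel hostname would give two sets of samples to compare with summarize. A request that times out raises an exception in this sketch; a fuller version would count those as failures instead.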
I await the blog post
@Angelo so are the issues still ongoing? Or should we look into this? Asking because we handle large amounts of webhooks and we started seeing silent failures quite often
We weren't sure how to trace it but this is definitely a P0 for us
I didn't even know there were issues because all of our stuff already goes through Cloudflare Tunnels
No, but in your case, I would raise a new thread and we can debug this. With that said, Cloudflare tunnels might be a worthy alternative if requests are dropping at the network layer.
Can confirm CF tunnels is great and skips out these issues
I only started up after the big incident, but I've seen 24 dropped requests from the railway edge and 0 dropped with the cloudflare tunnel during the same period.
Also, if you can, I would highly recommend turning on argo smart routing on your cloudflare zone ($5/mo + $0.10/GB). It does double the cost of your egress (at least after the 100GB included on railway) but the latency improvements are probably worth it.
Without argo there is noticeable latency difference between using the standard railway proxy and going through a cloudflare tunnel. But with argo enabled they are roughly the same average globally (railway wins in some regions and cloudflare wins in others but not too much difference)
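Going only by the prices quoted above ($5/mo plus $0.10/GB), the extra monthly cost of argo for a given amount of traffic can be estimated as below. This is a back-of-the-envelope sketch: the constants are taken from this thread and may not match Cloudflare's current pricing, and the function name is hypothetical.

```python
# Prices as quoted in this thread; treat them as assumptions, not current pricing.
ARGO_BASE_USD = 5.00     # flat monthly fee
ARGO_PER_GB_USD = 0.10   # fee per GB of traffic routed through argo


def argo_monthly_cost(traffic_gb: float) -> float:
    """Estimate the monthly argo charge in USD for `traffic_gb` of traffic."""
    if traffic_gb < 0:
        raise ValueError("traffic_gb must be non-negative")
    return ARGO_BASE_USD + ARGO_PER_GB_USD * traffic_gb
```

For example, argo_monthly_cost(250) comes to roughly 30 USD. This only covers the argo side and does not model Railway's own egress billing.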
The railway_cloudflare_tunnels and railway_cloudflare_tunnels_free_argo ones both have argo enabled and the other 2 do not. The other difference is free plan vs enterprise plan, which has minimal difference.
Can't we use cloudflare (with a load balancer or something) to use the tunnel (or whichever has the lowest latency on a specific endpoint) as a failover?
Are you still monitoring failure rates? I saw your message here about some failed requests still: https://discord.com/channels/713503345364697088/727689277219012669/1203140722618925086
I am yes
are you still seeing blips with cloudflare's tunnel?
Nope just the one issue that lasted 2 minutes
well they did say they fixed a blip with the private network
Yeah I noticed that on the blog I am curious what exactly that was and when it happened :Hmmge:
and the tunnel uses the private network, so if the private network blips, the tunnel blips
On the cloudflare side I couldn't really see anything obvious for what might've happened
using the tunnel also fully closes off your service from direct public access (assuming you have removed all domains on the service)
because even if you use the cname method, an attacker can still directly access the service with host masking
Yup one of the reasons a small latency hit might still be worth it so you can utilize cloudflare for ddos protection and ensure no access outside the tunnel :fastnod:
oh so using the tunnel has a little more latency than using railway's proxy network?
Depends on the region and if you enable Argo on cloudflare or not
As some examples:
In Oceania cloudflare tunnels wins with argo off and argo on.
In Western North America, cloudflare tunnels wins when you have argo enabled but the railway proxy beats cloudflare tunnels when argo is not enabled. (cloudflare tunnels did have a short blip in latency though)
In Eastern North America, railway proxy beats cloudflare tunnels.
very interesting, thank you
this is also to an app deployed in us west so ymmv depending on deploy region :NODDER:
Do you perhaps have any insights on the EU region, which was hit in the DDoS?
Railway proxy is faster latency wise but did have 2 requests timeout in eastern europe after 5 seconds in the last 24 hours.
This is all the failures in the last 24 hours.
Railway edge proxy eastern europe had 2 http timeouts, southern africa had 2 tcp connections fail
Cloudflare tunnels without argo had 1 http timeout in india
What insights are you looking for? There was a blog post dissecting the incident posted here: https://blog.railway.app/p/2024-01-31-incident-report
Railway Blog
Incident Report: January 31st, 2024
We recently experienced an outage on our platform due to a DDoS attack that peaked at 12M requests per second. When production outages occur, it is Railway's policy to share the public details of what occurred.
Is there a guide specific to Railway for the Cloudflare Tunnel template?
the readme on the template doesn't seem to be Railway specific
soon ™
:Prayge:
Figured it out.. easy enough
Seems to be much slower than the Railway proxy tho for sure.. unless I did something wrong
OHHHHH
ffs
it deployed to US-west
-_-
:LUL:
also, 6261 looks like a random port, you need to listen on a fixed port whenever you are doing internal stuff
6261 is what my Nuxt app started on
right, but it looks like a random port, the kind of random port railway would assign
oh you're saying I should specify a custom port
yes, otherwise your tunnel breaks on your next app deploy
got it, ty
ahhhh much faster now :KEKW:
So I changed my port to 3000 and now my app is down.. getting an "Application failed to respond" on my domain
I gotta configure Railway to listen also on that port or something right?
RAILWAY_TCP_PROXY_PORT ?
since you were already listening on the auto generated PORT, all you would have needed to do is set a PORT service variable to 3000, and that would have simultaneously gotten your app to listen on 3000 and let railway know what port your app listens on internally so it can proxy requests to the correct port
ah lemme try that
so go back to having your app listen on PORT, you want to do that regardless of whether you are specifying the port in the service variables or not
yep working now ty!
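The advice above (let the app read PORT, and set PORT to a fixed value such as 3000 in the service variables) might look like this for a small Python service. This is a sketch for illustration only: the app in the thread is a Nuxt app, and resolve_port, make_server, the handler, and the bind host are assumptions rather than anything Railway-specific.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Mapping


def resolve_port(env: Mapping[str, str], default: int = 3000) -> int:
    """Return the port from the PORT variable, or `default` if it is unset or empty."""
    raw = env.get("PORT", "").strip()
    return int(raw) if raw else default


class OkHandler(BaseHTTPRequestHandler):
    """Answer every GET with a plain-text 'ok'."""

    def do_GET(self) -> None:
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(host: str = "0.0.0.0") -> HTTPServer:
    """Build a server on the resolved port. The bind host is an assumption;
    use whatever address your platform's networking requires."""
    return HTTPServer((host, resolve_port(os.environ)), OkHandler)
```

Calling make_server().serve_forever() starts it. With PORT=3000 set as a service variable, the app keeps listening on the same port across deploys, so the port the tunnel points at stays valid.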
yo thank you for the trains
Np! I'd do more but kinda broke at the moment :cryingman: Appreciate your assistance!
Kinda nice to have the Cloudflare Tunnel as a backup.. just a peace of mind kind of thing.. I'll prob stick with the Railway Proxy for now.. and if anything happens I can always cut over to the tunnel
So what's the TLDR with this issue? Does using Cloudflare DNS not work anymore?
I'm not sure how one would reach that conclusion from the thread and the OP, so my apologies for the confusion. This was set up as a workaround when Railway's proxy was having availability issues with workloads. You can still use your Cloudflare DNS as normal, as you have in the past.
Okay, thanks for clarifying. I ask because I just switched to cloudflare nameservers today and my site doesn't load, then I came here and saw an apology.
Yep, that case might be tied to your cloudflare proxy settings
@Angelo do you see what needs to be changed?
You probably need to change the ssl settings: https://docs.railway.app/guides/public-networking#provider-specific-instructions
you're using A types, tunnel or not, an A type won't be used.
but please clarify what you're trying to do (in a new help thread), use a cloudflare tunnel or put cloudflare in front of railway