Is there a concurrent connection limit for Railway services?
I am currently hosting https://docs.soketi.app/ on Railway and attempting to run a load test to ensure it will handle my production traffic (20k+ connections) before swapping over. Whenever I attempt to go over ~3,000 concurrent websocket connections, the server stops accepting new connections. At this point I'm fairly confident something at Railway is the issue but I'm not sure what.
- This happens when connecting from both my local machine and a rented Vultr VPS
- It happens if I try 3k+ on one process, or split it across several processes
- It happens if I use the up.railway.app domain or a custom domain
- It happens if I proxy through cloudflare or go straight to railway
- It happens with 1 replica and with 2 replicas
- It also happens on a different service (https://github.com/edgurgel/poxa)
Project ID is 335171eb-cc79-4e25-9274-12e40bb58b0e
34 Replies
how many concurrent connections can you do if you hosted this app on your vps?
The VPS was just to ensure it wasn't my desktop/isp causing the issue, but from my desktop -> VPS I was able to do 10k connections without much trouble
then you are likely hitting a limit for concurrent connections, are you on pro?
I'm on hobby right now. I'd be willing to purchase pro but I'm a little concerned that the connection limit doesn't seem to be documented, and doesn't seem to increase with replica count
this would be a limit on the incoming traffic, nothing to do with replicas
totally understand not wanting to buy pro, since you don’t know if it would increase the connection limit, and the answer to that is no, it likely wouldn’t. i only asked because you would need to be on pro for railway to then raise those limits. i will ask the team during a workday if the connection limit can be increased for pro users and get back to you with an answer
gotcha, thanks so much!
no problem
hey @Fugi bit off-topic sorry, but can I ask why you're using soketi and not n8n for example? Have you tried it?
As far as I can tell n8n is a self-hosted zapier, not a self-hosted pusher, so it doesn't match my needs. I have not used it. Regardless the choice of service doesn't matter if users can't connect to it.
Hey there Fugi, can you give us more information?
What is the streaming use case that you are looking to solve?
Hey Angelo! I currently run a site (https://reactive.fugi.tech/) that I'm looking to move to Railway. It has 600k users, with 20k+ active at any given time. Users are able to upload images and set configuration settings, then embed a browser source in OBS to render those images based on who is in a Discord call with them.
Every browser source opens a websocket connection, and whenever the config/image changes I send a websocket message telling them to pull the new settings. So there's very high concurrent connections (20k+) but very low traffic over the connections.
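For a workload like this (many idle connections, almost no traffic), a load test mostly needs to open connections in batches, hold them, and report where new ones stop being kept. The following is only a minimal sketch of that idea, not the script used in this thread. It uses raw TCP with Python's standard library against a local stand-in server rather than real WebSocket handshakes, and the names `open_one`, `ramp`, `capped_server`, `demo`, and the one-byte greeting are all hypothetical.

```python
import asyncio


async def open_one(host, port, timeout=2.0):
    """Open one connection; return its writer if the server keeps it, else None."""
    try:
        reader, writer = await asyncio.wait_for(
            asyncio.open_connection(host, port), timeout)
    except (OSError, asyncio.TimeoutError):
        return None
    try:
        # Assumption: the stand-in server sends one byte when it keeps a connection.
        greeting = await asyncio.wait_for(reader.read(1), timeout)
    except (OSError, asyncio.TimeoutError):
        greeting = b""
    if not greeting:  # closed or silent: count it as refused
        writer.close()
        return None
    return writer


async def ramp(host, port, target, batch=10):
    """Open connections in batches, hold them open, return how many were kept."""
    held = []
    while len(held) < target:
        want = min(batch, target - len(held))
        results = await asyncio.gather(
            *(open_one(host, port) for _ in range(want)))
        opened = [w for w in results if w is not None]
        held.extend(opened)
        if len(opened) < want:  # the server stopped keeping new connections
            break
    count = len(held)
    for w in held:
        w.close()
    await asyncio.gather(*(w.wait_closed() for w in held), return_exceptions=True)
    return count


async def capped_server(limit):
    """Local stand-in for a host with a hard cap on concurrent connections."""
    active = 0

    async def handle(reader, writer):
        nonlocal active
        if active >= limit:
            writer.close()  # over the cap: drop without a greeting
            return
        active += 1
        try:
            writer.write(b"1")
            await writer.drain()
            await reader.read()  # idle until the client disconnects
        except OSError:
            pass
        finally:
            active -= 1
            writer.close()

    return await asyncio.start_server(handle, "127.0.0.1", 0)


async def demo(limit=30, target=100):
    """Ramp against the capped stand-in server and return the observed ceiling."""
    server = await capped_server(limit)
    port = server.sockets[0].getsockname()[1]
    try:
        return await ramp("127.0.0.1", port, target)
    finally:
        server.close()


if __name__ == "__main__":
    print("connections held:", asyncio.run(demo()))
```

Against a real deployment, `open_one` would need to perform an actual WebSocket handshake (for example with a WebSocket client library), and the client machine's file descriptor limit would need to be well above the target so the test measures the server side rather than the client.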
So I work on a ws:// based app and websockets on Railway behave a bit too magical for my own taste. So we are doing a few things to help with this.
1. We are going to be shipping network and proxy logs so you can see the termination of the connection within a certain domain due to timeout or limit.
2. We are going to be shipping UDP for that sweet, sweet, HTTP/3 based network.
3. We have some fun eBPF based network observability coming so you can see open connections right within the canvas.
We have an influx of real-time apps (not counting my own) so it's something that is top of mind.
Anyway- about that connection limit, I don't think so, but can you tell me a bit about the ws libs you are using? (In the meantime, I am going to look over your project.)
More logging, metrics and HTTP3 is very exciting! As for what library, I'm trying to host Soketi, which is a pre-made OSS Pusher clone. Internally I believe it uses node + uWS (https://github.com/uNetworking/uWebSockets).
I am blind- I see that ;-;
I am going to ask internally to see if it has anything to do with our current proxy rules.
There is no reason why we should be blocking that.
I believe other Railway users are hosting Soketi (I see chatter about it in this discord) but if it's a problem I'm happy to write my own Rust + Warp server
One fear I have is that our proxy isn't giving up the connection when it should.
Fear, but it will need investigation.
Anyway- you got a REALLY cool product, and my bias is that I want you on the platform. I am going to escalate this up so we get to the bottom of it.
Thanks so much! If it is helpful this is the exact load test script I've been running:
Perfect, and for your existing deployment footprint, who are you using?
Just wondering to see what machine size we should expect.
(although- I think this is all proxy)
Currently I have several apps all running on a Vultr VPS with 4 cpus and 16GB of RAM, but it's typically only using a fraction of those resources
Sweet- I am going to destroy our network. Is it cool if you give me like 12 hours to get to the bottom of this?
(Need to knock-out some todos.)
yep, absolutely! I haven't moved anything over yet, this is just some of the final load testing before I attempt the swap :)
Hey @Angelo just following up to see if there are any updates or revised timelines :)
I just fired off my last email for the day- I am now hacking away in-between passing candy out.
Thanks!
Okay @Fugi (sorry for tag) - our proxy is indeed doing a fucky wucky
Raising it to the network team, what's your migration timeline?
I don't have a strict timeline, whenever it's fixed I'll move over. Ideally within the next week? If you expect it to be a lengthy fix, I've also been considering hosting the websocket server elsewhere temporarily and moving the non-websocket parts over.
bumping this for sanity check
what's the connection limit for websockets / server sent event connections for PRO? 3000?
they have very recently updated their documentation
https://docs.railway.app/reference/public-networking#domain-rate-limits
is the connection limit restricted per team or per app?
say the connection limit is 3000 and we're looking at 4000 connections at peak, I was thinking of running 2 instances of the app
I assume it would be per client IP
Per app, so you can bump that replica number up.
unfortunately that wasn't the case for fugi, has this behaviour changed?
Not that I recall, but I was on a call where they reported that bumping the instance count did serve to help. Might have been an application issue, but it matches with my impulse of: when in doubt, throw more compute at it.
But to Costin's question, the IP connection limit isn't a user setting, it's an app setting.