Concurrent connections time out above 20 connections
We have a fairly sized web app. I can see Railway allows up to 10,000 concurrent connections at a time, but in load testing I can't even get over 20 connections without it severely impacting the app.
I have upgraded to the Pro plan just for sanity's sake, but I still get the same result.
Here is the query I am running and the results.
Railway:
Results:
Railway struggles at 150 concurrent connections and only completes 296 requests in 20 seconds, and this is just on the homepage, which is relatively light.
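For anyone wanting to reproduce this kind of test, a minimal concurrent load test can be sketched in Python. The server below is a trivial local stand-in for the app (the real test would point `url` at the deployed homepage), and the concurrency and request counts are illustrative:

```python
import http.server
import statistics
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

# Trivial local stand-in for the deployed app, so this sketch is
# self-contained. Replace `url` with the real homepage to test it.
class Handler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", "2")
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, *args):  # silence per-request logging
        pass

server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_address[1]}/"

def timed_get(_):
    """Issue one GET and return its latency in seconds."""
    start = time.perf_counter()
    with urlopen(url) as resp:
        resp.read()
    return time.perf_counter() - start

def load_test(concurrency, total_requests):
    """Run total_requests GETs with up to `concurrency` in flight."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(timed_get, range(total_requests)))
    return {
        "p50": statistics.median(latencies),
        "p95": latencies[int(len(latencies) * 0.95) - 1],
    }

stats = load_test(concurrency=20, total_requests=200)
print(f"p50={stats['p50']*1000:.1f}ms  p95={stats['p95']*1000:.1f}ms")
```

Comparing p50 against p95 as concurrency ramps up is usually more revealing than raw request counts, since tail latency is what users actually feel.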
Project ID:
N/A
SST
SST, at double the concurrent requests, resolves in half the time and completes 1,772 requests in the same window.
Just curious if I am doing something wrong here. Seems odd that SST is able to cold start loads of Lambda functions in half the time a server takes.
you are likely hitting either an RPS limit or a concurrency limit (the docs are outdated), or your application can't keep up.
try again with this endpoint -
https://utilities.up.railway.app/now
That doesn't really help me load test my own instance simulating my users. That endpoint is a very simple json response. Our page makes multiple database calls and loads dozens of images.
Is there a better way to:
A. Test that the server configuration will handle our concurrent users, which can be as high as 3,000-5,000 depending on events.
B. Simulate this to estimate monthly cost averages compared to other platforms.
this helps us determine how many RPS you can do by eliminating the (from my perspective) unknown that is your application.
once we establish this much needed baseline we can then work from there.
sure. Thanks for helping out
636 RPS, that seems quite good to me?
I agree. I'm just not sure why the scaling I have implemented on my server hasn't impacted my own instance testing.
I get the same result on 1 GB RAM / 1 vCPU as I do on 3 GB RAM / 3 vCPU.
In reality, with any more than 10 concurrent requests I go from a 900 ms response to 3-4 seconds. With any more than 30 concurrent requests I go up to 8-10 seconds.
Beyond 100 it can be as high as 11 seconds, and it often crashes the server.
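That latency curve looks like a saturated server queueing requests rather than a platform limit. A toy model makes the point; the effective parallelism of 3 is a pure assumption for illustration, not something measured from the app:

```python
import math

# Toy saturation model (the CAPACITY value is an assumption, not a
# measurement): if the app can process CAPACITY requests in parallel
# and each takes SERVICE_TIME seconds, the last request in a burst of
# n waits for ceil(n / CAPACITY) rounds of service.
SERVICE_TIME = 0.9   # ~900 ms single-request response reported above
CAPACITY = 3         # hypothetical effective parallelism of the app

def worst_case_latency(concurrent_requests: int) -> float:
    """Seconds until the last request in a burst completes."""
    return math.ceil(concurrent_requests / CAPACITY) * SERVICE_TIME

for n in (1, 10, 30):
    print(f"{n} concurrent -> ~{worst_case_latency(n):.1f}s")
```

Under those assumed numbers, a burst of 10 waits roughly 3.6 s and a burst of 30 roughly 9 s, which lines up with the 3-4 s and 8-10 s figures reported above; that shape is typical when something in the request path (app workers, a connection pool) caps parallelism well below the offered load.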
okay, this is good information; it's a clear indicator that there is room to improve at the application level.
can you tell me more about your stack?
Interesting
I have actually made loads of improvements in the Railway deployment for that reason. It makes half the requests that the SST deployment makes.
it's Next.js and Supabase.
what region is your service currently located in?
all in eu
Supabase is eu-central
all the requests to my /now endpoint were done via the Asia edge proxy.
I am in Asia, but that shouldn't affect the time between my front end and my backend.
yep, just something to be aware of
do you have any metrics on the round trip times for database calls?
yeah let me check honeycomb
it can be as high as 3 seconds for a single call to Supabase when the load test is running. A normal request from me loading the page is 400 ms.
do you think this is likely a concurrency issue with Supabase? Actually, I know it's not, since my SST test makes even more connections.
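For anyone hitting the same wall, a lightweight way to collect those round-trip numbers without a full tracing setup is to wrap each database call in a timer. The names here are illustrative (`fetch_homepage_rows` is not the poster's actual code, and a real Supabase client call would go where the sleep is):

```python
import functools
import time

def timed(metrics: list):
    """Decorator that appends each call's wall-clock duration
    (in seconds) to `metrics`, even if the call raises."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                metrics.append(time.perf_counter() - start)
        return wrapper
    return decorator

db_timings: list[float] = []

@timed(db_timings)
def fetch_homepage_rows():
    # Stand-in for a real database query; swap in the actual
    # Supabase call being measured.
    time.sleep(0.01)
    return ["row"]

fetch_homepage_rows()
print(f"last DB call took {db_timings[-1] * 1000:.1f} ms")
```

Comparing these per-call timings at idle versus under load makes it easy to see whether latency grows in the database round trip itself or somewhere else in the request path.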
I'm unfortunately not sure what the underlying cause is, but that is definitely where your bottleneck is coming from.
guess I will keep trying to debug. Thanks for your help
no problem! sorry we aren't able to help all that much here; it's just that we don't have any more observability into what your code is or isn't doing than you do. But at least we found out you aren't running into any RPS limits, so that's good!
yeah, for sure. I am loving Railway; we just need to make sure it can handle our users' requests at a decent cost before we switch.
that's totally fair, and I want to do everything in my power to help you, and if it turns out railway doesn't work out for you, we can refund the seat cost