Very slow response time

I've had a website hosted on Railway for the past month or so. Load times have occasionally run a bit slow (up to 5 seconds) but are usually within 1-2 seconds, which is fine. Today has been a totally different story... the code is consistently taking 6+ seconds to run (I have a run time print statement built into the code) and often the site takes far longer than that to load. On top of that, the site has sometimes gone unresponsive for several minutes at a time, and errors such as the following print in the logs:
[2023-01-25 01:22:02 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:10)
[2023-01-25 01:22:02 +0000] [10] [INFO] Worker exiting (pid: 10)
[2023-01-25 01:22:02 +0000] [288] [INFO] Booting worker with pid: 288
[2023-01-25 01:22:33 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:288)
[2023-01-25 01:22:33 +0000] [288] [INFO] Worker exiting (pid: 288)
[2023-01-25 01:22:33 +0000] [320] [INFO] Booting worker with pid: 320
When I run the site locally I encounter no issues and normal response times. What could be going on here?
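A note on the log lines: `WORKER TIMEOUT` is Gunicorn's log format, and worker 288 is killed about 30 seconds after it boots, which matches Gunicorn's default `timeout` of 30 seconds (a worker that stays silent that long is killed and restarted). Below is a minimal sketch of a `gunicorn.conf.py`, assuming the app is served by Gunicorn. The file and its values are illustrative assumptions, not settings taken from this project.

```python
# gunicorn.conf.py (hypothetical file; the values are assumptions for illustration)

# Seconds a worker may stay silent before the master process kills and
# restarts it. Gunicorn's default is 30.
timeout = 120

# Number of worker processes. With more than one worker, other requests
# can still be served while one worker is busy with a slow computation.
workers = 2

# Verbosity of Gunicorn's own error log.
loglevel = "info"
```

Raising `timeout` only stops workers from being killed mid-request. It does not make the slow requests any faster, so the underlying cause still needs to be found.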
24 Replies
Percy
Percy2y ago
Project ID: 52b32c2c-e65f-443a-b59c-ba6a9bccf24c
Percy
Percy2y ago
Samy mentioned that responses from the servers were always < 1 second, but now they take more than 2 seconds and sometimes even more than 5 seconds. He did a rollback to a month ago and it's still slow, so it's possible that something happened with the servers last week.
Weatherman
WeathermanOP2y ago
52b32c2c-e65f-443a-b59c-ba6a9bccf24c
jackson
jackson2y ago
likely just database latencies combined with django's poor optimization of database querying. what kind of queries are you executing to cause this kinda timeout? in most cases the latency to your local database is well under a millisecond, whereas the latency from a railway deployment to database is more like 5-10 milliseconds, so each query takes much much longer, and unoptimized code that runs lots of queries one after another gets slower and slower as the latency increases
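To put rough numbers on the latency point: when queries run one after another, the network round trips simply add up. A small sketch using the 5-10 ms figure above; the 0.1 ms local latency, the query counts, and the function name are assumptions for illustration only.

```python
def total_round_trip_ms(num_queries: int, latency_ms: float) -> float:
    """Time spent waiting on the network when queries run sequentially."""
    return num_queries * latency_ms


# Compare a handful of queries with a query-heavy page (counts are made up).
for n in (4, 500):
    local = total_round_trip_ms(n, 0.1)    # assumed local latency
    remote = total_round_trip_ms(n, 10.0)  # upper end of the 5-10 ms range
    print(f"{n} queries: local ~{local:.1f} ms, remote ~{remote:.0f} ms")
```

With only a few queries per request the added wait is tens of milliseconds, so this effect matters mainly for code that issues hundreds of sequential queries.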
Weatherman
WeathermanOP2y ago
The only db-related thing I have is a redis instance, and each website load makes max 4 calls to it. The only other calls are to weather apis, which occur when the called location is not in the redis cache. Also I'm using Flask. Any ideas? This has been an on-and-off issue today. Is it possible that upgrading to the teams plan would mitigate the issue? Also, this issue began before that, but I launched a little promo yesterday and it pushed CPU to 124% (?), is that also a reason to upgrade?
Adam
Adam2y ago
Where is your db located? Railway services are all on US-West.
Weatherman
WeathermanOP2y ago
I created the Redis within the project so it's linked within the project environment
Adam
Adam2y ago
Gotcha. Have you tested how fast the weather API responds and if it ever hangs? That could easily be the issue. Could be getting rate limited
angelo
angelo2y ago
Can you link to your metrics tab? Post it here and I can check the mem and metrics.
Weatherman
WeathermanOP2y ago
I've just thrown in some time print statements throughout the code so im gonna try to get to the bottom of it that way
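For reference, one way to do this kind of timing without scattering start/stop variables everywhere is a small context manager. This is only a sketch: the `timed` helper, the labels, and the stand-in workloads are hypothetical and not taken from the actual app.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Record and print how long the wrapped block takes, in seconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        results[label] = elapsed
        print(f"{label}: {elapsed:.3f}s")


timings = {}

with timed("api_call", timings):
    time.sleep(0.05)  # stand-in for the weather API request

with timed("formula", timings):
    sum(i * i for i in range(100_000))  # stand-in for the number crunching
```

Comparing the per-stage numbers locally and on the deployment shows whether the time goes to network calls or to computation.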
Weatherman
WeathermanOP2y ago
oh wait what do you mean by link
Weatherman
WeathermanOP2y ago
Railway
Railway is an infrastructure platform where you can provision infrastructure, develop with that infrastructure locally, and then deploy to the cloud.
angelo
angelo2y ago
Yep! This is it
Weatherman
WeathermanOP2y ago
Ok i’ve tested it out and it definitely has to do with the formula and not the api calls.
- Basically the formula works by taking the hourly forecast and tweaking the conditions to create combinations of possible actual hour-by-hour snow cover outcomes. For example, it takes the hour-by-hour temperature array and generates 7 arrays with all temps -3, -2, -1, 0, +1, +2, +3. It then combines each array with 7 different timings. This happens with 2 more variables to create 7*7*7*8, or 2,744 snow cover arrays.
- The code is also set to only go through all 4 stages under certain weather forecast conditions, which is why only some locations have this issue. Without one of the variables, the number of arrays stays in the low-mid hundreds or less with a max processing time of about 3 seconds.
- These are numpy arrays and each has a total length of ~30.
- My computer is able to process the code in full in less than 1.5 seconds, while it’s taking about 7 seconds on railway. My computer has 32GB RAM while I see Railway goes up to 8GB, so I’m assuming what’s needed is more processing power.
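The combination count described above can be checked with a short sketch. Only the seven temperature offsets come from the post; the other three variables, their values, and the sample temperatures are placeholders, and plain lists are used instead of numpy to keep the example dependency-free.

```python
from itertools import product

temp_offsets = [-3, -2, -1, 0, 1, 2, 3]  # from the post
timing_shifts = list(range(7))           # 7 timings (placeholder values)
third_variable = list(range(7))          # hypothetical third variable
fourth_variable = list(range(8))         # hypothetical fourth variable

# Every combination of the four variables is one scenario to simulate.
scenarios = list(
    product(temp_offsets, timing_shifts, third_variable, fourth_variable)
)
print(len(scenarios))  # 7 * 7 * 7 * 8 = 2744

# The first stage on its own: one shifted copy of the forecast per offset.
hourly_temps = [-1.0, 0.5, 2.0]  # made-up forecast values
shifted = [[t + offset for t in hourly_temps] for offset in temp_offsets]
```

Because the stages multiply, dropping a single 7- or 8-value variable cuts the work to a few hundred arrays, which fits the timings reported in the post.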
Adam
Adam2y ago
Your memory metrics aren’t reaching anywhere close to that amount
jackson
jackson2y ago
ideally you do everything database related in 1 go, before you start post-processing data in memory
Adam
Adam2y ago
I wouldn’t recommend an upgrade to the teams plan here. The CPU power will be the same, just higher capacity. Same with memory. Have you tried multithreading? Judging by how you described the process, this sounds like something you could multithread pretty easily. Doesn't look like you're doing it atm judging by your metrics
jackson
jackson2y ago
yes i believe multiprocessing is the way to go here, multithreading would be better if this was I/O bound but since this sounds like number crunching multiprocessing is the move probably
Weatherman
WeathermanOP2y ago
Good idea I'm gonna try that and I'll update, thanks
Adam
Adam2y ago
There’s a difference between multiprocessing and multithreading?? did not know that lol
jackson
jackson2y ago
multiprocessing is great for cpu intensive stuff, multithreading is great for IO stuff like requests or db queries. watched a very handy video on it but those are the main takeaways
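The reason behind that split is that CPython's global interpreter lock keeps threads from running Python bytecode in parallel, while separate processes can each use their own CPU core. Below is a minimal sketch of fanning independent scenarios out over processes with the standard library. `simulate_scenario` is a hypothetical stand-in for the real snow cover calculation, and the scenario values are made up.

```python
from concurrent.futures import ProcessPoolExecutor
from itertools import product


def simulate_scenario(params):
    """Hypothetical stand-in for one CPU-bound snow cover calculation."""
    temp_offset, timing_shift = params
    # Made-up 30-hour forecast, shifted by the temperature offset.
    hourly_temps = [(-2.0 + 0.25 * hour) + temp_offset for hour in range(30)]
    # Count hours at or below freezing, starting at the shifted hour.
    return sum(1 for t in hourly_temps[timing_shift:] if t <= 0.0)


def run_all(scenarios, max_workers=None):
    """Run every scenario in a pool of worker processes, keeping input order."""
    # chunksize batches scenarios so each process receives fewer, larger tasks.
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(simulate_scenario, scenarios, chunksize=16))


if __name__ == "__main__":
    scenarios = list(product(range(-3, 4), range(7)))
    results = run_all(scenarios)
    print(len(results), "scenarios processed")
```

Whether this helps depends on how many CPU cores the deployment actually gets, and for arrays of only ~30 elements the cost of starting processes and passing data between them can outweigh the gain, so it is worth timing both versions.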
Adam
Adam2y ago
Gotcha