Explanation of horizontal scaling. Is it possible to autoscale?
Hi, I am writing a Node.js backend that will have a worker thread for every user, each doing endless-loop work.
The nature of the problem is that I don't know how many users I will get. Sometimes it can be 3, sometimes 100, so I need the option of a dynamic number of threads.
Is it possible to get more threads, or do I have to cap it at, let's say, 10 threads per app and use replicas? If I create 10 replicas with 10 threads each, can I scale up or down as I get more users?
43 Replies
Node.js can run multiple threads on a single core, so I don't see any issue with threads vs cores. It would then only come down to CPU usage, and your app will use as little or as much of the 32 vCPU as it needs.
There is an issue: how can I have 100 threads if I have 32 CPU cores?
All of the threads are blocking threads, so they can run for 10 days straight until the user stops the work.
that's simply not how it works, please look into threads vs cores as that topic is outside the scope of these forums
100 blocking threads cannot run on 32 cores
at most 32 threads will be running; the others will wait
yes they can, they would be interleaved
please look into threads vs cores as that topic is outside the scope of these forums
With remote computing, threads vs cores aren't something you have direct control over. On Railway, you aren't assigned physical cores for your processes; you're assigned virtual CPUs. I think of each vCPU as a single, powerful thread when planning multithreading.
How many users you can serve depends on the nature of your app. Your app needs a major redesign if you're tying up an entire thread for a long period for one user.
For us to help further, some specifics as to the app you're planning on hosting would be very helpful
Every thread is a bot doing endless searching on, let's say, a 1-minute interval; when it gets items, it does something with every item sequentially, then repeats.
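Roughly this, per bot (a sketch; `fetchItems` and `processItem` are placeholders for the real search and browser work):

```javascript
// One bot's cycle: fetch, process each item sequentially, wait, repeat.
async function runBot(userId, { fetchItems, processItem, intervalMs = 60_000, isStopped }) {
  while (!isStopped()) {
    const started = Date.now();
    const items = await fetchItems(userId);
    for (const item of items) {
      await processItem(item); // sequential, one item at a time
    }
    // sleep out the rest of the interval before the next search
    const elapsed = Date.now() - started;
    await new Promise((resolve) => setTimeout(resolve, Math.max(0, intervalMs - elapsed)));
  }
}
```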
I guess I will try to replicate an example on my CPU and test how much the workers interleave if I try to spawn too many.
Agreed, the app is in the concept stage so far and I am thinking about possible problems.
When you say endless searching, searching what? Why does it take that long? Can you implement multithreading into that process to speed it up?
Architecture wise, you're probably better off with two services here. One frontend to display a "please wait" message to the user, and a Backend API that does the actual searching and communicates with the front end
that way you can implement a queue system so users aren't shown a timeout page while waiting for a slot in the api
You can also host multiple replicas of that API that all communicate with one frontend to boost your capacity
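To sketch the queue idea (in-memory only and deliberately minimal; in production you'd more likely reach for something like BullMQ backed by Redis, but the principle is the same):

```javascript
// Minimal in-memory job queue with a fixed number of concurrent slots.
class JobQueue {
  constructor(capacity) {
    this.capacity = capacity; // max jobs running at once
    this.running = 0;
    this.pending = []; // jobs waiting for a free slot
  }
  enqueue(job) {
    this.pending.push(job);
    this.#drain();
  }
  #drain() {
    while (this.running < this.capacity && this.pending.length > 0) {
      const job = this.pending.shift();
      this.running++;
      // run the job, then free the slot and pull the next pending one
      Promise.resolve()
        .then(job)
        .finally(() => { this.running--; this.#drain(); });
    }
  }
}
```

Users beyond capacity simply wait in `pending` instead of hitting a timeout page.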
Users just start and stop the bot. There are separate endpoints to get info about the bot's work.
Ah so this is an api?
you're saying bot, discord bot?
The search is actually not time-intensive; the browser work is slow, running scripts for every item.
The same point I made about apis and replicas applies here then, you can have multiple replicas running the same api to alleviate bottleneck
I am trying to understand what horizontal scaling actually means. What does it solve in my case?
If I can spawn 100 threads from a single app, what do replicas help with?
Multiple apis running at once that all communicate with a user through the same link
You cannot spawn 100 threads
JS uses a single thread with an event loop. In this way, Node can handle thousands of concurrent connections without any of the traditional detriments associated with threads.
You can overload a single process with workers, but you will face worker timeouts. Horizontal scaling solves this by letting you run multiple instances of the API, each with many workers, all communicating with the user through one link.
This isn't about connections, there is a large amount of processing needed for each request
^
same principle applies
Sure, but there's a limit to how many requests can be served at once if each is eating a large portion of the available vCPU
yes, there's always a limit, but it isn't necessarily tied to the vCPU count
In this case, it likely is
@lowzyyy Horizontal scaling will help you by adding 8 more vCPU for each replica you create. A new API process will be spawned in the new container, allowing more users to be served.
Does that answer your question?
not directly, it's not
so the limit is not 32 CPU if I use replicas?
10 replicas = 80 vCPUs?
On the Hobby plan the limit is 8 vCPU per container
It's important for your understanding that they're virtual CPUs, not physical ones
but effectively yes
I am on hobby, but i am planning this for a company that i work for
Ah I see. In that case, each replica will give you an additional 32vCPU
As far as I understand, the main Node.js thread can spawn workers, and the main thread can then take requests on some endpoints for any user, so 1 thread for MAIN and 31 threads for workers?
The workers should not eat CPU time from the main thread.
My initial question was: if I have 10 users now and suddenly get 50 more all at once, how can I (or can I) dynamically add a replica and then shut it down when I don't need it?
Or should I predefine how many I can have in advance?
Because today I can serve 50, and in three days I may need to serve 150. I feel like I would have to manually restart the service and increase the replicas, or, if Brody is right, I might not need another replica at all if the workers can interleave. It will be slower for sure, but maybe that's acceptable?
Node.js uses a non-blocking event loop, as mentioned; please do some research on the topic
try computing prime numbers in the main thread and tell me if it's non-blocking
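For example (a quick self-contained demo; `countPrimes` is deliberately naive trial division):

```javascript
// CPU-bound work on the main thread blocks the event loop:
// the timer below cannot fire until countPrimes returns.
function countPrimes(limit) {
  let count = 0;
  for (let n = 2; n < limit; n++) {
    let isPrime = true;
    for (let d = 2; d * d <= n; d++) {
      if (n % d === 0) { isPrime = false; break; }
    }
    if (isPrime) count++;
  }
  return count;
}

setTimeout(() => console.log('timer finally fired'), 0); // queued...
const t0 = Date.now();
countPrimes(200_000); // ...but held up until this loop finishes
console.log(`event loop was blocked for ${Date.now() - t0} ms`);
```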
Google it if you don't believe me, as I've asked you to do many times
besides, that's completely different
if I have an operation that can be offloaded to C++ it will be, but if it's not, it will block
please do some research on the topic
I think we have a misunderstanding
Adam understood me, I think
respectfully, the misunderstanding is on your side of things
please try to explain so I can understand; "Google it" won't help me understand
which part don't I understand?
as mentioned, explaining how node works is outside of the scope of these threads
simply put, how many "things" you can do at once is not directly limited by the vcpu count, and if it is, you have done something incredibly wrong.
hey sorry had to get to a meeting
it does sound like you don’t have a full understanding of the tools you are using, so it will be difficult for you to predict performance issues
I suggest you get the app hosted, test it, and address any issues you find from there
load testing is what you’ll want to do. if you’re unable to hit a high peak without performance degradation, we can discuss potential solutions
fetch new items -> open new page for every item in chrome and do some work
By the time I finish with the browser, it's time to fetch new items and do the thing again, and so on.
Isn't that the only way to do it? One worker per user?
Only you have a full understanding of the problem you’re trying to solve
you could set up a task system such that a user sends in a request, then multiple workers work on multiple pages at once
the user’s request response time would be quicker
And you would also be able to service multiple users at once as workers can swap between users according to capacity as they complete jobs
Ultimately, it's up to your implementation
That is the problem: as far as I know, you cannot execute scripts on multiple pages. In Chrome, if you switch focus to one page, scripts pause on the others, so I can only do the job sequentially, or open a new Chrome instance for every item, which would be very memory-inefficient.
Users just read info from the DB; they only control when to start and stop the bot.
Sounds like you have it all figured out then
Only thing left to do is host, load test, and see what happens
Just to add a little to this discussion (a possible solution, although I don't know your full situation):
if you have really intensive tasks to process, most systems use some kind of queue to make sure everything works correctly and doesn't run out of memory or cause bottlenecks
In your case, since it's "infinite", you could just re-add the same job to the queue and it will be picked up once there is room.
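A minimal sketch of that re-add pattern (the queue is just an array and the runner is a toy loop, purely to show the shape):

```javascript
// "Infinite" work as a self-requeueing job: each run does one cycle,
// then re-adds itself so other users' jobs get a turn in between.
function makeRecurringJob(queue, userId, doOneCycle, isStopped) {
  const job = async () => {
    if (isStopped(userId)) return; // user pressed stop: don't requeue
    await doOneCycle(userId);
    queue.push(job); // re-add the same job; picked up once there is room
  };
  return job;
}

// Toy runner that drains the queue one job at a time.
async function runQueue(queue, shouldRun) {
  while (shouldRun()) {
    const job = queue.shift();
    if (job) await job();
    else await new Promise((resolve) => setTimeout(resolve, 10));
  }
}
```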
I’ve suggested that a few times as well, we’re on the same page