Is there a limit on the number of threads?
I have pods with different numbers of vcpus. I am running vllm. If I start too many vllm instances in parallel, I get errors like "can't create thread". Is there a parameter that limits the number of threads per pod?
This is the error I get
Yes, there is a limit. I ran into the same issue.
I think I hit this problem at around 1024 concurrent processes (meaning threads). You can always test it with thread swarming on the pod.
Thanks. Yeah the problem is that I get it with just 20 vllm in parallel.
What do you mean by thread swarming? Should I just spin up a number of threads to see what the limit is?
I guess that's what he meant
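For anyone who wants to try that, here is a minimal sketch of such a test, assuming Python 3 is available on the pod. The function name `probe_thread_limit` and the `cap` parameter are made up for illustration. It starts idle threads until creation fails or the cap is reached, then reports how many it managed to start.

```python
import threading


def probe_thread_limit(cap=4096):
    """Start idle threads until creation fails or `cap` is reached.

    Returns the number of threads that were started successfully.
    `cap` is a safety bound so the probe cannot run away on a large machine.
    """
    stop = threading.Event()
    threads = []
    try:
        while len(threads) < cap:
            # Each thread just blocks on the event, so it uses almost no CPU.
            t = threading.Thread(target=stop.wait, daemon=True)
            try:
                t.start()
            except RuntimeError:
                # CPython raises RuntimeError ("can't start new thread")
                # when the OS refuses to create another thread.
                break
            threads.append(t)
    finally:
        # Release and clean up every thread we managed to start.
        stop.set()
        for t in threads:
            t.join()
    return len(threads)
```

Calling `probe_thread_limit()` from a Python shell on the pod gives a rough number. If it returns the cap itself, no limit was hit below that value. Running it while the vllm instances are up shows how much headroom is left rather than the total limit.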
20 vllm in parallel? Is that 20 jobs, or what?
Yeah. I am running 20 vllms in different screen sessions.
Oh, how did you run it?
With vllm serve
Ohh, so this happens when you hit the limit?
Yes, on some pods I can run about 19 in parallel and on others about 28. It is related to the number of vcpus, but I don't understand how.
I would be happy to run, say, 64 in parallel. At some point I am hitting ram and vram limits. But that is OK.
I don't understand why I am hitting multithread limits when there is still ram and vram available.
Maybe the thread limit is related to the vcpu amount
Yeah definitely.
But it should still be able to multithread through time sharing. E.g. even with 1 vcpu I should be able to get 10 threads. But here it seems that I can get at most 2 threads per vcpu.
Yeah, I'm not sure about this. Maybe you should check with staff.
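One way to check for a configured cap, as opposed to guessing from the vcpu count, is to read the limits Linux exposes. This is only a sketch: the cgroup paths below are the usual cgroup v2 and v1 locations and may differ on a given pod, and nothing in this thread confirms that the platform sets any of these in proportion to vcpus. The helper names `read_first` and `thread_limits` are made up for illustration.

```python
import resource


def read_first(paths):
    """Return the stripped contents of the first readable path, else None."""
    for path in paths:
        try:
            with open(path) as f:
                return f.read().strip()
        except OSError:
            continue
    return None


def thread_limits():
    """Collect the common Linux limits that can cap thread creation."""
    # RLIMIT_NPROC counts processes and threads for the current user.
    # A value of -1 (resource.RLIM_INFINITY) means unlimited.
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return {
        "rlimit_nproc_soft": soft,
        "rlimit_nproc_hard": hard,
        # Container-level cap on tasks (assumed paths: cgroup v2, then v1).
        "cgroup_pids_max": read_first(
            ["/sys/fs/cgroup/pids.max", "/sys/fs/cgroup/pids/pids.max"]
        ),
        "cgroup_pids_current": read_first(
            ["/sys/fs/cgroup/pids.current", "/sys/fs/cgroup/pids/pids.current"]
        ),
        # System-wide thread ceiling.
        "kernel_threads_max": read_first(["/proc/sys/kernel/threads-max"]),
    }


if __name__ == "__main__":
    for name, value in thread_limits().items():
        print(f"{name}: {value}")
```

If `cgroup_pids_current` is close to `cgroup_pids_max` when the error appears, the container's task limit is the likely cause, not ram or vram. Comparing `cgroup_pids_max` across pods with different vcpu counts would show whether the cap scales with vcpus.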
@Space Burger
Escalated To Zendesk
The thread has been escalated to Zendesk!
Ticket ID: #11,317