Concurrent incoming requests and scaling

I noticed that a single Worker only handles 6 incoming requests concurrently, where a request counts as in flight until its response has been sent. This happens in wrangler dev and also when testing against the deployed Worker. Is that a limit per colo, or will more Worker instances eventually be spun up? If so, how can I test that? I've had around 30 requests waiting, but they are only processed 6 at a time.

Secondly, when doing the same with WebSocket requests, the concurrency limit appears to be 2. That is, new WebSocket connections are blocked until we have found the right DO to accept() the connection and the 101 response has been sent back through the Worker. This seems to severely limit how quickly clients can connect, because our routing to the DO is not cheap.

Again, is that something we have to live with, or will it eventually scale under more load?
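To make the "6 at a time" observation easier to reproduce, here is a rough sketch of how the overlap could be measured. It assumes each request yields a start and end timestamp for the time the Worker spent handling it, for example returned in the response body; that reporting, and the names `Span`, `peakConcurrency` and `collectSpans`, are hypothetical and not part of any Cloudflare API. It is probably best driven from a script rather than a browser, since browsers cap HTTP/1.1 connections per host on their own.

```typescript
// One request's handling window in milliseconds (assumed to be reported by the Worker).
interface Span {
  start: number;
  end: number;
}

// Sweep over start/end events and return the largest number of overlapping spans.
function peakConcurrency(spans: Span[]): number {
  const events: Array<[number, number]> = [];
  for (const s of spans) {
    events.push([s.start, 1]);
    events.push([s.end, -1]);
  }
  // Sort by time; at equal times process ends before starts,
  // so back-to-back requests do not count as overlapping.
  events.sort((a, b) => a[0] - b[0] || a[1] - b[1]);

  let current = 0;
  let peak = 0;
  for (const [, delta] of events) {
    current += delta;
    peak = Math.max(peak, current);
  }
  return peak;
}

// Start `count` tasks at once and collect the span each one reports.
// `task` is a placeholder for whatever sends one request and extracts its timestamps.
async function collectSpans(
  count: number,
  task: (index: number) => Promise<Span>,
): Promise<Span[]> {
  return Promise.all(Array.from({ length: count }, (_, i) => task(i)));
}
```

With 30 requests started together, a result of 6 from `peakConcurrency(await collectSpans(30, sendOne))` would match what I am seeing, where `sendOne` is a hypothetical function that calls the Worker and parses the timestamps.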