Queues are not scaling up consumers when using a regular (non-pull) consumer
Hi there, please have a look at this architecture diagram. I am building something in CF which requires running multiple scraping browsers that process messages coming in through a queue.
My producer is just a Hono app, a REST API that takes messages on POST and puts them on a CF queue, roughly like the sketch below.
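A minimal sketch of that producer (`MY_QUEUE` and the `/jobs` route are placeholder names, not my actual bindings):

```ts
import { Hono } from "hono";

// Assumed producer binding name; the real one will differ.
type Bindings = {
  MY_QUEUE: Queue;
};

const app = new Hono<{ Bindings: Bindings }>();

// Take a job on POST and put it on the queue.
app.post("/jobs", async (c) => {
  const job = await c.req.json();
  await c.env.MY_QUEUE.send(job);
  return c.json({ queued: true }, 202);
});

export default app;
```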
My consumer is a plain old queue consumer (not pull-style). It has a Service Binding to a 3rd worker, which it invokes via RPC, sketched below.
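Here `SCRAPER` is a placeholder for the Service Binding and `runScrape()` is a stand-in for the actual RPC method:

```ts
// Assumed shape of the Service Binding to the third worker; runScrape() is a
// placeholder for whatever RPC method it actually exposes.
interface Env {
  SCRAPER: { runScrape(url: string): Promise<void> };
}

export default {
  // Push-style consumer: the runtime invokes this with a batch of up to
  // max_batch_size messages collected within max_batch_timeout.
  async queue(batch: MessageBatch<{ url: string }>, env: Env): Promise<void> {
    await Promise.all(
      batch.messages.map(async (msg) => {
        try {
          // RPC over the Service Binding into the third worker.
          await env.SCRAPER.runScrape(msg.body.url);
          msg.ack();
        } catch {
          msg.retry();
        }
      }),
    );
  },
};
```

As far as I understand, everything inside one invocation runs concurrently, but a second batch only runs in parallel if the queue decides to spin up another consumer invocation (max_concurrency permitting).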
The third worker is a collection of puppeteer scripts. It houses 1 DO for managing the connections to an outbound BaaS provider; let's call this DO "library" (since you can check out slots and return them when done, similar to books). The third worker then has 10 different DOs (let's call them browser-worker DOs) for those 10 slots, which are used outbound to the BaaS over WebSockets. A rough sketch of how the third worker ties these together follows.
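To make the flow concrete, here is a simplified sketch of how the third worker could tie the library DO and the browser-worker DOs together (all names are placeholders, and the checkout/return endpoints are purely illustrative):

```ts
import { WorkerEntrypoint } from "cloudflare:workers";

// Placeholder namespace bindings: LIBRARY holds the single slot-tracking DO,
// BROWSER holds one DO per slot.
interface Env {
  LIBRARY: DurableObjectNamespace;
  BROWSER: DurableObjectNamespace;
}

export default class Scraper extends WorkerEntrypoint<Env> {
  // RPC method the consumer calls over the Service Binding.
  async runScrape(url: string): Promise<void> {
    // Ask the library DO for a free slot (0-9).
    const library = this.env.LIBRARY.get(this.env.LIBRARY.idFromName("library"));
    const checkout = await library.fetch("https://library/checkout", { method: "POST" });
    const { slot } = (await checkout.json()) as { slot: number };

    try {
      // The browser-worker DO for that slot owns the outbound WebSocket to the BaaS.
      const browser = this.env.BROWSER.get(this.env.BROWSER.idFromName(`slot-${slot}`));
      await browser.fetch("https://browser/scrape", {
        method: "POST",
        body: JSON.stringify({ url }),
      });
    } finally {
      // Return the slot when done, like returning a book.
      await library.fetch("https://library/return", {
        method: "POST",
        body: JSON.stringify({ slot }),
      });
    }
  }
}
```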
Everything below is in terms of real CF deployments.
Here's what stumps me: when I push 3 messages (my current batch size) in a row, and they land within the batch interval, a consumer run gets triggered and concurrently processes those 3 messages across the several browser-worker DO objects.
But let's say I'm late in publishing 2 of those messages, so I publish 1, wait out the batch interval, and then publish 2 more. Those 2 don't get picked up until the first run of the consumer is over. I scoured the docs on this, and they say that scale-up is automatic and based on how much message pressure there is currently in the queue.
I am launching this in 15 days and have no way of knowing what this setup will do under pressure (without spending at my BaaS), unless I set up mock scripts that just keep busy for 60-100 sec each (that's roughly what each message takes to process; sketched below). It would be good to know, at least theoretically, whether I have misunderstood some aspect of this: is the Service Binding considered busy by the runtime while one consumer invocation is in progress, and is that the bottleneck that stops new consumers from being spun up?
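For reference, the mock I have in mind would be the same consumer with the RPC call swapped for a sleep, something like:

```ts
// Same consumer shape, but the RPC call is replaced with a sleep so each
// message "processes" for 60-100s without touching the BaaS.
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

export default {
  async queue(batch: MessageBatch<unknown>): Promise<void> {
    await Promise.all(
      batch.messages.map(async (msg) => {
        // Simulate a scrape that takes somewhere between 60 and 100 seconds.
        await sleep(60_000 + Math.random() * 40_000);
        msg.ack();
      }),
    );
  },
};
```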
Thanks for the help!
FYI there's a limit of max 2 parallel browsers
Also, from my experience, sometimes queues don't fire immediately (even though I set the threshold to be very minimal). I thought it was a bug, but after cooking and finishing my ramen, it was running
Thanks! I am using Zenrows, which offers 10 parallel browsers.
I thought so @TW - I could sometimes see it processing in parallel, but not with enough consistency. Which threshold are you referring to, just curious? Batch size and interval?
Thanks for the tip, but its pricing is too steep for my current load (min $69/mo)
The smallest max_batch_size (1) and max_batch_timeout (0). My reasoning is that those small values should trigger the consumer immediately (well, it does, sometimes)
Continuing the conversation here: https://discord.com/channels/595317990191398933/1325281450622193696/1325556057283756078
Cooked many a ramen bowl, but they simply don't scale up, even with a noop implementation that takes 60s (with a setTimeout).