Async Processing
so our users are connected to our worker indefinitely while the article is created
So let's break this apart:
1. You have clients blocking (HTTP? WebSocket?) while the background job processes
2. Some jobs take > 15 minutes to process, over several HTTP calls. This means that each consumer is effectively tied up while it's working on that prompt, which ties the number of consumers you need to the number of users you have - so even if we could increase to, say, 20 or 30, you'd quickly run into the same problem at even (just) 2-3x volume, because it sounds like it scales linearly. It gets even worse if new users ask for more work to be done.
3. You're (somehow?) snapshotting/checkpointing progress per job?
no, #1 is not correct
this is for our new background processing feature
there is no HTTP request
if there were an HTTP request there would be no problem, because the worker can stay alive indefinitely...
that is what we are already doing with KoalaWriter right now in production
I'm having to make assumptions 🙂
Can you clarify the entire flow please?
an API request comes in and returns immediately, submitting the job to an internal queue I made on top of DO. then that eventually gets submitted to CF Queues, which gets consumed to actually write the article
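A rough sketch of that flow in Workers-style TypeScript. The binding names (`JOB_QUEUE_DO`, `ARTICLE_QUEUE`), the job shape, and `writeArticle` are assumptions for illustration, not the actual code from the app:

```ts
// Sketch of the described flow, with assumed binding names.
interface Env {
  JOB_QUEUE_DO: DurableObjectNamespace; // the internal queue built on DO (assumed name)
  ARTICLE_QUEUE: Queue;                 // CF Queues producer binding (assumed name)
}

export default {
  // 1. The API request returns immediately after handing the job to the DO queue.
  async fetch(request: Request, env: Env): Promise<Response> {
    const job = await request.json();
    const stub = env.JOB_QUEUE_DO.get(env.JOB_QUEUE_DO.idFromName("queue"));
    await stub.fetch("https://do/enqueue", {
      method: "POST",
      body: JSON.stringify(job),
    });
    return new Response("queued", { status: 202 });
  },

  // 3. A CF Queues consumer eventually receives the job and writes the article.
  async queue(batch: MessageBatch, env: Env): Promise<void> {
    for (const message of batch.messages) {
      await writeArticle(message.body); // hypothetical long-running work
      message.ack();
    }
  },
};

// 2. The internal DO queue forwards jobs on to CF Queues
//    (immediately here; the real one defers).
export class JobQueueDO {
  constructor(private state: DurableObjectState, private env: Env) {}

  async fetch(request: Request): Promise<Response> {
    const job = await request.json();
    await this.env.ARTICLE_QUEUE.send(job); // submit to CF Queues
    return new Response("ok");
  }
}

async function writeArticle(job: unknown): Promise<void> {
  // ... placeholder for the actual article generation
}
```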
CF Queues sold itself on being horizontally scalable so I am surprised that scaling to even just 20 consumers is a problem
if I created a server that simply called Workers and kept the connection open then that could stay open indefinitely AND horizontally scale... why can't Queues just do that?
We're in beta, so hold tight 🙂
Relying on holding connections open indefinitely is neither reliable nor scalable though, which is my point.
understood, I know it is still early
which is part of why we are using queues, so we don't have to hold connections open
Right, so back to my previous point: are you expecting infinite scaling?
well, with workers we are getting infinite scaling already. we have hundreds of concurrent users writing articles right now and there is zero problem if it is open for 15+ minutes. the problem has just popped up when we are trying to do background processing
you could have 100 users connected to Workers on 1-hour-long HTTP requests with no problem
so I was surprised that you could only have a max of 10 concurrent queue consumers
the scalability of Workers vs Queues is very, very different right now which just surprised me
I already built a more scalable queue with Durable Objects. the only reason I am using CF Queues is the 30-second limit on waitUntil after the request is closed when calling a Worker
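For context, the limit being referenced is the bounded grace period that `waitUntil()` work gets once the response has been returned. A minimal sketch, with `generateArticle` as a hypothetical stand-in for the long-running work:

```ts
export default {
  async fetch(request: Request, env: unknown, ctx: ExecutionContext): Promise<Response> {
    // Respond immediately and keep working in the background. Once the
    // response is returned, waitUntil() work only gets a bounded grace
    // period (30 seconds at the time of this thread), so multi-minute
    // article generation can't live here.
    ctx.waitUntil(generateArticle(request)); // hypothetical long task
    return new Response("accepted", { status: 202 });
  },
};

async function generateArticle(request: Request): Promise<void> {
  // ... placeholder for 15+ minutes of article generation
}
```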
we could break up each part of the article creation into different steps and submit each individual part to the queue. that would increase the cost and time quite a bit to get around the 15-minute limit, but then we would still get hit with the 10 concurrent consumer limit. since it is built on top of Workers, I am surprised that it isn't more horizontally scalable, since that is already built into Workers
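One way that step-splitting could look: each queue message carries a step marker, and the consumer re-enqueues the next step so no single invocation approaches the 15-minute limit. The step names and message shape here are assumptions:

```ts
interface StepMessage {
  articleId: string;
  step: "outline" | "draft" | "polish"; // assumed step names
}

interface Env {
  ARTICLE_QUEUE: Queue<StepMessage>; // producer binding back onto the same queue (assumed)
}

export default {
  async queue(batch: MessageBatch<StepMessage>, env: Env): Promise<void> {
    for (const message of batch.messages) {
      const { articleId, step } = message.body;
      await runStep(articleId, step); // hypothetical per-step work, each well under 15 min
      const next = nextStep(step);
      if (next) {
        await env.ARTICLE_QUEUE.send({ articleId, step: next }); // chain the next step
      }
      message.ack();
    }
  },
};

function nextStep(step: StepMessage["step"]): StepMessage["step"] | null {
  const order: StepMessage["step"][] = ["outline", "draft", "polish"];
  const i = order.indexOf(step);
  return i < order.length - 1 ? order[i + 1] : null;
}

async function runStep(articleId: string, step: string): Promise<void> {
  // ... placeholder for one unit of generation work
}
```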
maybe I can just do the background task in my DO queue itself. it doesn't appear to have runtime limits when called from an alarm?
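A minimal sketch of what alarm-driven processing inside the DO might look like, assuming a simple storage-backed job list (the key scheme and `processJob` are hypothetical):

```ts
export class JobQueueDO {
  constructor(private state: DurableObjectState, private env: unknown) {}

  async fetch(request: Request): Promise<Response> {
    // Enqueue: persist the job and make sure an alarm is armed.
    const job = await request.json();
    await this.state.storage.put(`job:${Date.now()}:${crypto.randomUUID()}`, job);
    if ((await this.state.storage.getAlarm()) === null) {
      await this.state.storage.setAlarm(Date.now());
    }
    return new Response("queued", { status: 202 });
  }

  async alarm(): Promise<void> {
    // Pop the oldest job, process it, then re-arm if more work remains.
    const jobs = await this.state.storage.list({ prefix: "job:", limit: 1 });
    for (const [key, job] of jobs) {
      await processJob(job); // hypothetical long-running article work
      await this.state.storage.delete(key);
    }
    const remaining = await this.state.storage.list({ prefix: "job:", limit: 1 });
    if (remaining.size > 0) {
      await this.state.storage.setAlarm(Date.now());
    }
  }
}

async function processJob(job: unknown): Promise<void> {
  // ... placeholder for article generation
}
```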
alright, just removed CF Queues and am going to do the task processing in the queue I built on top of DO. I'm still confused by the massive scalability difference between Queues and other CF products. I have come to expect a certain level of scalability with CF products, and it is honestly really surprising to me that you think more than 10 consumers is some sort of edge case
We don't think it's an edge case, and I didn't say that 🙂
- There's a difference between an HTTP Worker and a Queue consumer — a Queue consumer can run for a lot longer and is (often) processing batches of messages (see the sketch after this list).
- If you need your queue consumer to scale infinitely — e.g. matching 1:1 the number of "active users" (those requesting articles) of your app — then there might be a mismatch between what you're expecting from Queues and what it can deliver (at least for now).
- We're in beta, and so there's still work for us to do in order to scale out, but I will be transparent: "infinite consumers" is just a very different design goal. A good comparison is SQS: up to ~1000 consumers, but with a way to limit that for cost/backpressure reasons.
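For reference, the consumer shape being described, with per-message ack/retry (`handleMessage` is a hypothetical stand-in):

```ts
export default {
  // One consumer invocation handles a whole batch (size/timeout come from the
  // consumer configuration), so it can stay busy far longer than a typical
  // millisecond-scale HTTP invocation.
  async queue(batch: MessageBatch, env: unknown): Promise<void> {
    for (const message of batch.messages) {
      try {
        await handleMessage(message.body); // hypothetical per-message work
        message.ack();                     // done: remove from the queue
      } catch {
        message.retry();                   // failed: redeliver later
      }
    }
  },
};

async function handleMessage(body: unknown): Promise<void> {
  // ... placeholder
}
```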
> There's a difference between an HTTP Worker and a Queue consumer — a Queue consumer can run for a lot longer and is (often) processing batches of messages.

but this is the part I don't understand. today, in production, we have hundreds of concurrent, long-running HTTP requests to Workers. saying "a Queue consumer can run for a lot longer" is simply false, since a Worker can run indefinitely...
A Worker cannot run indefinitely: there are CPU limits (and still overall network timeout limits!). Most HTTP Workers are invoked for milliseconds at most; Queue consumers are often running for far longer, consuming far more CPU.
yes, CPU limits, but there are not limits on the overall runtime of a Worker while the client is still connected to it 🙂
At the end of the day, the ability to run (many) more concurrent consumers will come, but I'm also being clear with you now that "infinite" is not something on the medium-term roadmap for Queues, so that you can find an alternative approach (and I think DO is the right fit here, as I called out earlier in the conversation)
makes sense. I did already switch the task processing to run directly on the queue I built on DO
the only sticking point now is that it gets evicted when the Worker code is updated
is the main issue with queue consumer scalability that they need to be allocated more CPU, or something like that? if so, an option to have a lower CPU limit could make sense for tasks that are just IO-bound like a typical Worker. no idea if that would make sense, it was just a thought I had