Looking for a really simple explanation of durable objects vs. workers

One problem I repeatedly find with CF docs is that they're not very friendly and seem to assume a lot of knowledge already. I would love to see each product overview have an intro video or something that really explains what the thing is. The docs seem to jump straight into highly technical concepts. With this in mind, what is the relationship between durable objects and workers? Right now I have a worker which powers a chatbot. Users talk to our system via a web interface and our system (i.e. the worker) responds. I'm keen to learn more of the CF ecosystem and would like to see which parts of it might fit into our stack and improve it. Are durable objects their own thing, or are they used in conjunction with workers? I know this is a very basic question, but I appreciate any help!
10 Replies
kian•9mo ago
Durable Objects can only be invoked by (or via) a Worker. You can export a Durable Object from one Worker and use it from many other Workers.
Workers are stateless, so they need to store state somewhere (e.g. KV, R2, or even DOs), whereas Durable Objects are basically stateful Workers. A given Durable Object stub represents a globally unique instance; it will never be running in two places. DOs have transactional key-value storage attached to them, and they're strongly consistent.
I'd recommend looking at https://blog.cloudflare.com/durable-objects-easy-fast-correct-choose-three
MityaOP•9mo ago
Ah OK that helps, thanks very much. So are DOs just state/storage, or can you run them by themselves i.e. they have code too? Are they always used with workers?
kian•9mo ago
They can be compute, e.g. used for multiplayer co-ordination. You can have a DO acting as a 'room' and have multiple WebSockets (clients) connected to it.
They're always used with Workers. You can have a WebSocket connection with a DO, but it is initiated via a Worker.
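A rough sketch of the 'room' idea, for illustration only. This is plain TypeScript rather than the actual Durable Objects API: `ChatRoom`, `ClientSocket` and `recordingSocket` are hypothetical names, and the fake socket stands in for a real WebSocket. The point is just that one object holds every connection for a room, so relaying a message is a simple loop.

```typescript
// Hypothetical stand-in for a connected WebSocket; only send() is needed here.
interface ClientSocket {
  send(message: string): void;
}

// One ChatRoom models one Durable Object instance: a single place that all
// clients of a given room are connected to.
class ChatRoom {
  private clients = new Map<string, ClientSocket>();

  join(clientId: string, socket: ClientSocket): void {
    this.clients.set(clientId, socket);
  }

  leave(clientId: string): void {
    this.clients.delete(clientId);
  }

  // Relay a message to everyone in the room except the sender.
  // Returns how many clients received it.
  broadcast(senderId: string, text: string): number {
    let delivered = 0;
    this.clients.forEach((socket, id) => {
      if (id === senderId) return;
      socket.send(`${senderId}: ${text}`);
      delivered++;
    });
    return delivered;
  }
}

// Illustration helper: a fake socket that records what it receives.
function recordingSocket(): ClientSocket & { inbox: string[] } {
  const inbox: string[] = [];
  return {
    inbox,
    send: (message: string): void => {
      inbox.push(message);
    },
  };
}

const lobby = new ChatRoom();
const alice = recordingSocket();
const bob = recordingSocket();
lobby.join("alice", alice);
lobby.join("bob", bob);
lobby.broadcast("alice", "hello");
console.log(bob.inbox); // [ 'alice: hello' ]
```

If each client instead talked to its own stateless Worker instance, there would be no shared `clients` map to loop over, which is why the room lives in a single object.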
MityaOP•9mo ago
Aha great, thank you. Finally, what is the advantage of using a DO for, say, rate-limiting (as per the example in the CF docs)? Why wouldn't I just apply rate-limiting logic directly within my worker, since I can access the user's IP there? As I understand it, Workers/serverless means I don't really need to care about compute or the CPU's capability as I would with a tin box host, so what is the need for me to split that out to something else, i.e. a DO?
kian•9mo ago
How would you keep track of the limit? Workers are per-metal: you can hit a Worker URL with cURL 10 times and hit 10 different Workers. They're stateless, so you have to store that counter somewhere. You can't use KV, since that's eventually consistent with a minimum 60s cache. DOs are strongly consistent and globally unique, so you're not going to have race conditions or the counter going up by 5 but then later going down by 3.
The distinction here is that Workers are everywhere, and there will be multiple instances of your Worker running globally at any given time (assuming it's being requested globally). A given DO will never be running more than once at a time.
Think of it like a game server. If clients connected to a Worker, they'd all be on their own, or in super rare scenarios maybe with someone else close to them. If clients connected to a DO, they'd all be in the same server.
Use DOs for games, chatrooms, storage/counters that need strong consistency, etc.
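To make the "counter per Worker" problem concrete, here is a small simulation in plain TypeScript. It does not use any Cloudflare API. `LocalLimiter` and `allowedAcross` are made-up names, and the round-robin spread of requests is an assumption chosen only to keep the numbers easy to follow.

```typescript
// A naive limiter that keeps its counter in memory, as a stateless Worker
// instance would have to if it had no shared store.
class LocalLimiter {
  private count = 0;
  private readonly limit: number;

  constructor(limit: number) {
    this.limit = limit;
  }

  // Returns true if the request is allowed.
  allow(): boolean {
    if (this.count >= this.limit) return false;
    this.count++;
    return true;
  }
}

// Send `requests` requests, spread round-robin over `instances` independent
// limiters, and count how many get through in total.
function allowedAcross(instances: number, limit: number, requests: number): number {
  const limiters: LocalLimiter[] = [];
  for (let i = 0; i < instances; i++) {
    limiters.push(new LocalLimiter(limit));
  }
  let allowed = 0;
  for (let r = 0; r < requests; r++) {
    if (limiters[r % instances].allow()) allowed++;
  }
  return allowed;
}

console.log(allowedAcross(10, 5, 100)); // 50: ten separate counters, 5 each
console.log(allowedAcross(1, 5, 100));  // 5: one shared counter
```

With ten independent counters the intended limit of 5 turns into 50. A single shared counter, which is the role the DO plays, holds it at 5.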
MityaOP•9mo ago
Right! That's super helpful. You should write the CF docs! Ah yes, of course, Workers are stateless so my rate limiter within a worker idea couldn't work - unless I stored this info against IPs in a DB, or perhaps via async local storage. But you've answered my question about what this is all about, so thanks very much 🙂
kian•9mo ago
ALS would be per Worker too.
To visualise it, run curl https://cloudflare.com/cdn-cgi/trace | grep 'fl='
Each unique value you see is a different Worker. If you set a limit of 10 requests per minute, and I hit 20 different Workers, I could actually make 200 requests in a minute.
If you used DOs, it'd be more like...
// idFromName derives the same ID from the same string every time
// (idFromString expects an existing hex ID, not an arbitrary name).
const id = env.LIMITER.idFromName(clientIp);
const stub = env.LIMITER.get(id);

// Call the method on the stub, not on the namespace binding.
const limited = await stub.check();
That creates a Durable Object instance, unique to that client IP, close to where the Worker has been invoked. For the same clientIp, that will never go anywhere but that one Durable Object. It's globally unique. Even if you had 100 Workers globally being hit by that IP somehow, they would all go talk to this Durable Object.
Since it's created close to the Worker that first called get(...) on it, you don't have the issue of 'always check the DB in <region>'. If your DB is in the US, that's fine for clients in the US. What if I'm in Europe, or Asia? Now every request is slow since you're going to the US to see if I'm limited. With my unique DO, for my IP, created close to me, it's fast.
fwiw, this is just to explain the concept - in reality you'd use the rate limit binding https://developers.cloudflare.com/workers/runtime-apis/bindings/rate-limit/
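For the other side of that call, here is a minimal sketch of what a `check()` method might do. Everything in it is an assumption for illustration: `FixedWindowLimiter` is a hypothetical plain class, not the Durable Objects base class. The fixed-window algorithm is just one simple choice, the state lives in memory where a real DO would more likely persist it through its attached storage, and the current time is passed in so the behaviour is easy to follow.

```typescript
// Sketch of the logic a limiter object could run for one key (e.g. one IP).
// State is in memory here; this is an illustration, not the real DO API.
class FixedWindowLimiter {
  private windowStart = 0;
  private count = 0;
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true when the caller should be rejected (is "limited").
  check(now: number): boolean {
    // Start a fresh window once the current one has fully elapsed.
    if (now - this.windowStart >= this.windowMs) {
      this.windowStart = now;
      this.count = 0;
    }
    if (this.count >= this.limit) return true;
    this.count++;
    return false;
  }
}

// 2 requests allowed per 60-second window; timestamps are in milliseconds.
const limiter = new FixedWindowLimiter(2, 60000);
console.log(limiter.check(0));     // false
console.log(limiter.check(1000));  // false
console.log(limiter.check(2000));  // true  (third request in the window)
console.log(limiter.check(60000)); // false (a new window has started)
```

Because only one instance exists per key, this counter needs no locking or cross-region reconciliation. Since `windowMs` is just a constructor argument, the period can be whatever the application needs.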
MityaOP•9mo ago
Thanks for this detailed explanation, that helps a great deal. I get now why you wouldn't control this in the worker itself, as there could be lots of workers and they are stateless. One final thing I'm a bit hazy on: from what you're saying, a DO sounds like a closed, sandboxed state per user. Is there a global state? Suppose (ridiculously) that I wanted to shut down my app after the first 100 requests from all users, not per user. Could I use a DO for this? It would need to be aware of, and react to, the total number of requests, not the number per user. Hope that makes some sort of sense.
kian•9mo ago
DO is "per whatever-you-want-them-to-be". You can replace clientIp with appName and count based on that if you wanted.
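A toy illustration of that point, again in plain TypeScript rather than the real API. `CounterNamespace` is a hypothetical stand-in for a DO namespace whose only property is that the same name always resolves to the same instance. `handleRequest` and `MAX_TOTAL_REQUESTS` are made-up names for the "shut down after 100 requests" example.

```typescript
// Hypothetical stand-in for a Durable Object namespace: the same name always
// resolves to the same counter instance, and different names never share one.
class CounterNamespace {
  private instances = new Map<string, { count: number }>();

  get(name: string): { count: number } {
    let instance = this.instances.get(name);
    if (!instance) {
      instance = { count: 0 };
      this.instances.set(name, instance);
    }
    return instance;
  }
}

const MAX_TOTAL_REQUESTS = 100;
const counters = new CounterNamespace();

// Keying by app name (instead of client IP) means every request, from every
// user, lands on the same counter.
function handleRequest(appName: string): "ok" | "shut down" {
  const counter = counters.get(appName);
  if (counter.count >= MAX_TOTAL_REQUESTS) return "shut down";
  counter.count++;
  return "ok";
}
```

Passing a client IP as the name would give one counter per user. Passing a fixed app name gives one counter for everyone. The object itself does not change, only the key used to look it up.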
jason•9mo ago
@kian What's the advantage of the rate limit API over creating my own durable object for rate limiting? I wrote my own before learning today that there's a rate limiter API. lol
edit: It looks like my own gives me the flexibility to set the time period as desired. In my case, I want a 15min period, but the rate limiter API only offers either 10sec or 60sec, it seems. So I'll probably keep what I have.
Oh... and the built-in rate limiter is only for a particular location, not global. OK, I think I mostly answered my own question: it's too limited for my use case (i.e. rate limiting per user globally, 15min period).
