How many workers are running?

Is there a way to see how many worker instances are running for my site? I'm running an in-memory cache, and just trying to estimate what the cache hit rate will be.
Hello, I’m Allie!
There isn't a way to count Worker instances. We generally recommend not bothering with in-memory caches though, because unless you are caching between requests from a single app on a single device, the odds of many requests hitting the same Worker are decently low.
Chaika · 8mo ago
Workers run on every Cloudflare machine/metal, and there's no particular affinity to a specific one other than what keepalive does (which ties a single user to a single machine for the duration of that session). Users will naturally just split out among many metals.
tskuzzy (OP) · 8mo ago
Gotcha, thanks. I assume there's some limit to how many machines it's scaled out to (depending on load), so in-memory caching will still help for certain workloads. For example, I noticed unkey uses tiered caching, where the bottom tier is in-memory: https://github.com/unkeyed/unkey/blob/main/packages/cache/src/stores/memory.ts
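(For reference, this is roughly the shape of such a tiered cache: check a per-isolate in-memory Map first, fall back to a slower shared tier, and backfill on the way out. This is a sketch, not unkey's actual code; `SlowStore` and `tieredGet` are illustrative names.)

```ts
// Sketch of a two-tier cache: a per-isolate in-memory Map in front of a
// slower shared store (KV, Redis, an origin API, etc.).
interface SlowStore {
  get(key: string): Promise<string | null>;
}

const memory = new Map<string, { value: string; expires: number }>();

async function tieredGet(
  key: string,
  slow: SlowStore,
  ttlMs = 60_000,
): Promise<string | null> {
  // Memory tier: free if this request happens to land on a warm isolate.
  const hit = memory.get(key);
  if (hit && hit.expires > Date.now()) return hit.value;

  // Fall back to the shared tier, then backfill the memory tier.
  const value = await slow.get(key);
  if (value !== null) {
    memory.set(key, { value, expires: Date.now() + ttlMs });
  }
  return value;
}
```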
Chaika · 8mo ago
> I assume there's some limit to how many machines it's scaled out to (depending on load)
No, Cloudflare Workers don't work that way. They call it a homogeneous deployment: the Worker runs on whichever machine the request gets load-balanced to, Workers or not. The only scaling, per se, is that the same machine may run multiple isolates at once (each isolate handling some number of active requests). It's both a really cool bit and a slightly annoying bit to work around, as your Workers can truly run everywhere: on every metal/machine, in every edge/colo/datacenter.
> where the bottom tier is in-memory: https://github.com/unkeyed/unkey/blob/main/packages/cache/src/stores/memory.ts
That's not normal Worker memory; it looks like Durable Object in-memory state. Durable Objects only exist once globally per ID and will stay alive if you keep sending them constant requests. Nvm, I followed it down the rabbit hole of abstraction and it looks like it might not be used just in a Durable Object context; I was assuming it was because of their use of state. Still, if the same user does a lot of requests per session they would keep hitting the same machine & isolate (and/or if your service is super popular), so it could be helpful.
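(To illustrate the Durable Object version of the pattern: since a given ID maps to a single object instance globally, a plain class field survives across requests for as long as the object stays alive. A minimal sketch; `CacheDO` and `loadFromOrigin` are hypothetical names, not unkey's code.)

```ts
// Hypothetical Durable Object keeping a per-ID in-memory cache. All
// requests routed to the same ID reach this single instance, so the Map
// persists between requests while the object stays alive.
export class CacheDO {
  private cache = new Map<string, string>();

  async fetch(request: Request): Promise<Response> {
    const key = new URL(request.url).searchParams.get("key") ?? "";

    const hit = this.cache.get(key);
    if (hit !== undefined) {
      return new Response(hit, { headers: { "x-cache": "HIT" } });
    }

    // Assumption: loadFromOrigin stands in for whatever slower tier
    // (KV, R2, an origin fetch) the real code would fall back to.
    const value = await loadFromOrigin(key);
    this.cache.set(key, value);
    return new Response(value, { headers: { "x-cache": "MISS" } });
  }
}

async function loadFromOrigin(key: string): Promise<string> {
  // Placeholder for the authoritative data source.
  return `value-for-${key}`;
}
```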
rayberra · 8mo ago
Piggybacking on this question if I may. First, what happens when, say, a Worker allocates/uses 50 MB of JS heap inside the request handler for the lifetime of a somewhat slow request, and the client/app sends a handful of simultaneous requests that I presume would hit the same metal? I'm trying to understand how the memory load/limit is handled. Second, in-memory caching sounds potentially really useful. Do I just do a global var pleaseDont_cachedThingy /* = stuff will come and go */;? My idea is to try to skip sending/processing the same input over and over again when doing e.g. inpainting with Workers AI.
kian · 8mo ago
There's a soft eviction threshold (in-flight requests are allowed to finish, then the Worker is killed) and a hard eviction threshold (the Worker is killed and in-flight requests get a 1102 Resources Exceeded error back), so you can go a little over the 128 MB limit. The chances of hitting the same metal, outside of something like a browser with a connection kept alive, are pretty low though.
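(And yes, for the "global var" idea: module-scope state is the usual way to do it. It lives for the lifetime of the isolate, and each isolate on each metal has its own independent copy, so treat it strictly as a best-effort cache. A sketch under those assumptions; `expensiveProcess` and `hashBody` are hypothetical helpers.)

```ts
// Module-scope cache: lives as long as this isolate does. Every isolate
// on every metal has its own copy, so misses are normal and harmless.
const inputCache = new Map<string, ArrayBuffer>();

export default {
  async fetch(request: Request): Promise<Response> {
    const key = await hashBody(request);

    let processed = inputCache.get(key);
    if (processed === undefined) {
      // Assumption: expensiveProcess stands in for the real work,
      // e.g. preparing an inpainting input for Workers AI.
      processed = await expensiveProcess(request);
      inputCache.set(key, processed);
    }
    return new Response(processed);
  },
};

async function hashBody(request: Request): Promise<string> {
  // Clone so the original body remains readable for processing.
  const bytes = await request.clone().arrayBuffer();
  const digest = await crypto.subtle.digest("SHA-256", bytes);
  return Array.from(new Uint8Array(digest))
    .map((b) => b.toString(16).padStart(2, "0"))
    .join("");
}

async function expensiveProcess(request: Request): Promise<ArrayBuffer> {
  // Placeholder for the real processing step.
  return request.arrayBuffer();
}
```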