'Global' Worker for Logging (alternative for logflare App)
Cloudflare is discontinuing "Apps" at the end of August. The only "App" I am using is logflare (global http request / response logging) - https://cloudflareapps.com/apps/logflare
I've been looking into the best way to migrate to using Workers directly instead of "Apps". The issue I'm running into is that I have many existing Workers for my domain, on many different routes. The one nice thing about "Apps" was that they were essentially a "global" Worker that ran before any other Worker in your zone. Because of this, logflare was able to "wrap" every request, including timing information, regardless of whether it went directly to Origin or through a Worker.
I'm wondering if anyone has suggestions for the best way to reproduce this behavior with my own Worker? As I said, for routes without existing workers, it is fairly straightforward. I can just create a custom logflare worker, which runs on those unused routes. My only guess on how to "wrap" existing workers (so that the logging includes their response timing) is to turn off routes for all existing workers, and instead have one single "global" worker that inspects the incoming request and then calls the associated existing worker if a route exists for it.
This seems... very messy, and would like to avoid that at all costs if I can! Definitely feels like working against the system, instead of with it. Thank you for any tips or advice you can give.
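For illustration, the single "global" Worker idea described above could look roughly like the sketch below. The route table, the prefixes and the handler bodies are all made up; in a real setup each handler would be the fetch logic of one of the existing Workers, and the object would be the Worker's default export.

```javascript
// Hypothetical route table: each entry stands in for one existing Worker.
const routes = [
  { prefix: "/api/", handler: async () => new Response("api worker") },
  { prefix: "/api/v2/", handler: async () => new Response("api v2 worker") },
];

// Pick the handler with the longest matching prefix, or null if none matches.
function pickHandler(pathname, table) {
  let best = null;
  for (const route of table) {
    if (
      pathname.startsWith(route.prefix) &&
      (best === null || route.prefix.length > best.prefix.length)
    ) {
      best = route;
    }
  }
  return best ? best.handler : null;
}

// In a real Worker this object would be the default export.
const globalWorker = {
  async fetch(request, env, ctx) {
    const { pathname } = new URL(request.url);
    // No matching route: pass the request straight through to Origin.
    const handler = pickHandler(pathname, routes) ?? ((req) => fetch(req));
    const started = Date.now();
    const response = await handler(request, env, ctx);
    // Stand-in for the real logging call (e.g. a POST to Logflare).
    console.log(JSON.stringify({
      path: pathname,
      status: response.status,
      elapsedMs: Date.now() - started,
    }));
    return response;
  },
};
```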
29 Replies
Cloudflare Docs
Tail Workers · Cloudflare Workers docs
Track and log Workers on invocation by assigning a Tail Worker to your projects.
I actually did look into this as well (forgot to mention in the post). The trouble I ran into with Tail Workers is that they will pass all the request information over to us, but very little response information (only the HTTP status code, but not any other HTTP response headers, including whether it was a cache hit, etc.)
Cloudflare Docs
Tail Handler · Cloudflare Workers docs
The tail() handler is the handler you implement when writing a Tail Worker. Tail Workers can be used to process logs in real-time and send them to a …
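A minimal Tail Worker sketch, to show roughly what arrives in the tail() handler. The field names follow the trace item shape as far as the Tail Workers docs describe it; LOG_ENDPOINT and summarize are placeholders, not part of any Cloudflare or Logflare API.

```javascript
// Placeholder ingest URL, not a real Logflare endpoint.
const LOG_ENDPOINT = "https://logs.example.com/ingest";

// Reduce one trace item to the fields worth shipping.
function summarize(item) {
  const request = item.event?.request;
  const response = item.event?.response;
  return {
    script: item.scriptName,
    outcome: item.outcome,
    method: request?.method,
    url: request?.url,
    // Only the status is available here, not the response headers.
    status: response?.status,
    logs: (item.logs ?? []).map((entry) => entry.message),
  };
}

// In a real Tail Worker this object would be the default export.
const tailWorker = {
  async tail(events) {
    await fetch(LOG_ENDPOINT, {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(events.map(summarize)),
    });
  },
};
```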
You could log those yourself? Not entirely ideal, but requires a good bit less work
I was thinking about that, but feel like I get into that weird chicken & egg issue again. On a route where the Worker handles the entire request, how do I log the "response" from that worker (so, when there's no 'fetch' called within that worker).
I just get stuck trying to come up with a way to "observe" the worker from the outside. Only hacky thing I've seen would be something like, running the existing workers in an entirely different zone / domain, and "fetching" that from the actual zone / domain I want logging on... I guess this is kinda how the Cloudflare Logflare app functioned.
Right before the return, slap a [...]
So, in the case where a Worker doesn't make any "fetch" to the Origin (instead building the response within the worker), this would still log the eventual HTTP response headers?
I might be confusing myself with the flow of things, as trying to think about how the Worker itself knows the eventual cache status (if it is determined later in the pipeline, by cache rules)
Note that if you don't invoke the cache API yourself, then nothing will be cached
Workers bypasses the CDN Cache Stage
Yes, that is correct
You can also add additional context on there if you want/need to
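As a rough sketch of the suggestion above, assuming the thing to add right before the return is a console.log of the response details. headersToObject and the logged field names are made up for illustration. Anything logged this way ends up in the logs a Tail Worker receives for that invocation.

```javascript
// Copy response headers into a plain object so they serialize cleanly.
function headersToObject(headers) {
  const out = {};
  for (const [name, value] of headers) {
    out[name] = value;
  }
  return out;
}

// In a real Worker this object would be the default export.
const worker = {
  async fetch(request, env, ctx) {
    // Response built entirely inside the Worker, no fetch to Origin.
    const response = new Response("hello", {
      headers: { "content-type": "text/plain" },
    });
    // Log what the Worker knows about the response right before returning it.
    console.log(JSON.stringify({
      status: response.status,
      headers: headersToObject(response.headers),
    }));
    return response;
  },
};
```

Headers added by later layers of the CDN are not on the response object yet at this point, so they will not appear in this log line.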
Hmm ok, I might have issues w/ cache control then, I was hoping it would still behave the same if I just added a simple pass-through worker that calls "fetch" on Origin, and then sends the logs over to Logflare... (that may just be another can of worms though) As far as the HTTP response headers, did a quick test and I see the content-type header of the response in the Tail Worker now, however there are other values that make it to the client (web browser) but don't show in the Tail Worker log, e.g. "cf-ray".
I believe the Ray ID should™️ be present in the Request Headers?
Yeah, you are correct (seems to be in both). I just used it as an example though, there are other headers like 'date' that also seem to be missing from the response at the time of the console.log
Yeah, those might be added by higher layers of the CDN unfortunately
Yeah, that's what I was afraid of. Which, I guess that's not really a Tail Worker issue, more of a "trying to gather data from within the Worker when it doesn't exist" issue, ha
So, I think I'm gonna be stuck on that... Unless I can figure out some other way to "wrap" the call to workers. Just seems like there isn't a great replacement for the way Apps were working, even if I tried to build it myself.
Perhaps Snippets + a worker? Haven't tried it, just took a look at the logflare source and the snippets docs out of curiosity. Snippets sounds a lot like "a 'global' [limited] Worker that ran before any other worker in your zone". Something similar to https://developers.cloudflare.com/rules/snippets/examples/debugging-logs/.
@Raylight thank you, this seems very promising! I think I remember seeing snippets once, but it wasn't on my radar at all, so this is great. I did see they are much lower limits than full Workers (5 ms CPU time), but I think that should be ok. From my testing with Workers, the logging was only using ~2ms CPU average. I will definitely give snippets a shot when I get a chance, and report back on results. 😄
Update on the CPU time as well, I noticed some of the 99.9th percentile is above 5 ms, wondering how that will behave w/ the snippet limits (docs aren't clear to me)
@ArmoredCavalry what's happening in your tail worker? I barely hit 5ms, average is 0.9 - 1.5ms and top i've seen is 4.5ms and I do quite some transformations (joining / splitting strings, objects, generating a uuid, etc) on the logs before I send them to axiom.
Here's the snippet, using directly from Logflare (the worker their App runs) - https://github.com/Logflare/cloudflare-app/blob/master/workers/worker.js
I was thinking that it might be possible to let the snippet do the bare minimum to collect the stats and put the rest of the code inside a worker.
@Raylight that's my plan as well, but those stats above are from my "passthrough" worker, that only runs snippet above (and calling Origin), no other Worker logic
There may be some room for performance improvements, I haven't looked closely at the code yet, since so far only been concerned with replacing the functionality one-to-one
@ArmoredCavalry what's the exact IP data that you need? Isn't stuff like city / country / coordinates / region available in the cf object? Wondering why you would need more from ipinfo.io, maybe you can skip that.
Anything w/ the request has been pretty straightforward, as it is available immediately. The bit I'm trying to solve really is things like runtime of the request (previously, as seen in the snippet, logflare wrapped the entire request in a timer), and the full response headers, after Cloudflare has finished processing it.
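The "wrap the entire request in a timer" part can be sketched as a pass-through Worker like the one below. buildLogEvent, sendLog and the event field names are placeholders, not taken from the Logflare worker. It can only record headers that are on the response object at that point.

```javascript
// Build the log event from what the Worker can see at this point.
function buildLogEvent(request, response, elapsedMs) {
  return {
    method: request.method,
    url: request.url,
    status: response.status,
    elapsedMs,
    responseHeaders: Object.fromEntries(response.headers),
  };
}

// Placeholder for the call that ships the event to the logging service.
async function sendLog(event) {
  console.log(JSON.stringify(event));
}

// In a real Worker this object would be the default export.
const passthroughWorker = {
  async fetch(request, env, ctx) {
    const started = Date.now();
    const response = await fetch(request); // goes on to Origin
    const event = buildLogEvent(request, response, Date.now() - started);
    // waitUntil lets the log call finish after the response has been returned.
    ctx.waitUntil(sendLog(event));
    return response;
  },
};
```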
@ArmoredCavalry also makeid might be 'expensive', maybe you could replace that with crypto.randomUUID() or something even cheaper. Edit: actually, I looked at makeid and it's relatively simple.
Wouldn't it be possible to split the logflare code into two parts? The snippet would only do something like
I think that's pretty close to what is already happening? There's not much processing besides actually building the event body to fire over to Logflare. In theory the waitUntil should just be "dead time" at very end, waiting for Logflare server response to return before the script exits, so shouldn't really use any CPU time I'd think.
I'm currently trying out removing the "batching" logic from the code as well, to see if that had an impact on CPU time at all
Just an update, after letting the non-batching version of the worker run a while, seeing improved CPU usage, with 99.9th percentile under 5 ms.
So resource usage should be ok for Snippets, still not sure what happens if the snippet goes over CPU usage and times out (would hope it fails open, but that might prove a security issue for other people if that's the default behavior?) The other thing I'm running into is that it seems like you can't have async processes run after the response is returned w/ snippets like you can with workers, which could be a deal breaker, as I don't want the logflare request to be blocking.
That's kinda why I'm suggesting that you might need both a snippet and a worker. (I.e. the snippet does a fetch to a dedicated worker instead of logflare. The worker can use ctx.waitUntil.)
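A rough sketch of the Worker half of that split, assuming the snippet POSTs a small JSON payload to a dedicated collector Worker. makeCollector and the forward callback are hypothetical names; forward stands for whatever actually sends the event on to Logflare.

```javascript
// forward(body) should return a promise for the outbound log request.
function makeCollector(forward) {
  return {
    async fetch(request, env, ctx) {
      if (request.method !== "POST") {
        return new Response("method not allowed", { status: 405 });
      }
      const body = await request.text();
      // Reply right away; the outbound log request finishes in the background.
      ctx.waitUntil(forward(body).catch(() => {}));
      return new Response("accepted", { status: 202 });
    },
  };
}
```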
Yeah, unfortunately the call to the worker would still be a blocking network request though, right? I could see it maybe being faster (or more consistent timings) than calling 'out' to the logflare endpoint, but still a shame there's no way to do it completely async 😦
hmmm.. you're right. Tested worker to worker (without service bindings) and got like ~10ms median latency.
I also just did a test w/ some heavy CPU function to test what happens when you hit the snippet limit. Definitely seems to give some wiggle room above 5 ms CPU time (which I'd expect). However, if you do hit the CPU limit, it fails closed, so the user will just get an "Error 1102 Worker exceeded resource limits" response. Big issue w/ that is, because it is a 'snippet' and not a full-fledged Worker, as far as I can tell there's zero visibility for the site owner when this happens (since the snippet exits before firing off the logging request).
In theory shouldn't really occur much, if at all, but still makes me nervous. Since entire point of this all is to continue having complete logging / visibility of requests 😦