For the new worker logs, is there a way to explicitly designate which console.log statements should b

For the new worker logs, is there a way to explicitly designate which console.log statements should be captured? I feel like the pricing model and the way this is integrated with console.log statements make it a bit weird to use. I think a way to strike a balance is to have the logger live in runtime.env instead of hijacking console.log, or at least to let us set the logging level at which logs are captured (e.g. only console.error). That way we can have logging for the important things, and if we need normal debugging, the live stream logs would still include all console.log output.
16 Replies
Shados•2mo ago
Hi all, can someone confirm whether logging (both observability and Logpush) is expected to work on preview URLs? I got thrown for a loop today because it does not seem to generate any logs, so I couldn't figure out an exception until I promoted the version to a deploy. Is that currently expected? In Pages we didn't have Logpush, but you could view real-time logs even for previews.
Bao N.•2mo ago
I see a lot of logs missing too, and the filter on the dashboard sometimes works incorrectly.
xochrisxo•2mo ago
Not the last 24 hours
kchro3•2mo ago
Do we know what the timeline is for alerting support? I am considering Datadog as an alternative for now.
rohin•2mo ago
early next year!
rohin•2mo ago
🔈 DX Improvement: We introduced invocation logs to enrich the context we give you for your Workers invocations with zero instrumentation. However, we have heard from many of you that you want the ability to opt out of invocation logs because of a) the added noise for high-volume log producers and b) the billing implications. Starting today, you can disable invocation logs by adding the following to your wrangler.toml:
[observability.logs]
invocation_logs = false
You can also disable Invocation Logs via the Dashboard.
kchro3•2mo ago
That's awesome! If there are any beta testing openings etc., please let me know!
ElddisOne•2mo ago
Hi all, we're on CF Enterprise using Workers to manipulate the streaming HTML. We're looking through the logs and noticed that there are multiple entries for the same request. We're trying to find out how these events relate to the Workers, because the CF documentation is quite thin. We are assuming that the entry with "ParentRayID":"00" is the ingress request, which is assigned the RayID "1234"; it then gets run through a CF Worker, which creates a new ID "5678" and returns that response to the client? We are uncertain because EdgeResponseBytes is high for both entries. Any ideas? I've chopped some attributes out of the JSON objects for brevity.
{
"BotScore":2,
"CacheCacheStatus":"dynamic",
"ClientIP":"66.249.77.39",
"ClientIPClass":"searchEngine",
"EdgeResponseBytes":308840,
"EdgeResponseContentType":"text/html; charset=UTF-8",
"EdgeResponseStatus":200,
"EdgeStartTimestamp":"2024-11-07T14:54:53Z",
"EdgeTimeToFirstByteMs":1037,
"ClientRequestURI":"/123",
"ParentRayID":"1234",
"RayID":"5678"
}
{
"BotScore":1,
"CacheCacheStatus":"dynamic",
"ClientIP":"66.249.77.39",
"ClientIPClass":"searchEngine",
"EdgeResponseBytes":113263,
"EdgeResponseContentType":"text/html; charset=UTF-8",
"EdgeResponseStatus":200,
"EdgeStartTimestamp":"2024-11-07T14:54:53Z",
"EdgeTimeToFirstByteMs":1083,
"ClientRequestURI":"/123",
"BotDetectionTags":[
"seo_imitator",
"googlebot"
],
"ParentRayID":"00",
"RayID":"1234"
}
Walshy•2mo ago
Yeah, so in this case the one on top is a subrequest made from the Worker, and the second one is the parent request. So, a request from Googlebot came in, it fired a Worker (log 2), and that Worker made a subrequest to xyz (log 1). You can match up parent to child by the ParentRayID, like you already mentioned.
ElddisOne•2mo ago
Thanks Walshy, that makes sense. We want to get the RayID that is returned to the client, because we're matching these log entries against a web crawler. For example: Googlebot makes a request -> goes into CF ("RayID: 1234") -> goes into the Worker ("RayID: 5678") -> response returned to Googlebot ("RayID: ?????")
Walshy•2mo ago
The response would be the highest parent: that's the request from the eyeball and the response to the eyeball. So, 1234 in this case.
ElddisOne•2mo ago
OK, great. My example above is just simple HTML manipulation. Another case is that we use Workers to check a redirection database too. In those instances, if the origin server is expected to return a 200 status code and the Worker intercepts it and changes it to a 301 redirect, would the response use the parent request or the Worker request? The end goal is to figure out which log entry we need to keep for some analysis of Googlebot. At the moment, having both entries is causing havoc because it shows that a URL path can be both a 200 and a 301 at the exact same timestamp.
Walshy•2mo ago
Workers are still part of the initial request, so it would be the parent, i.e. the top-level one.
ElddisOne•2mo ago
Nice, so we can assume that if there is no ParentRayID, or the ParentRayID = "00", we keep the entry. All other entries can be discarded.
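The rule above can be sketched as a small filter. This is only an illustration, assuming the log entries have already been parsed from JSON; the `EdgeLogEntry` interface and the `keepTopLevel` name are hypothetical and cover only the fields shown in this thread.

```typescript
// Hypothetical minimal shape of the entries posted above; real entries carry more fields.
interface EdgeLogEntry {
  RayID: string;
  ParentRayID?: string;
  EdgeResponseStatus: number;
  ClientRequestURI: string;
}

// Keep only top-level (eyeball) requests: ParentRayID missing or equal to "00".
// Everything else is treated as a Worker subrequest and dropped.
function keepTopLevel(entries: EdgeLogEntry[]): EdgeLogEntry[] {
  return entries.filter(
    (entry) => entry.ParentRayID === undefined || entry.ParentRayID === "00",
  );
}

// The two entries from the example above, reduced to the relevant fields.
const sampleEntries: EdgeLogEntry[] = [
  { RayID: "5678", ParentRayID: "1234", EdgeResponseStatus: 200, ClientRequestURI: "/123" },
  { RayID: "1234", ParentRayID: "00", EdgeResponseStatus: 200, ClientRequestURI: "/123" },
];

const topLevelEntries = keepTopLevel(sampleEntries);
console.log(topLevelEntries.map((entry) => entry.RayID));
```

With the two sample entries, only the one with RayID "1234" survives, which matches the ID Walshy says is returned to the client.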
Luke•2mo ago
I'm trying to start using worker logs on my project but need a little better control over the sampling. Is there any way that I can programmatically determine whether a request's logs should be sampled or not? For example, I'm invoking 500MM worker requests a month, and each request can log anywhere from 10 to 1000 messages, which adds up very quickly. I don't want to persist debug logs unless there is an error, and I don't want to persist any logs if the request is for a well-known static content endpoint. I was thinking I could just log everything and then use a tail worker to check for the presence of an error log to decide what to send over to worker logs?
I see, thank you! If I were to use a tail worker and just conditionally console.log from there with a 100% sample rate on that tail worker, do you think that would accomplish the same thing?
OK, thanks, I'll give it a shot. With tail workers, do you know if there is any way to pass it additional events/data, or is it just given the results of console.log/fetch/exceptions?
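A rough sketch of the tail worker idea described above: re-emit a request's logs only when that request produced an exception or an error-level log. The `TailItem`, `TailLog`, and `TailException` types are simplified assumptions about the shape of the events a Tail Worker receives, and `selectLogsToPersist` and `tailHandler` are hypothetical names. Whether logs re-emitted this way end up persisted in worker logs is not confirmed in this thread.

```typescript
// Simplified, assumed shapes of the events passed to a Tail Worker's tail() handler.
interface TailLog { level: string; message: unknown[]; timestamp: number; }
interface TailException { name: string; message: string; timestamp: number; }
interface TailItem {
  scriptName: string | null;
  logs: TailLog[];
  exceptions: TailException[];
}

// A request is worth persisting if it threw or logged at error level.
function hadError(item: TailItem): boolean {
  return item.exceptions.length > 0 || item.logs.some((log) => log.level === "error");
}

// Collect every log line (including debug) from the requests that had an error.
function selectLogsToPersist(events: TailItem[]): TailLog[] {
  const selected: TailLog[] = [];
  for (const item of events) {
    if (hadError(item)) selected.push(...item.logs);
  }
  return selected;
}

// In a real Tail Worker this object would be the module's default export.
const tailHandler = {
  async tail(events: TailItem[]): Promise<void> {
    for (const log of selectLogsToPersist(events)) {
      console.log(JSON.stringify(log));
    }
  },
};

// Usage example with two fake requests: one clean, one that logged an error.
const fakeEvents: TailItem[] = [
  { scriptName: "api", logs: [{ level: "debug", message: ["ok"], timestamp: 1 }], exceptions: [] },
  {
    scriptName: "api",
    logs: [
      { level: "debug", message: ["start"], timestamp: 2 },
      { level: "error", message: ["boom"], timestamp: 3 },
    ],
    exceptions: [],
  },
];
const persisted = selectLogsToPersist(fakeEvents);
```

Here only the two log lines from the second request are selected; the clean request's debug line is dropped.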
Thomas Ankcorn•2mo ago
For this scenario it's always much easier to fix it in your code. We do plan on supporting tail sampling, but you should use a logger with log level support 🙂
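For illustration, a logger with log level support could look roughly like the sketch below. It is not a Cloudflare API; `createLogger`, `Level`, and `Sink` are hypothetical names, and the minimum level would typically come from your own configuration.

```typescript
type Level = "debug" | "info" | "warn" | "error";

// Numeric ranks so levels can be compared.
const LEVEL_RANK: Record<Level, number> = { debug: 10, info: 20, warn: 30, error: 40 };

// Where accepted messages go; defaults to the matching console method.
type Sink = (level: Level, message: string) => void;
const consoleSink: Sink = (level, message) => {
  console[level](message);
};

// Returns a logger that drops everything below minLevel before it reaches the sink.
function createLogger(minLevel: Level, sink: Sink = consoleSink) {
  const emit = (level: Level, message: string): void => {
    if (LEVEL_RANK[level] >= LEVEL_RANK[minLevel]) sink(level, message);
  };
  return {
    debug: (message: string) => emit("debug", message),
    info: (message: string) => emit("info", message),
    warn: (message: string) => emit("warn", message),
    error: (message: string) => emit("error", message),
  };
}

// Usage example: with minLevel "error", debug and info calls never reach the sink.
const capturedLines: string[] = [];
const logger = createLogger("error", (level, message) => {
  capturedLines.push(`${level}:${message}`);
});
logger.debug("cache miss");
logger.info("request handled");
logger.error("upstream failed");
```

Because filtered messages never reach console.log, they are not captured or billed in the first place, which is the point of fixing it in code rather than sampling afterwards.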