Cloudflare Developers•6mo ago

For the new worker logs, is there a way to explicitly designate which console.log statements should be captured?

For the new worker logs, is there a way to explicitly designate which console.log statements should be captured? I feel like the pricing model, and the way this is integrated with console.log statements, makes it a bit weird to use. I think a way to strike a balance is to have the logger live in runtime.env instead of hijacking console.log, or at least to be able to set the logging level at which logs are captured (e.g. only console.error). That way we can have logging for the important things, and if we need normal debugging, the live stream logs should still include all console.log output.

16 Replies

Shados•6mo ago

Hi all, can someone confirm whether logging (both observability and Logpush) is expected to work on preview URLs? I got thrown for a loop today because previews do not seem to generate any logs, so I couldn't figure out an exception until I promoted the version to a deploy. Is that currently expected? In Pages we didn't have Logpush, but you could view real-time logs even for previews.

😈 Donkey 💫•6mo ago

I see a lot of logs missing too, and the filter on the dashboard sometimes works incorrectly.

xochrisxo•6mo ago

Not the last 24 hours

kchro3•6mo ago

Do we know the timeline for alerting support? I am considering Datadog as an alternative for now.

rohin•6mo ago

early next year!

rohin•6mo ago

🔈 DX Improvement: We introduced invocation logs to enrich the context we give you for your Workers invocations with zero instrumentation. However, we have heard from many of you that you want the ability to opt out of invocation logs because of a) the added noise for high-volume log producers and b) the billing implications. Starting today, you can disable invocation logs by adding the following to your wrangler.toml:

[observability.logs]
invocation_logs = false

You can also disable Invocation Logs via the Dashboard.

kchro3•6mo ago

that's awesome! if there are any beta testing openings etc, please let me know!

ElddisOne•6mo ago

Hi all, we're on CF Enterprise, using Workers to manipulate streaming HTML. We're looking through the logs and noticed that there are multiple entries for the same request. We're trying to find out how the events relate to the Workers, because the CF documentation is quite thin. We are assuming that "ParentRayID":"00" marks the ingress request, which is assigned the RayID "1234"; that request then gets run through a CF Worker, which creates a new ID "5678" and returns that response to the client? We are uncertain because EdgeResponseBytes is high for both entries. Any ideas? I've chopped some attributes out of the JSON objects for brevity.

  {
    "BotScore":2,
    "CacheCacheStatus":"dynamic",
    "ClientIP":"66.249.77.39",
    "ClientIPClass":"searchEngine",
    "EdgeResponseBytes":308840,
    "EdgeResponseContentType":"text/html; charset=UTF-8",
    "EdgeResponseStatus":200,
    "EdgeStartTimestamp":"2024-11-07T14:54:53Z",
    "EdgeTimeToFirstByteMs":1037,
    "ClientRequestURI":"/123",
    "ParentRayID":"1234",
    "RayID":"5678"
  }
  {
    "BotScore":1,
    "CacheCacheStatus":"dynamic",
    "ClientIP":"66.249.77.39",
    "ClientIPClass":"searchEngine",
    "EdgeResponseBytes":113263,
    "EdgeResponseContentType":"text/html; charset=UTF-8",
    "EdgeResponseStatus":200,
    "EdgeStartTimestamp":"2024-11-07T14:54:53Z",
    "EdgeTimeToFirstByteMs":1083,
    "ClientRequestURI":"/123",
    "BotDetectionTags":[
      "seo_imitator",
      "googlebot"
    ],
    "ParentRayID":"00",
    "RayID":"1234"
  }

Walshy•6mo ago

Yeah, so in this case the one on top is a subrequest made from the Worker and the second one is the parent request. So, a request from Googlebot came in, it fired a Worker (log 2), and that Worker made a subrequest to xyz (log 1). You can match up parent to child by the ParentRayID, like you already mentioned.
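
A minimal TypeScript sketch of the matching Walshy describes: group subrequest entries under the RayID of the request that spawned them. The `HttpLogEntry` interface and `groupByParent` function are hypothetical names, only the fields shown in the thread are declared, and treating "00" (or a missing value) as "no parent" follows this thread rather than any schema checked here.

```typescript
// Hypothetical minimal shape; real log entries carry many more fields.
interface HttpLogEntry {
  RayID: string;
  ParentRayID?: string;
}

// Map each parent RayID to the subrequest entries that reference it.
function groupByParent(entries: HttpLogEntry[]): Map<string, HttpLogEntry[]> {
  const children = new Map<string, HttpLogEntry[]>();
  for (const entry of entries) {
    const parent = entry.ParentRayID;
    // Assumption from the thread: "00" or a missing value means top-level.
    if (!parent || parent === "00") continue;
    const siblings = children.get(parent) ?? [];
    siblings.push(entry);
    children.set(parent, siblings);
  }
  return children;
}

// The two entries from the example above, reduced to their IDs.
const sample: HttpLogEntry[] = [
  { RayID: "5678", ParentRayID: "1234" },
  { RayID: "1234", ParentRayID: "00" },
];
const grouped = groupByParent(sample);
console.log(grouped.get("1234")); // the subrequest entry with RayID "5678"
```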

ElddisOne•6mo ago

Thanks Walshy, that makes sense. We want to get the RayID that is returned to the client, because we're matching the log entries here against a web crawler. For example: Googlebot makes a request -> goes into CF ("RayID: 1234") -> goes into the Worker ("RayID: 5678") -> response returned to Googlebot ("RayID: ?????")

Walshy•6mo ago

The response would be the highest parent; that's the request from the eyeball and the response to the eyeball. So, 1234 in this case.

ElddisOne•6mo ago

Ok great. In my example above, it is just simple HTML manipulation. Another case is that we are also using Workers to check a redirection database. In those instances, if the origin server is expected to return a 200 status code and the Worker intercepts it and changes it to a 301 redirect, would the client-facing result be on the parent request or the Worker request? The end goal is to figure out which log entry we need to keep for some analysis of Googlebot. At the moment, having both entries is causing havoc because it shows that a URL path can be both a 200 and a 301 at the exact same timestamp.

Walshy•6mo ago

Workers are still part of the initial request, so use the parent, i.e. the top-level one.

ElddisOne•6mo ago

Nice, so we can assume that if there is no ParentRayID, or the ParentRayID = "00", then we keep the entry. All other entries can be discarded.
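
The rule above can be sketched as a small filter. This is an illustration under the thread's own assumption (a missing ParentRayID or "00" marks the top-level request), not a confirmed schema; `EdgeLogEntry`, `isTopLevel`, and `keepTopLevel` are hypothetical names, and the sample statuses are made up to mirror the 200-versus-301 case.

```typescript
// Hypothetical minimal shape covering only the fields used here.
interface EdgeLogEntry {
  RayID: string;
  ParentRayID?: string;
  EdgeResponseStatus: number;
}

// Assumption from the thread: no ParentRayID, or "00", means the eyeball request.
function isTopLevel(entry: EdgeLogEntry): boolean {
  return !entry.ParentRayID || entry.ParentRayID === "00";
}

// Drop Worker subrequests so each client request appears once.
function keepTopLevel(entries: EdgeLogEntry[]): EdgeLogEntry[] {
  return entries.filter(isTopLevel);
}

// Made-up data: the origin subrequest returned 200, the client saw a 301.
const entries: EdgeLogEntry[] = [
  { RayID: "5678", ParentRayID: "1234", EdgeResponseStatus: 200 },
  { RayID: "1234", ParentRayID: "00", EdgeResponseStatus: 301 },
];
console.log(keepTopLevel(entries).map((e) => e.EdgeResponseStatus));
```

Filtering this way leaves one status per URL path and timestamp, which is the one the crawler actually received.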

Luke•6mo ago

I'm trying to start using worker logs on my project but need a little better control over the sampling. Is there any way that I can programmatically determine whether a request's logs should be sampled or not? For example, I'm invoking 500MM worker requests a month and each request can log anywhere from 10 to 1000 messages, which adds up very quickly. I don't want to persist debug logs unless there is an error. Or I don't want to persist any logs if it's for a well-known static content endpoint. I was thinking I could just log everything and then use a tail worker to check for the presence of an error log to decide what to send over to worker logs?

I see, thank you! If I were to use a tail worker and just conditionally console.log from there, with a 100% sample rate on that tail worker, do you think that would accomplish the same thing?

Ok thanks, I'll give it a shot. With tail workers, do you know if there is any way to pass it additional events/data, or is it just given the results of console.log/fetch/exceptions?
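
A rough sketch of the "only keep logs when something went wrong" idea, written as plain functions so it can be run anywhere. The `TraceItemLike` and `TraceLogLike` interfaces are simplified local stand-ins modelled on what a Tail Worker receives, not the official types, and `hadError`, `logsToPersist`, and `tailWorker` are hypothetical names. Whether console.log output from a Tail Worker ends up in Workers Logs is the open question in this thread, so treat that part as an assumption.

```typescript
// Simplified local shapes; only the fields used below are declared.
interface TraceLogLike {
  level: string;
  message: unknown;
}
interface TraceItemLike {
  outcome: string;
  logs: TraceLogLike[];
  exceptions: { name: string; message: string }[];
}

// True when the invocation threw or logged at error level.
function hadError(item: TraceItemLike): boolean {
  return (
    item.outcome === "exception" ||
    item.exceptions.length > 0 ||
    item.logs.some((log) => log.level === "error")
  );
}

// Keep every log line of a failed invocation, and nothing from clean ones.
function logsToPersist(items: TraceItemLike[]): TraceLogLike[] {
  const kept: TraceLogLike[] = [];
  for (const item of items) {
    if (hadError(item)) kept.push(...item.logs);
  }
  return kept;
}

// Hypothetical handler object; in a deployed Tail Worker this would be
// the module's default export.
const tailWorker = {
  async tail(items: TraceItemLike[]): Promise<void> {
    for (const log of logsToPersist(items)) {
      console.log(JSON.stringify(log));
    }
  },
};
```

Skipping well-known static endpoints would need the request URL from the trace item as well, which this sketch leaves out.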

Thomas Ankcorn•6mo ago

For this scenario it's always much easier to fix it in your code. We do plan on supporting tail sampling, but you should use a logger with log level support 🙂
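
For illustration, a logger with log level support can be very small. This is a sketch, not a recommendation of any particular library: `createLogger`, `Level`, and the `sink` parameter are hypothetical names, and in a Worker the minimum level would presumably come from a variable of your own choosing (for example a `LOG_LEVEL` binding, which is an assumption here).

```typescript
type Level = "debug" | "info" | "warn" | "error";

// Higher number means more severe.
const severity: Record<Level, number> = { debug: 10, info: 20, warn: 30, error: 40 };

// Default sink: forward to the matching console method.
function defaultSink(level: Level, args: unknown[]): void {
  console[level](...args);
}

// Build a logger that drops anything below minLevel before it reaches the sink.
function createLogger(
  minLevel: Level,
  sink: (level: Level, args: unknown[]) => void = defaultSink,
) {
  const at = (level: Level) => (...args: unknown[]): void => {
    if (severity[level] >= severity[minLevel]) sink(level, args);
  };
  return { debug: at("debug"), info: at("info"), warn: at("warn"), error: at("error") };
}

// Usage: only warn and error reach console, so only they would be captured.
const logger = createLogger("warn");
logger.debug("cache lookup", { key: "abc" }); // dropped
logger.error("redirect lookup failed"); // emitted via console.error
```

Because filtered calls never reach console at all, they add nothing to the captured log volume.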
