Worker log files randomly include batches of entries with timestamps hours or days earlier

When fetching logs through the API, we often see hundreds of entries with timestamps earlier than the time period we requested.
On investigation, we found that the individual files in our log buckets (each nominally holding the items for about 1 minute) sometimes include batches of items that are stamped with times that are way outside the file's supposed range. Sometimes 100 or more such items in a row. Notably, the time stamps of these items typically proceed backwards - for example, the first out-of-sequence item is 30 minutes before the file's nominal start, and the hundredth item in the batch is 120 minutes before. On the scale of a whole day - with hundreds of thousands of log items in total - a thousand or fewer being out of sequence is usually of little consequence. But when fetching the logs from a narrower time range to diagnose a user's problem, the fact that some items are being filed away in entirely the wrong part of the bucket means that we can never be sure that a fetch has retrieved every item that falls within the requested range. This can make diagnosis extremely tricky. Is this a known issue? A known feature of logpush?
1 Reply
thelunz
thelunzOP3mo ago
Further information: Because some of our log items include a printed timestamp, we can see that these items, at least, truly belong to the instant identified by their timestamp property. It's not that the timestamps are somehow being overwritten haphazardly at the moment of writing to file. That suggests that the items are being held up somewhere - sometimes for under an hour, sometimes apparently for over a day - before being released in a batch and being thrown into (and usually at the end of) whatever 1-minute log file is about to be written.

Did you find this page helpful?