Cron job stuck running...
For some reason my cron job is stuck. New executions haven't run for the last 3 hours (5 missed invocations).
I have attached screenshots of the cron runs tab and the deployment logs.
The cron job itself is super simple: it runs every 30 minutes and decides, based on application code, whether it should make a fetch request at the appropriate time.
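For readers following along, a job of the shape described above might look roughly like the sketch below. This is not the poster's code: the function names, the "target hour" rule, and the injected `send` callback are all assumptions made for illustration. The point is only that each invocation decides, optionally sends one request, and then returns so the process can exit.

```typescript
// Hypothetical sketch: names and the target-hour rule are assumptions, not the poster's code.

// Pure gate: the job wakes every 30 minutes, but only the run that lands in the
// first half of the target hour (UTC) should send the request.
function shouldFetch(now: Date, targetHourUtc: number): boolean {
  return now.getUTCHours() === targetHourUtc && now.getUTCMinutes() < 30;
}

// One cron invocation: decide, maybe send, then return so the process can exit.
// `send` is injected so the gate can be exercised without a network call.
async function runOnce(
  now: Date,
  targetHourUtc: number,
  send: () => Promise<number>,
): Promise<"skipped" | "sent"> {
  if (!shouldFetch(now, targetHourUtc)) {
    return "skipped";
  }
  const status = await send();
  console.log(`request finished with status ${status}`);
  return "sent";
}
```

In a real job, `send` would wrap the actual fetch call and resolve to the response status. Keeping the gate pure makes it easy to check which of the 30-minute wake-ups would actually fire.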
18 Replies
Project ID:
7440ea60-c734-47cc-ba5d-e8cabf480488
Going back through the logs I just noticed that it did this previously, from 9/17 3:30pm till today 9/26 5:00am which is nearly 9 days of missed executions. Thankfully this isn't in production yet but it will go live in a few months and I was hoping for it to be quite reliable.
What happens when you attempt to revert it?
To the old logic
There should be a feature flag within the service that you can enable that will restore the old cron logic
Something weird was happening: the interface was prompting me to manually approve deployments. All of the deployments that hung between 9/17 and 9/26 were flagged as coming from an outside contributor (but they didn't, it was just me)
I recreated the service and things seem to be running smoothly right now
The issue I am now facing is that certain executions of the cron job seem to take roughly 30 minutes. I have attached 3 examples (all occurred in the last 24hrs) where the cron job took a long time. Almost all executions are under 10s, most 5s or less.
Sorry, how is this taking 30 minutes to run?
These look like multiple different runs
I don't know why, or why the logs look that way. They do look like multiple runs to me as well, but that's the logs from a single execution
Oh, crons now run multiple times per deployment instead of creating new deployments
You'll want to check the "Cron runs" tab
And that will show you the logs for each "run"
That's where I got those logs from, the cron runs tab.
Sorry, can you link me your project?
Cause, your logs look correct. I'm assuming you're running this every 30 minutes
Oh sorry, found it above
Looks correct?
Oh I see what you mean. That time is definitely incorrect...
But it looks like it's spanning multiple executions, and each run is running correctly
PRO-3158 - Incorrect cron window for some crons
For some crons, it looks like we're not getting the exit event, and then setting it to the next execution time period. This makes the logs look like the cron is running twice
Status: Triage
Product
But ye this is cosmetic BTW
That's what I figured. Something about the presentation seems to be sticking runs and their logs together, since the displayed duration looks like the 30-minute interval plus another execution
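To make that arithmetic concrete, here is a toy model of the symptom. It is only a sketch: the `CronRun` shape and both helper functions are hypothetical and say nothing about how the platform stores runs. It assumes runs start on the 30-minute schedule, so a displayed duration of one interval or more suggests the run absorbed the next run's window.

```typescript
// Toy model only: field names and rules are assumptions, not the platform's implementation.
interface CronRun {
  startedAt: number; // epoch milliseconds
  endedAt: number; // epoch milliseconds, as displayed for the run
}

const INTERVAL_MS = 30 * 60 * 1000; // the 30-minute schedule from this thread

// A run whose displayed duration is at least one full interval most likely
// absorbed the following run's window rather than truly running that long.
function looksMerged(run: CronRun, intervalMs: number): boolean {
  return run.endedAt - run.startedAt >= intervalMs;
}

// Strip whole intervals from the displayed duration. For a merged run, what
// remains is roughly the duration of the later execution whose exit was recorded.
function estimatedDurationMs(run: CronRun, intervalMs: number): number {
  return (run.endedAt - run.startedAt) % intervalMs;
}
```

Under these assumptions, a run displayed as 30 minutes and 5 seconds would be flagged as merged, with an estimated real duration of about 5 seconds, which matches the "almost all executions are under 10s" observation earlier in the thread.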
Thank you for investigating
Yep. I think this is the bug we saw where, rarely, an event just won't come through for the "end"
We'll fix that
Heads up, I'm 99% certain I fixed this issue in the last 24h. I see your last occurrence was Oct 3rd at 1:30am PST
PLEASE let us know if you see this again!
Great work, thanks Cooper!
Hey don't thank me yet! We gotta validate it over a while
It looked like it was getting that 30 minute "time" once per day, so imma check over the next few days
But again plz do let me know if you see anything wonky