Railway • 2mo ago
Xevion

Cron Missed Completely, No Logs

I set up a cron application yesterday. It fired correctly at 6 AM CST on the first night, but did not fire again last night.
[Image: no description]
31 Replies
Percy • 2mo ago
Project ID: 3c0425aa-9fe4-449a-a161-8a1efb3b53ee
Xevion • 2mo ago
3c0425aa-9fe4-449a-a161-8a1efb3b53ee
The schedule is 0 11 * * *, which means every day at 11:00 AM UTC. Logs show the last invocation was on June 1st at 6:01 AM CST (correct, although 1 minute late?). The next invocation should have been June 2nd at 6:00 AM CST, but there is no activity, no logs, and no invocation on the dashboard. I only found out about this because my cron monitor raised an issue.
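For readers mapping a UTC schedule like `0 11 * * *` to a local clock time, here is a minimal sketch. The helper name `localClockTime`, the `America/Chicago` zone, and the 2024 dates are assumptions for illustration and do not come from the thread. It relies on the standard `Intl.DateTimeFormat` API and shows that a fixed UTC time lands on a different local hour in summer than in winter.

```typescript
// Hypothetical helper: format a UTC instant as HH:MM in a given IANA time zone.
function localClockTime(instant: Date, timeZone: string): string {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    hour: "2-digit",
    minute: "2-digit",
    hour12: false,
  }).formatToParts(instant);
  // Pull a numeric field out of the formatted parts.
  const get = (type: string): number =>
    Number(parts.find((p) => p.type === type)?.value ?? "0");
  // Some engines report midnight as 24 when hour12 is false; normalize to 0-23.
  const hour = get("hour") % 24;
  const minute = get("minute");
  return `${String(hour).padStart(2, "0")}:${String(minute).padStart(2, "0")}`;
}

// 11:00 UTC on a summer day and on a winter day (illustrative dates).
const summerRun = localClockTime(new Date("2024-06-02T11:00:00Z"), "America/Chicago");
const winterRun = localClockTime(new Date("2024-01-02T11:00:00Z"), "America/Chicago");
console.log(summerRun, winterRun);
```

Under these assumptions the June run falls at 06:00 local time, which matches the invocation time reported in the logs, while the same schedule in January would fire one hour earlier on the local clock.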
Xevion • 2mo ago
The history below doesn't show anything interesting, if you're curious.
[Image: no description]
Brody • 2mo ago
from my understanding, there are so many jobs being run at 11am utc that some get skipped. until the team addresses this i would recommend switching to an in-code scheduler
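As a rough idea of what an in-code scheduler can look like without any dependency, here is a minimal sketch built on `setTimeout`. The names `msUntilNextUtc` and `scheduleDailyUtc` are hypothetical, and this is not the node cron setup mentioned in the thread. It assumes the process stays running between invocations, which is exactly the resident-memory cost a platform cron avoids.

```typescript
// Milliseconds from `now` until the next occurrence of hourUtc:minuteUtc (UTC).
function msUntilNextUtc(now: Date, hourUtc: number, minuteUtc: number): number {
  const next = new Date(
    Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate(), hourUtc, minuteUtc),
  );
  // If today's slot has already passed (or is right now), use tomorrow's.
  if (next.getTime() <= now.getTime()) {
    next.setUTCDate(next.getUTCDate() + 1);
  }
  return next.getTime() - now.getTime();
}

// Run `job` once a day at the given UTC time; returns a function that stops the schedule.
function scheduleDailyUtc(
  hourUtc: number,
  minuteUtc: number,
  job: () => Promise<void>,
): () => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  let stopped = false;

  const arm = (): void => {
    timer = setTimeout(async () => {
      try {
        await job();
      } catch (err) {
        // A failed run should not kill the schedule.
        console.error("scheduled job failed:", err);
      }
      if (!stopped) arm(); // re-arm for the next day
    }, msUntilNextUtc(new Date(), hourUtc, minuteUtc));
  };

  arm();
  return () => {
    stopped = true;
    if (timer !== undefined) clearTimeout(timer);
  };
}
```

A call such as `scheduleDailyUtc(11, 0, runBackup)` (with `runBackup` standing in for the actual job) would mirror the `0 11 * * *` schedule; recomputing the delay after every run keeps the timer from drifting.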
Xevion • 2mo ago
are you serious? i literally just undid the node cron scheduler because it was consuming 50 MB memory constantly and i thought it'd be fun to move away from that
not mad at you, just... that's pretty sucky
Brody • 2mo ago
i feel you, you could also try another time? 10:30am utc?
Xevion • 2mo ago
Yeah, I was thinking that some weird off-peak time would be less likely to incur issues. Something like XX:48.
The whole job is done in 5 seconds usually.
Brody • 2mo ago
yep you got the right idea
Xevion • 2mo ago
Gonna try 10:48 UTC and see what happens.
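For reference, moving the job from 11:00 UTC to 10:48 UTC only changes the first two fields of the five-field cron expression (minute, hour, day of month, month, day of week). The helper `dailyCron` below is a hypothetical illustration of that field order, not part of any Railway tooling.

```typescript
// Hypothetical helper: build a five-field cron expression for a daily run at a UTC time.
function dailyCron(hourUtc: number, minute: number): string {
  if (!Number.isInteger(hourUtc) || hourUtc < 0 || hourUtc > 23) {
    throw new RangeError(`hour out of range: ${hourUtc}`);
  }
  if (!Number.isInteger(minute) || minute < 0 || minute > 59) {
    throw new RangeError(`minute out of range: ${minute}`);
  }
  // Field order: minute hour day-of-month month day-of-week
  return `${minute} ${hourUtc} * * *`;
}

const originalSchedule = dailyCron(11, 0); // the schedule quoted earlier in the thread
const offPeakSchedule = dailyCron(10, 48); // the off-peak time being tried here
console.log(originalSchedule, "->", offPeakSchedule);
```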
Brody • 2mo ago
sounds good! i have also shared this thread in a thread i have with cooper for gathering cron issues like yours
Xevion • 2mo ago
Alrighty; just my thought: a little detail about this being skipped, or likely to be skipped, or some transparency on the issues with cron would be nice. I don't mind that Railway's platform is in need of improvement, but letting users explore until they hit a landmine isn't ideal.
Brody • 2mo ago
i assume they never thought they would be over-scheduled, so they never designed error handling and the ui around it
Xevion • 2mo ago
Kinda an interesting problem to think about in retrospect.
Brody • 2mo ago
ideally the only issues that you could get out of a cron job would be an issue with the build or deploy
Xevion • 2mo ago
Maybe a check-mark that says "I'm okay with this being rescheduled slightly" would be good. Crons that are more important could be charged at a higher rate and prioritized on runners or whatever.
If you have 600 jobs every day at 11AM UTC, spinning up tons of machines to work on them is not exactly smart. Especially when most of them are tiny jobs. Working in bursts would be better. And working early + late, like queueing. Start at 10:58 or even earlier to start executing.
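The "start early, finish late" idea above can be sketched as a deterministic spread: each job that tolerates rescheduling gets an offset inside a window around its nominal time, derived from a hash of its ID. Everything here is hypothetical (`hash32`, `spreadStart`, the window sizes, the job IDs); it illustrates the suggestion and says nothing about how Railway's scheduler actually works.

```typescript
// FNV-1a 32-bit hash: cheap and stable, so a given job always lands in the same slot.
function hash32(s: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return h >>> 0; // force an unsigned 32-bit result
}

// Pick a start time in [dueMs - earlyMs, dueMs + lateMs] for a job that allows rescheduling.
function spreadStart(jobId: string, dueMs: number, earlyMs: number, lateMs: number): number {
  const windowMs = earlyMs + lateMs;
  const offset = hash32(jobId) % (windowMs + 1);
  return dueMs - earlyMs + offset;
}

// Example: jobs nominally due at 11:00 UTC, allowed to start 2 min early or up to 5 min late.
const due = Date.UTC(2024, 5, 2, 11, 0); // illustrative date
for (const id of ["backup-a", "backup-b", "report-c"]) {
  console.log(id, new Date(spreadStart(id, due, 2 * 60_000, 5 * 60_000)).toISOString());
}
```

Hashing rather than drawing a random offset keeps each job's start time stable from day to day, which makes a missed run easier to notice.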
Brody • 2mo ago
im sure they have more than 600 at 11am utc, and if i recall correctly, it's only a single scheduler on their backplane
Xevion • 2mo ago
aha i really have no idea about the scale railway works at tbh
Brody • 2mo ago
i dont really either, im just going off the crumbs they give us
i mean they do tell us a fair bit, but more info can never hurt in our position of community help
Xevion • 4w ago
@Brody
[Image: no description]
Xevion • 4w ago
This is pretty sucky as crons go. I'm not sure what's going on, actually; I cannot tell if Railway is at fault here.
Never mind, seems like something with Sentry is going wrong?
Error while running backup: AxiosError: connect EHOSTUNREACH 34.120.195.249:443
Brody • 4w ago
host unreachable eh? you aren't the first person to see this error even after they resolved the incident
are you on the legacy or v2 runtime? check your service settings, if legacy, switch it to v2
Xevion • 4w ago
Got it, switched it to V2. Didn't know that was a setting lol.
Brody • 4w ago
just for clarity, the v2 runtime has been confirmed to fix host unreachable, but it has no impact on cron being skipped since that's a completely different system
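Independent of the runtime switch, a job can guard against transient network errors such as `EHOSTUNREACH` with a small retry wrapper. This is a minimal sketch under assumptions: the names `isTransient` and `withRetry`, the set of codes, and the delays are all hypothetical, and it assumes the thrown error exposes a Node-style `code` string. It is a mitigation inside the job only and would not address skipped invocations.

```typescript
// Error codes treated as transient in this sketch (an assumption, adjust as needed).
const TRANSIENT_CODES = new Set(["EHOSTUNREACH", "ECONNRESET", "ETIMEDOUT"]);

// True if the error carries one of the transient codes above.
function isTransient(err: unknown): boolean {
  const code = (err as { code?: unknown } | null | undefined)?.code;
  return typeof code === "string" && TRANSIENT_CODES.has(code);
}

// Retry `task` with exponential backoff, but only for transient errors.
async function withRetry<T>(
  task: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      // Give up on non-transient errors or when out of attempts.
      if (!isTransient(err) || attempt === attempts - 1) throw err;
      // Wait 500 ms, 1 s, 2 s, ... before the next attempt.
      await new Promise<void>((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
    }
  }
  throw lastError; // only reached if attempts <= 0
}
```

The backup call would then be wrapped as `await withRetry(() => runBackup())`, where `runBackup` stands in for the job's own function.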
Xevion • 4w ago
👍
Xevion • 3w ago
Working good!
[Image: no description]
Xevion • 3w ago
No failures since.
Brody • 3w ago
they made changes to the cron scheduler too
Xevion • 7d ago
[Image: no description]
Brody • 7d ago
the changes they made did not help 😦 but thank you for trying
Xevion • 7d ago
i mean i guess at least it happened the same day? lol very odd
Brody • 7d ago
lol back to in-code