I’ve been testing out long-running

I’ve been testing out long-running workflows and wanted to share some observations: - Testing persistence outside of an execution loop: my main goal was to see if workflows keep their state properly when running in a loop (storing a reference of a step result outside of a loop) with longer sleep intervals (around 30 minutes). This seemed to work - state reference was not lost even when it most definitely hibernated during those long waits. - Switching versions mid-sleep: during the second sleep cycle, I deployed a new version of the workflow. Surprisingly, it moved the execution to the new version instead of sticking with the original one and even disregarded the current active sleep cycle. Shouldn’t it use the version it was invoked with? (source) - Dashboard wall time: the workflow mostly sat in sleep mode, but the dashboard showed a total wall time of almost 64 minutes, which doesn’t seem right given the actual activity. Adding more details here in a thread.
6 Replies
Jani
JaniOP3mo ago
Workflow (based on workflows-starter):
type State = {
lastExecution: number;
count: number;
};

export class TestWorkflow extends WorkflowEntrypoint<Env, Params> {
override async run(_event: WorkflowEvent<Params>, step: WorkflowStep) {
let state: State | undefined = undefined;
while (true) {
state = await step.do('do something', async () => {
const result = {
lastExecution: Date.now(),
count: (state?.count ?? 0) + 1,
};
console.log(result);
return result;
});

if (state.count >= 10) {
break;
}

await step.sleep('sleep', '30 minutes');
}
return state;
}
}
type State = {
lastExecution: number;
count: number;
};

export class TestWorkflow extends WorkflowEntrypoint<Env, Params> {
override async run(_event: WorkflowEvent<Params>, step: WorkflowStep) {
let state: State | undefined = undefined;
while (true) {
state = await step.do('do something', async () => {
const result = {
lastExecution: Date.now(),
count: (state?.count ?? 0) + 1,
};
console.log(result);
return result;
});

if (state.count >= 10) {
break;
}

await step.sleep('sleep', '30 minutes');
}
return state;
}
}
Jani
JaniOP3mo ago
Attaching the step history of a workflow that exited sleep after a new deployment and shifted to the updated version for the execution. Instance id: d5fa8dfa-0ed9-40f3-929e-496b00068b9d Let me know if you need anything else
No description
Jani
JaniOP3mo ago
The updated workflow reduced the sleep cycle to 1min - that should explain different start & end times above
Diogo Ferreira
Diogo Ferreira3mo ago
@Jani, tahnk you for the feedback! Happy to get confirmation from you that long sleep cycles work as expected. Regarding the version switching mid-sleep, we have identified a bug, when, at the moment of the update, the Workflow is on a sleep step (if in a do step, the original version will stick). Regarding the Wall time in the dashboard, notice that it refers to the "elapsed time in milliseconds between the start of a invocation, and when the runtime determines that no more JavaScript needs to run". Actual activity/work is measured in CPU time. For reference: https://developers.cloudflare.com/workers/observability/metrics-and-analytics/#wall-time-per-execution
Jani
JaniOP3mo ago
Good to know, definitely seemed like a bug. Ahh ok makes sense Thanks for getting back to me
Diogo Ferreira
Diogo Ferreira3mo ago
Any time!

Did you find this page helpful?