Skip to content

Commit ca1bc83

Browse files
fix(table): don't let parallel queued stamp overwrite a worker that already started
Stamps fire in chunks of 20 via Promise.all, so queued writes race with the worker's markWorkflowGroupPickedUp (running). When the late queued stamp landed second it overwrote running, and the cell looked stuck in queued for the rest of the run. Skip the stamp when the same execution is already past queued — the worker's authority wins.
1 parent 38cbefd commit ca1bc83

1 file changed

Lines changed: 19 additions & 9 deletions

File tree

apps/sim/lib/table/cell-write.ts

Lines changed: 19 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -56,15 +56,15 @@ export async function writeWorkflowGroupState(
5656
}
5757
const current = row.executions?.[groupId] as RowExecutionMetadata | undefined
5858
// Stale-worker guard: only blocks writes FROM an old worker (status =
59-
// running / completed / error / pending). A `queued` stamp is the scheduler
60-
// claiming the cell for a brand-new run — the new executionId is supposed
61-
// to overwrite whatever was there. Same for `cancelled` (authoritative).
62-
// Without this carve-out, the new run's stamp gets rejected and the cell
63-
// is stuck in its old state forever.
64-
const isAuthoritativeNewStamp =
65-
payload.executionState.status === 'queued' || payload.executionState.status === 'cancelled'
59+
// running / completed / error / pending). A `queued` stamp from the
60+
// scheduler can claim the cell for a brand-new run — that's the new
61+
// authority. Same for `cancelled` (always authoritative, written by stop).
62+
const isCancelStamp = payload.executionState.status === 'cancelled'
63+
const isQueuedStamp = payload.executionState.status === 'queued'
64+
const isNewQueuedStamp = isQueuedStamp && current?.executionId !== executionId
65+
const bypassStaleWorker = isNewQueuedStamp || isCancelStamp
6666
if (
67-
!isAuthoritativeNewStamp &&
67+
!bypassStaleWorker &&
6868
current &&
6969
current.executionId &&
7070
current.executionId !== executionId
@@ -74,6 +74,16 @@ export async function writeWorkflowGroupState(
7474
)
7575
return 'skipped'
7676
}
77+
// A late `queued` stamp for the SAME run that's already moved past queued
78+
// (worker called markWorkflowGroupPickedUp before our parallel stamp landed)
79+
// must NOT overwrite the further-along state. Without this, a cell can show
80+
// "queued" forever while the worker is actually running.
81+
if (isQueuedStamp && current?.executionId === executionId && current.status !== 'pending') {
82+
logger.info(
83+
`Skipping queued stamp — same run already at status=${current.status} (table=${tableId} row=${rowId} group=${groupId} executionId=${executionId})`
84+
)
85+
return 'skipped'
86+
}
7787
if (
7888
current?.status === 'cancelled' &&
7989
current.executionId === executionId &&
@@ -89,7 +99,7 @@ export async function writeWorkflowGroupState(
8999
// stamps from the scheduler also bypass — they ARE the new authority. Cell-
90100
// task writes (running/completed/error) get the SQL guard so an in-flight
91101
// partial can't clobber a stop click or a newer run that already committed.
92-
const cancellationGuard = isAuthoritativeNewStamp ? undefined : { groupId, executionId }
102+
const cancellationGuard = bypassStaleWorker ? undefined : { groupId, executionId }
93103
const result = await updateRow(
94104
{
95105
tableId,

0 commit comments

Comments
 (0)