fix(web): write PROCESSING status to DB before triggering transcription workflow #1832
oaris-dev wants to merge 1 commit
…on workflow

Prevents the share page polling loop (`/s/<videoId>` every 2s) from re-firing `start(transcribeVideoWorkflow, ...)` concurrently. Without this write, every poll passes the `!video.transcriptionStatus` check and queues another workflow dispatch. After a few concurrent enqueues, `@workflow/world-local`'s queue races on its internal buffers and detaches the underlying ArrayBuffer, crashing dispatch before any workflow step executes.

By writing PROCESSING to the DB synchronously *after* the early-return check in `transcribeVideo` but *before* the `start()` call, the next poll's call to `transcribeVideo` hits the early-return and never re-fires `start()`. The first dispatch completes normally.

Fixes CapSoftware#1550.

Verified end-to-end by @julianwitzel on a self-hosted Docker deployment that was previously stuck in the ArrayBuffer crash loop.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pull request overview
This PR aims to prevent duplicate transcription workflow dispatches from the share page's 2-second polling loop by persisting `transcriptionStatus = "PROCESSING"` to the DB immediately before calling `start(transcribeVideoWorkflow, ...)` in `transcribeVideo()`.

Changes:
- Writes `transcriptionStatus: "PROCESSING"` synchronously in `transcribeVideo()` right before triggering the transcription workflow.
    await db()
      .update(videos)
      .set({ transcriptionStatus: "PROCESSING" })
      .where(eq(videos.id, videoId));
FYI: opened #1833 alongside this PR. When end-to-end validating this fix against a clean upstream build, I found a second blocker on self-hosted Docker. Both fixes are needed for self-hosted transcription to actually work end-to-end. They're independent code paths in different files, so I split them into two PRs for clarity, but I'm happy to fold them into one if you prefer.
Summary
Fixes #1550. Writes `transcriptionStatus = 'PROCESSING'` to the DB synchronously after `transcribeVideo()`'s existing early-return check but before the `start(transcribeVideoWorkflow, ...)` call. This prevents the share page's 2-second polling loop from re-firing `start()` concurrently, which races `@workflow/world-local`'s queue → ArrayBuffer detachment → queue crash before any step executes.

Root cause

The `/s/<videoId>` page polls `getVideoStatus()` every 2 seconds. Upstream's flow:

1. `video.transcriptionStatus === null` → calls `transcribeVideo()` → calls `start(transcribeVideoWorkflow, ...)` → workflow dispatched asynchronously
2. `video.transcriptionStatus === null` still (the workflow's `validateVideo` step hasn't written PROCESSING yet) → calls `transcribeVideo()` again → calls `start()` again

After a few concurrent `start()` calls, `@workflow/world-local`'s queue races on its internal buffer handling, the underlying `ArrayBuffer` gets detached, and dispatch crashes with `TypeError: Cannot perform ArrayBuffer.prototype.slice on a detached ArrayBuffer` before any workflow step runs. The crash happens upstream of the step boundary, which is why affected users see no `/audio/extract` calls in their media-server logs (per @julianwitzel's investigation in #1550).

The fix

`transcribeVideo()` already has an early-return guard for `transcriptionStatus === 'PROCESSING'` (lines 89–99). The problem is that nothing writes PROCESSING synchronously: the workflow's own step writes it, but only after the queue dispatches. If the queue race fires before the step runs, PROCESSING never gets written and the next poll triggers another `start()`.

By writing PROCESSING to the DB inside `transcribeVideo()` right before the `start()` call, the next poll's call hits the early-return at line 89 and never re-fires `start()`. The first dispatch completes normally.

5-line change in `apps/web/lib/transcribe.ts`.

Verification

- `[transcribe] Probe result: audioCodec=aac, audioChannels=1, sampleRate=48000` ✓
- `[transcribe] Extracted audio: 179860 bytes` ✓ (first time `/audio/extract` ever succeeded in their setup)
- `transcriptionStatus → COMPLETE` ✓
- No `[local world] Queue operation failed: ArrayBuffer detached` loop

Test plan

- `DEEPGRAM_API_KEY` configured (verified by @julianwitzel)
- `/s/<videoId>` page, confirm `transcriptionStatus` transitions `null` → `PROCESSING` → `COMPLETE` exactly once (no re-trigger loop)
- `/video/probe` and `/audio/extract` calls
- `cap-web` logs

Notes
This supersedes #1630 (which proposed bypassing the workflow entirely for transcription/AI generation). The bypass approach was overkill — this 5-line guard is the actual minimal fix.
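The guard can be illustrated with a small in-memory simulation. This is only a sketch: `makePollSimulation`, `queuedSteps`, and the other names are hypothetical stand-ins for the real DB, `transcribeVideo()`, and `start()`, and the model assumes polls arrive before the workflow's own step has run.

```typescript
// Hypothetical in-memory model of the share-page polling loop. None of these
// names come from the codebase except the ones quoted in the PR text.
type TranscriptionStatus = null | "PROCESSING" | "COMPLETE";

interface PollSimulation {
  poll(): void;
  runQueuedSteps(): void;
  dispatchCount(): number;
  status(): TranscriptionStatus;
}

function makePollSimulation(writeProcessingBeforeStart: boolean): PollSimulation {
  let status: TranscriptionStatus = null;
  let dispatches = 0;
  const queuedSteps: Array<() => void> = [];

  // Stand-in for start(transcribeVideoWorkflow, ...): dispatch is asynchronous,
  // so the workflow's own PROCESSING write only happens once its step runs.
  const start = (): void => {
    dispatches += 1;
    queuedSteps.push(() => {
      status = "PROCESSING";
    });
  };

  const transcribeVideo = (): void => {
    if (status === "PROCESSING") return; // existing early-return guard
    if (writeProcessingBeforeStart) status = "PROCESSING"; // the write this PR adds
    try {
      start();
    } catch {
      status = null; // mirrors the catch-block reset so a later poll can retry
    }
  };

  return {
    // One poll of getVideoStatus(): trigger transcription only when unset.
    poll: () => {
      if (!status) transcribeVideo();
    },
    // Lets the queued workflow steps execute (the late PROCESSING write).
    runQueuedSteps: () => {
      while (queuedSteps.length > 0) queuedSteps.shift()!();
    },
    dispatchCount: () => dispatches,
    status: () => status,
  };
}

const withoutWrite = makePollSimulation(false);
const withWrite = makePollSimulation(true);
for (let i = 0; i < 3; i++) {
  withoutWrite.poll();
  withWrite.poll();
}
console.log(withoutWrite.dispatchCount(), withWrite.dispatchCount()); // prints "3 1"
```

Without the synchronous write, every poll that lands before the queued step runs produces another dispatch; with it, only the first poll dispatches.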
🤖 Generated with Claude Code
Greptile Summary
This PR fixes a concurrency bug where the share page's 2-second polling loop could fire `transcribeVideo()` multiple times before the workflow's own first step wrote `PROCESSING`, causing concurrent `start()` calls that raced `@workflow/world-local`'s internal buffer and crashed with an `ArrayBuffer` detachment error. The fix writes `transcriptionStatus = 'PROCESSING'` to the DB synchronously inside `transcribeVideo()`, just before the `start()` call, so subsequent polls hit the existing early-return guard and never re-dispatch.

- `apps/web/lib/transcribe.ts`: persists `PROCESSING` to the DB before `start(transcribeVideoWorkflow, …)`, and the existing `catch` block already resets the status back to `null` if `start()` throws, preserving error-recovery behavior.

Confidence Score: 4/5
The change is minimal and well-reasoned; the existing catch block correctly resets status on failure, so the main risk is the narrow TOCTOU gap that is practically negligible at normal polling intervals.
The fix correctly targets the root cause and error recovery is preserved. The only remaining concern is a theoretical race that requires two requests to overlap within a single DB round-trip, which is unlikely at a 2-second polling interval but could surface if the interval is reduced or DB latency spikes.
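One possible way to close that remaining window, which this PR does not do, is to make the status write itself the guard: a conditional update that only succeeds when the status is still unset, with the caller dispatching only if it changed the row. The sketch below models that idea in memory; `VideoRow`, `claimIfUnset`, and both helper functions are hypothetical, and the SQL in the comment is an assumed shape rather than code from the repository.

```typescript
// Hypothetical in-memory model of the remaining TOCTOU window; VideoRow and
// both helper functions are illustrative and not part of this PR.
type RowStatus = null | "PROCESSING";

class VideoRow {
  status: RowStatus = null;

  // Models a conditional update such as
  //   UPDATE videos SET transcriptionStatus = 'PROCESSING'
  //   WHERE id = ? AND transcriptionStatus IS NULL
  // and reports whether this caller changed the row.
  claimIfUnset(): boolean {
    if (this.status !== null) return false;
    this.status = "PROCESSING";
    return true;
  }
}

// Guard based on the value each request fetched earlier (possibly stale).
function dispatchWithStaleGuard(row: VideoRow, snapshots: RowStatus[]): number {
  let dispatches = 0;
  for (const seen of snapshots) {
    if (seen === "PROCESSING") continue; // early return on the stale snapshot
    row.status = "PROCESSING"; // unconditional write, as in this PR
    dispatches += 1;
  }
  return dispatches;
}

// Guard based on the outcome of the conditional write itself.
function dispatchWithConditionalClaim(row: VideoRow, requests: number): number {
  let dispatches = 0;
  for (let i = 0; i < requests; i++) {
    if (row.claimIfUnset()) dispatches += 1; // only the winner dispatches
  }
  return dispatches;
}

// Two overlapping requests that both read null before either wrote.
const staleDispatches = dispatchWithStaleGuard(new VideoRow(), [null, null]);
const claimedDispatches = dispatchWithConditionalClaim(new VideoRow(), 2);
console.log(staleDispatches, claimedDispatches); // prints "2 1"
```

Whether the extra complexity is worthwhile depends on how likely overlapping requests are; at a 2-second polling interval the review above treats the window as practically negligible.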
`apps/web/lib/transcribe.ts` is the only changed file; review the interaction between the new PROCESSING write and the catch block's null reset if `start()` fails.

Important Files Changed

- `apps/web/lib/transcribe.ts`: writes `transcriptionStatus = 'PROCESSING'` before calling `start(transcribeVideoWorkflow, ...)`, preventing re-entrant workflow dispatches from the share page's 2-second poll. The existing catch block already resets the status to `null` on failure, so error recovery is handled. A very narrow TOCTOU window remains (two requests reading the video before either writes PROCESSING), but it is practically negligible at a 2-second polling interval.

Comments Outside Diff (1)
`apps/web/lib/transcribe.ts`, lines 118-129

Two concurrent poll requests can both pass the `transcriptionStatus === 'PROCESSING'` early-return guard (lines 84–93) before either one writes PROCESSING to the DB, because the guard uses the value fetched at the top of the function, not a fresh read. The race window is now just the upload-phase query (~one round-trip), versus the entire workflow execution time before this fix, so it's highly unlikely to fire in practice at a 2-second polling interval. Worth noting in case the polling interval is ever reduced or the DB write is slow under load.
Reviews (1): Last reviewed commit: "fix(web): write PROCESSING status to DB ..."