fix: stop leaking PromiseReactions in consumer loops#1400
Conversation
Promise.race against a never-settling Promise (Effect#closed,
Effect#cancel) attaches a .then reaction to that long-lived promise on
every call. Because the promise never settles, those reactions never run
and never get GC'd. Each leaked reaction's closure retains the awaiter
state, which keeps the resolved {frame, group} alive.
Over a multi-minute playback session this accumulates one retained Frame
per consumed video frame (~400MB after 9 minutes at 60fps) and grows the
PromiseReaction chain to tens of thousands of nodes, which inflates major
GC pauses to 100-200ms. That's the cause of the periodic ~700ms stalls.
Two changes, no API impact:
* @moq/hang Consumer.next() waits via an AbortSignal listener registered
with { once: true } and explicitly removed when notify wins, instead of
Promise.race([wait, signals.closed]).
* @moq/watch video and audio decoder loops drop the outer
Promise.race([consumer.next(), effect.cancel]). When the effect closes,
effect.cleanup runs consumer.close(), which makes next() return
undefined naturally and the loop exits.
WalkthroughThis PR refactors cancellation handling in frame consumption. Consumer.next() now includes explicit abort signal checking: it returns early if already aborted, otherwise waits for either an abort event or a notify event, cleaning up listeners afterward. The audio and video decoders simplify their consumer loops by removing explicit Promise.race calls against effect.cancel, delegating cancellation responsibility to Consumer.next()'s built-in abort signal handling instead. 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
✨ Simplify code
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
js/watch/src/video/decoder.ts (1)
245-247:⚠️ Potential issue | 🟠 Major | ⚡ Quick winThe
Promise.raceagainsteffect.cancelpattern in the decoder output callback should be migrated to an abort-listener pattern.The PR commit explicitly identifies this leak mechanism: every invocation attaches a never-settling reaction on
effect.cancel, which accumulates over time as frames are decoded. The callback fires ~60Hz, so during buffering or whensync.wait()delays, hundreds of leaked reactions accumulate with the same memory impact described in the PR (tens of thousands of PromiseReaction nodes inflating GC pauses).The PR successfully fixed this pattern in
Consumer.next()using an AbortSignal listener with{ once: true }. The same approach can be applied here to eliminate the accumulated reaction overhead.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@js/watch/src/video/decoder.ts` around lines 245 - 247, The decoder output callback currently races this.source.sync.wait(timestamp) against effect.cancel, which attaches accumulating never-settling reactions; replace that pattern with an AbortSignal listener: create an AbortController/derive a signal tied to effect.cancel, make a promise that resolves when signal.aborted via signal.addEventListener('abort', () => resolve(false), { once: true }), then race that abort-promise against this.source.sync.wait(timestamp) (or use Promise.race with the two promises) and ensure the listener is registered with { once: true } so no persistent reactions accumulate; update the code paths around this.source.sync.wait, effect.cancel, and the decoder output callback to use the abort-signal approach and clean up any created controller/listener references.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@js/watch/src/video/decoder.ts`:
- Around line 245-247: The decoder output callback currently races
this.source.sync.wait(timestamp) against effect.cancel, which attaches
accumulating never-settling reactions; replace that pattern with an AbortSignal
listener: create an AbortController/derive a signal tied to effect.cancel, make
a promise that resolves when signal.aborted via signal.addEventListener('abort',
() => resolve(false), { once: true }), then race that abort-promise against
this.source.sync.wait(timestamp) (or use Promise.race with the two promises) and
ensure the listener is registered with { once: true } so no persistent reactions
accumulate; update the code paths around this.source.sync.wait, effect.cancel,
and the decoder output callback to use the abort-signal approach and clean up
any created controller/listener references.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 371a57f7-0340-497d-a2bb-51ee2a4ccc30
📒 Files selected for processing (3)
js/hang/src/container/consumer.tsjs/watch/src/audio/decoder.tsjs/watch/src/video/decoder.ts
Promise.race against a never-settling Promise (Effect#closed, Effect#cancel) attaches a .then reaction to that long-lived promise on every call. Because the promise never settles, those reactions never run and never get GC'd. Each leaked reaction's closure retains the awaiter state, which keeps the resolved {frame, group} alive.
Over a multi-minute playback session this accumulates one retained Frame per consumed video frame (~400MB after 9 minutes at 60fps) and grows the PromiseReaction chain to tens of thousands of nodes, which inflates major GC pauses to 100-200ms. That's the cause of the periodic ~700ms stalls.
Two changes, no API impact:
@moq/hang Consumer.next() waits via an AbortSignal listener registered with { once: true } and explicitly removed when notify wins, instead of Promise.race([wait, signals.closed]).
@moq/watch video and audio decoder loops drop the outer Promise.race([consumer.next(), effect.cancel]). When the effect closes, effect.cleanup runs consumer.close(), which makes next() return undefined naturally and the loop exits.