Skip to content

[JS & Python] Add cancellation and structured error handling to live audio streaming#690

Open
rui-ren wants to merge 16 commits into
mainfrom
ruiren/js-sdk-live-audio-cancellation-and-coreerror
Open

[JS & Python] Add cancellation and structured error handling to live audio streaming#690
rui-ren wants to merge 16 commits into
mainfrom
ruiren/js-sdk-live-audio-cancellation-and-coreerror

Conversation

@rui-ren
Copy link
Copy Markdown
Contributor

@rui-ren rui-ren commented May 1, 2026

Summary

Adds cancellation and structured error handling to the JavaScript and Python SDKs' live audio transcription clients, closing two cross-language API parity gaps from the realtime-audio analysis.

What's new

Structured errors

Language Surface
JS New CoreError class code + isTransient from CoreErrorResponse exposed on every error thrown by start / pushLoop / stop. Re-exported from foundry-local-sdk.
Python FoundryLocalException messages still parse via existing CoreErrorResponse.try_parse; sample now demonstrates how to use it for transient-error recovery.

Cancellation

Language Surface
JS audioClient.createLiveTranscriptionSession(signal?: AbortSignal) set once on the session, applies to every subsequent start / append / stop / getTranscriptionStream call. Mirrors C# CancellationToken shape.
Python audio_client.create_live_transcription_session(cancel_event: Optional[threading.Event] = None) same pattern. Backpressured append() and idle get_transcription_stream() poll a documented _CANCEL_POLL_INTERVAL (100 ms) when cancellation is configured. Zero overhead on the fast path.

Both APIs are intentionally symmetric: cancellation is configured once on the constructor, not threaded through every method. Aborted append() does NOT enqueue the chunk (no risk of late delivery to native core). Aborted session also best-effort releases the native handle so it doesn't leak.

Files changed

File Change
sdk/js/src/openai/liveAudioTranscriptionTypes.ts New CoreError class + wrapCoreError() helper
sdk/js/src/openai/liveAudioTranscriptionClient.ts AbortSignal plumbing, abort-aware AsyncQueue.write(), listener-leak fixes, native-handle release on abort
sdk/js/src/openai/audioClient.ts createLiveTranscriptionSession(signal?)
sdk/js/src/index.ts Re-exports CoreError
sdk/js/test/openai/liveAudioTranscription.test.ts +CoreError tests, +AbortSignal listener-cleanup test (Proxy-wrapped to actually count listeners), +non-Error signal.reason test
samples/js/live-audio-transcription/app.js Demonstrates CoreError + AbortSignal graceful shutdown
sdk/python/src/openai/audio_client.py create_live_transcription_session(cancel_event=...)
sdk/python/src/openai/live_audio_transcription_client.py cancel_event plumbing, _CANCEL_POLL_INTERVAL constant
sdk/python/test/openai/test_live_audio_transcription.py +cancellation tests covering pre-set cancel, backpressure unblocking, generator clean exit
samples/python/live-audio-transcription/src/app.py Demonstrates cancel_event + CoreErrorResponse.try_parse

Tests

  • JS: 19/19 live-audio tests pass
  • Python: 25/25 live-audio tests pass
  • E2E synthetic-audio test still skips gracefully when the Nemotron model is not cached
  • Both samples syntax-verified

Backwards compatibility

100% every new parameter is optional. Existing callers see no behavior change.

Out of scope (parity report items not in this PR)

Addresses two cross-language API parity gaps in the JS SDK's live audio
transcription client:

1. Structured error handling
   - New `CoreError` class exposes `code` and `isTransient` from the
     native CoreErrorResponse so callers can implement targeted retry
     and telemetry logic.
   - `start()`, `pushLoop()`, and `stop()` now throw `CoreError`
     via `wrapCoreError()`. The `Push failed (code=...)` message
     prefix is preserved for log compatibility.

2. Cancellation
   - `start()`, `append()`, `stop()`, and
     `getTranscriptionStream()` accept an optional `AbortSignal`.
   - On abort, internal queues complete with a DOMException-style
     `AbortError` and the native session is torn down promptly.
   - Existing callers are unaffected (signal parameter is optional).

Tests
- Added 3 `CoreError` unit tests (structured, unstructured, non-Error
  causes). All 16 live-audio tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 1, 2026 19:32
@vercel
Copy link
Copy Markdown

vercel Bot commented May 1, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
foundry-local Ready Ready Preview, Comment May 13, 2026 6:40pm

Request Review

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves the JS live audio transcription streaming API by adding (1) structured native-core error surfacing via a new CoreError type, and (2) optional cancellation via AbortSignal across streaming operations to close cross-language parity gaps.

Changes:

  • Introduces CoreError + wrapCoreError() to parse and surface code / isTransient from native-core error payloads.
  • Adds optional AbortSignal support to start(), append(), stop(), and getTranscriptionStream().
  • Re-exports CoreError from the JS SDK entrypoint and adds unit tests for structured/unstructured wrapping behavior.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File Description
sdk/js/src/openai/liveAudioTranscriptionTypes.ts Adds CoreError and wrapCoreError() to provide structured error handling for live audio streaming.
sdk/js/src/openai/liveAudioTranscriptionClient.ts Adds AbortSignal support and switches several error paths to throw CoreError instead of plain Error.
sdk/js/src/index.ts Re-exports CoreError from the public package entrypoint.
sdk/js/test/openai/liveAudioTranscription.test.ts Adds unit tests covering CoreError wrapping behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread sdk/js/src/openai/liveAudioTranscriptionClient.ts Outdated
Comment thread sdk/js/src/openai/liveAudioTranscriptionClient.ts Outdated
Comment thread sdk/js/src/openai/liveAudioTranscriptionClient.ts Outdated
Comment thread sdk/js/src/openai/liveAudioTranscriptionClient.ts Outdated
Comment thread sdk/js/src/openai/liveAudioTranscriptionClient.ts Outdated
@rui-ren rui-ren changed the title Add CoreError and AbortSignal support to JS live audio streaming [JS] Add CoreError and AbortSignal support to JS live audio streaming May 1, 2026
Self-review of PR #690 surfaced four issues in the AbortSignal plumbing.
This commit fixes them and locks the behavior in with new tests.

Bug A: append() leaked an 'abort' listener every time the writePromise
won the race. Reusing the same AbortSignal across many appends in a
streaming session would trip Node's MaxListenersExceededWarning and
grow memory unbounded. Fix: register listener inside try, remove in
finally.

Bug B: handleExternalAbort() never called audio_stream_stop, so the
native session handle leaked on every abort. Fix: best-effort release
the handle in handleExternalAbort.

Bug C: getTranscriptionStream() set streamConsumed=true before checking
the signal. A pre-aborted signal would throw AbortError but permanently
disable the (single-use) stream. Fix: check abort first.

Bug D: stop() had the same listener-leak pattern as append() (one-shot,
but still ugly). Same fix.

Tests
- Added 2 unit tests covering listener cleanup over 20 race cycles
  and AbortError propagation during a race.
- 18/18 live-audio tests pass (was 16/16).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Three legitimate issues raised by the PR reviewer that were not yet
fixed:

1. start() abort listener leaked when session stopped normally
   The listener registered on the caller's signal was never removed
   if start->stop ran without the external signal firing. The closure
   captured `this`, so a long-lived signal would keep the session
   instance alive.

   Fix: register the listener with `{ signal: sessionAbortController.signal }`
   so it is auto-removed when the session aborts internally (which
   stop() and handleExternalAbort() both trigger).

2. Non-Error abort reasons were dropped from error messages
   `signal.reason instanceof Error ? signal.reason.message : 'The
   operation was aborted.'` ignored callers using
   `controller.abort('timeout')` or other non-Error reasons.

   Fix: extract abortMessage(signal) helper that handles all three
   cases (Error / non-Error / undefined) and use it everywhere.

3. AsyncQueue.write() not abort-aware  chunks could be enqueued after
   the caller already saw AbortError
   When append() raced its write against an abort signal, an aborted
   write that was waiting on backpressure could later wake up and
   silently push the chunk to native core.

   Fix: AsyncQueue.write() now accepts an optional signal. On abort
   it removes the waiter from backpressureQueue so the item is never
   enqueued. append() delegates and drops its previous Promise.race
   wrapper.

Tests
- 19/19 live-audio tests pass (was 18/18). Added a non-Error reason
  test covering controller.abort('timeout'), Error reasons, and the
  default DOMException reason.

Reviewer comments 1 (handleExternalAbort native handle) and 3 (stop()
listener leak) were already addressed in the prior commit.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
expect(finalCount).to.equal(initialCount);
});

it('should propagate AbortError when signal is fired during race', async () => {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you show how these changes would work in the JS example and how they are different than the below error handling?

} catch (err) {
if (err.name !== 'AbortError') {
console.error('Stream error:', err.message);
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great question the existing catch block is fine for the happy path but throws away every signal we now expose. Pushed 83e8dc4 to the sample (samples/js/live-audio-transcription/app.js) so the diff is concrete:

Before (the lines you linked):

} catch (err) {
    if (err.name !== 'AbortError') {
        console.error('Stream error:', err.message);
    }
}

This treats every non-abort error the same and loses code + isTransient.

After:

} catch (err) {
    if (err.name === 'AbortError') return;
    if (err instanceof CoreError) {
        if (err.isTransient) {
            console.warn(` Transient ASR error (\): \. Continuing...`);
            return;  // recoverable  don't kill the app on a momentary native stall
        }
        console.error(` Stream error [\]: \`);
        return;
    }
    console.error(' Stream error:', err.message);
}

So the user-visible difference is:

  • Transient blips (e.g., native-side recoverable hiccup) no longer silently log-and-die the loop continues, which is what a customer building a long-running transcription UI actually wants.
  • Hard errors include the machine-readable code, so customers can route on it (telemetry, retry policy, UI message) instead of regex-matching err.message.

For AbortSignal, I also wired it through to show what it buys:

const shutdown = new AbortController();
await session.start(shutdown.signal);
for await (const r of session.getTranscriptionStream(shutdown.signal)) { ... }
await session.append(pcm, shutdown.signal);

process.on('SIGINT', async () => {
    shutdown.abort();              // unblocks any in-flight append/iterator
    if (audioInput) audioInput.quit();
    await session.stop();           // now drain-and-stop without racing
    ...
});

Previously, if session.append() was backpressured (waiting for native-core to drain), Ctrl+C had to wait for stop() to finish draining which could take seconds. With shutdown.abort(), the in-flight append resolves immediately with AbortError, the pump exits, and we shut down cleanly.

Let me know if you'd like the sample changes split out into a separate PR.

Addresses reviewer question: 'Can you show how these changes would
work in the JS example and how they are different than the below
error handling?'

Three concrete changes to samples/js/live-audio-transcription/app.js:

1. Read-loop catch block now distinguishes:
   - AbortError    -> exit quietly (Ctrl+C)
   - CoreError + isTransient -> warn + continue (don't kill the app
     on a recoverable hiccup like a momentary native-side stall)
   - CoreError other -> error with the structured `code`
   - Anything else -> generic error log
   The previous catch lost all this information; the only signal was
   `err.message`.

2. A shared AbortController (`shutdown`) is threaded through
   `session.start()`, `session.append()`, and
   `session.getTranscriptionStream()`. SIGINT now calls
   `shutdown.abort()` first, so a backpressured `append()` (e.g.,
   waiting for native-core to drain) resolves promptly with AbortError
   instead of deadlocking the SIGINT handler.

3. The audio pump's catch swallows AbortError silently and stops
   re-pumping once the shutdown signal fires.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

* Must be called before append() or getTranscriptionStream().
* Settings are frozen after this call.
*
* @param signal - Optional AbortSignal. If aborted before or during start, an AbortError is thrown.
Comment on lines +157 to +158
const initialCount = (signal as any).listenerCount?.('abort') ?? 0;

Comment on lines +173 to +174
const finalCount = (signal as any).listenerCount?.('abort') ?? 0;
expect(finalCount).to.equal(initialCount);
Two legitimate issues from copilot-pull-request-reviewer:

1. start() JSDoc lied / dead code path
   The JSDoc claimed an abort 'before or during start' would throw,
   but start() runs synchronously up to the native call so a 'during'
   abort is impossible (single-threaded JS  no other microtask can
   fire while start() executes). The `if (signal.aborted)` branch
   after the native call was therefore dead code (the same condition
   was already caught by `throwIfAborted()` at the top).

   Fix:
   - Removed the dead `if (signal.aborted)` branch.
   - Updated JSDoc to accurately describe behavior: pre-aborted
     signals throw; in-flight aborts take effect on the next async
     boundary (via append() / getTranscriptionStream()).

2. Listener-leak test was a no-op
   The test asserted `signal.listenerCount?.('abort')` but
   AbortSignal extends EventTarget (not EventEmitter), so
   `listenerCount` is undefined. Both initialCount and finalCount
   evaluated to 0  the assertion always passed regardless of leaks.

   Fix: replaced with a Proxy-wrapped AbortSignal that intercepts
   `addEventListener` / `removeEventListener` and tracks live
   listener count + peak. Now asserts:
   - activeListeners === 0 after the loop (no leak)
   - peakListeners === 1 (no concurrent attachment)
   These would actually fail if the cleanup regressed.

Tests
- 19/19 live-audio tests still pass; the listener-leak assertion is
  now meaningful (verified by mentally regressing the fix  peak
  would grow to 20 without the finally-removeEventListener).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Reviewer feedback: passing AbortSignal to every method is verbose;
also Python had no cancellation per the parity report.

JS  session-level signal
- `audioClient.createLiveTranscriptionSession({ signal })`  set ONCE,
  applied to all subsequent `start` / `append` / `stop` /
  `getTranscriptionStream` calls automatically.
- Per-call signals still work as overrides; if both are set they are
  composed via `AbortSignal.any` so EITHER aborting cancels.
- New `LiveAudioTranscriptionSessionOptions` interface re-exported.
- Sample updated to use the simpler one-line pattern (no signal threading).

Python  cancellation parity (was missing)
- `audio_client.create_live_transcription_session(cancel_event)`
  optional `threading.Event` set ONCE, used by all session methods.
- `start` / `append` / `stop` / `get_transcription_stream` also
  accept an optional per-call `cancel_event`; setting EITHER cancels.
- Backpressured `append()` and idle `get_transcription_stream()`
  poll a 100ms timeout when a cancel source is configured (zero overhead
  on the fast path with no cancel sources).
- Aborted `append()` does NOT enqueue the chunk (no late delivery).

Tests
- JS: 20/20 pass (added session+per-call signal composition test).
- Python: 26/26 pass (+4 cancellation tests covering pre-set cancel
  before start, backpressure unblocking, generator clean exit).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@rui-ren
Copy link
Copy Markdown
Contributor Author

rui-ren commented May 2, 2026

Both points addressed in df4260c.

JS session-level signal (no more passing it everywhere)

const shutdown = new AbortController();
const session = audioClient.createLiveTranscriptionSession({ signal: shutdown.signal });

await session.start();           // signal applies automatically
await session.append(pcm);       // signal applies automatically
for await (const r of session.getTranscriptionStream()) { ... }
await session.stop();

process.on('SIGINT', () => shutdown.abort());

Per-call signals still work as overrides if both session-level and per-call are set, they compose via AbortSignal.any so EITHER aborting cancels. Sample (samples/js/live-audio-transcription/app.js) now uses this one-line pattern.

Python cancellation parity (closes the gap from the analysis doc)

cancel = threading.Event()
signal.signal(signal.SIGINT, lambda *_: cancel.set())

session = audio_client.create_live_transcription_session(cancel_event=cancel)
session.start()
session.append(pcm)
for result in session.get_transcription_stream():
    ...
session.stop()

Same composition rule: each method also accepts an optional per-call cancel_event; either being set cancels. Backpressured append() and idle get_transcription_stream() poll on a 100 ms timeout when cancellation is configured; zero overhead on the fast path with no cancel sources. An aborted append() does NOT enqueue the chunk (no late delivery to native core).

Tests: JS 20/20, Python 26/26 (+4 new Python cancellation tests covering pre-set cancel, backpressure unblocking, and clean generator exit).

def append(
self,
pcm_data: bytes,
cancel_event: Optional[threading.Event] = None,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we revert the session API changes to pass in cancel_event since we already have a cancel_event in the session constructor? We want JS and Python to have similar APIs for start, stop, append, etc.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in eee2cbe. Reverted the per-call signal / cancel_event parameters on both JS and Python so cancellation is configured once on the constructor and the method signatures stay symmetric.

JS start(), append(), stop(), getTranscriptionStream() now take no abort parameter; they read this.sessionSignal directly. Removed the resolveSignal / AbortSignal.any composition machinery.

Python same: start(), append(), stop(), get_transcription_stream() take no cancel_event. _is_cancelled() simplified to check only the session-level event.

Final API now matches across both languages:

const session = audioClient.createLiveTranscriptionSession({ signal: shutdown.signal });
await session.start();
await session.append(pcm);
for await (const r of session.getTranscriptionStream()) { ... }
await session.stop();
session = audio_client.create_live_transcription_session(cancel_event=cancel)
session.start()
session.append(pcm)
for r in session.get_transcription_stream(): ...
session.stop()

Tests: JS 19/19, Python 25/25 (-2 tests for the dropped per-call paths; backpressure-unblock test still validates the session-level cancel works).

…_stream

Reviewer feedback (kunal-vaishnavi): the per-call signal/cancel_event
parameters were redundant with the session-level one set on the
constructor, and they made JS and Python diverge from each other.

JS:
- Removed `signal?: AbortSignal` parameter from `start()`,
  `append()`, `stop()`, and `getTranscriptionStream()`.
- Removed `resolveSignal()` helper (no longer needed) and the
  `AbortSignal.any` composition test.
- All four methods now read directly from `this.sessionSignal` set
  via `createLiveTranscriptionSession({ signal })`.

Python:
- Removed `cancel_event` parameter from `start()`, `append()`,
  `stop()`, and `get_transcription_stream()`.
- `_is_cancelled()` simplified to just check the session-level event.
- Removed the per-call cancellation test.

Both APIs are now symmetric: cancellation is configured ONCE at
session-creation time and applies to every subsequent operation.

Tests
- JS: 19/19 pass (was 20; removed the now-unused composition test).
- Python: 25/25 pass (was 26; removed per-call cancel test).
- Backpressure-unblocking test still validates the session-level
  cancel_event short-circuits a blocked append().

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…dio-cancellation-and-coreerror

# Conflicts:
#	sdk/python/src/openai/live_audio_transcription_client.py
const audioClient = model.createAudioClient();
const session = audioClient.createLiveTranscriptionSession();
const session = audioClient.createLiveTranscriptionSession({ signal: shutdown.signal });

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of creating a new class to represent the parameters passed into the session, can we instead make an optional parameter for signal in createLiveTranscriptionSession? Then, we can remove the need for the LiveAudioTranscriptionSessionOptions class. The new optional parameter for signal can behave similar to a CancellationToken parameter.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, I will update the cancellation pattern. Thanks

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 6ab38c4. Dropped LiveAudioTranscriptionSessionOptions and made signal a plain optional parameter on createLiveTranscriptionSession, matching C#'s CancellationToken shape:

const shutdown = new AbortController();
const session = audioClient.createLiveTranscriptionSession(shutdown.signal);

(was createLiveTranscriptionSession({ signal: shutdown.signal }))

JS 19/19 tests pass; sample updated.

@rui-ren rui-ren requested a review from kunal-vaishnavi May 2, 2026 19:55
Comment thread sdk/python/src/openai/live_audio_transcription_client.py Outdated
Comment thread sdk/python/src/openai/live_audio_transcription_client.py Outdated
def create_live_transcription_session(self) -> LiveAudioTranscriptionSession:
def create_live_transcription_session(
self,
cancel_event: Optional[threading.Event] = None,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question as here but for the Python example

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 143a8b6. Updated samples/python/live-audio-transcription/src/app.py with the same pattern as the JS sample concrete diff so the API value is visible.

Before: the sample created the session with no cancellation token and used a parallel local stop_event for the mic loop, which couldn't reach inside the SDK:

session = audio_client.create_live_transcription_session()
...
stop_event = threading.Event()
def shutdown(*_):
    stop_event.set()  # only the mic loop sees this
    session.stop()    # has to drain everything

After: one event drives the whole shutdown path SIGINT just sets it:

shutdown_event = threading.Event()
session = audio_client.create_live_transcription_session(cancel_event=shutdown_event)
...
def shutdown(*_):
    shutdown_event.set()
    #  this single call:
    #   - aborts any in-flight session.append() blocked on backpressure
    #     (FoundryLocalException, no late delivery to native core)
    #   - ends session.get_transcription_stream() iteration cleanly
    #   - short-circuits session.stop()'s drain wait
    #   - exits the mic capture loop
    session.stop()

I also added CoreErrorResponse.try_parse in the read loop to inspect structured error metadata (code + is_transient) transient blips no longer kill long-running sessions, mirroring the JS CoreError demo.

Tests: 25/25 pass; sample syntax-verified.

Reviewer feedback (kunal-vaishnavi): same ask as the JS sample
show how the new cancellation API actually gets used.

Three concrete changes to samples/python/live-audio-transcription/src/app.py:

1. The session is now created with the shutdown event:
   `audio_client.create_live_transcription_session(cancel_event=shutdown_event)`
   so every subsequent `start` / `append` / `stop` /
   `get_transcription_stream` call picks up cancellation
   automatically  no per-call event threading.

2. SIGINT handler just calls `shutdown_event.set()`. That single
   call:
   - aborts any in-flight `session.append()` blocked on backpressure
     with FoundryLocalException (no late delivery to native core),
   - ends `session.get_transcription_stream()` iteration cleanly,
   - short-circuits `session.stop()`'s drain wait,
   - exits the mic capture loop on its next iteration.
   The previous `stop_event` was a parallel local-only flag that
   couldn't reach inside the SDK; the new pattern uses one event
   end-to-end.

3. Read loop now demonstrates `CoreErrorResponse.try_parse` to
   inspect structured native-side error metadata (code + is_transient)
   so transient blips don't kill long-running sessions  the same
   value-add the JS sample shows with `CoreError`.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Reviewer (kunal-vaishnavi) flagged the magic `timeout=0.1` in
`append()`, `get_transcription_stream()`, and `stop()`. Extracted
to a module-level `_CANCEL_POLL_INTERVAL` with a docstring explaining:

- Why it exists at all: `queue.Queue.get` / `put` cannot be
  interrupted by a `threading.Event` in standard Python, so when a
  `cancel_event` is configured we poll-with-timeout.
- Why 100 ms specifically: balances cancellation latency (SIGINT takes
  effect within ~100 ms) against idle CPU overhead (~10 wakeups/sec
  per blocked call, negligible).
- That this is no-op on the fast path with no cancel_event.

Tests: 25/25 still pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
kunal-vaishnavi
kunal-vaishnavi previously approved these changes May 2, 2026
@rui-ren rui-ren changed the title [JS] Add CoreError and AbortSignal support to JS live audio streaming [JS & Python] Add cancellation and structured error handling to live audio streaming May 2, 2026
// AbortError instead of waiting for stop() to finish draining the queue.
process.on('SIGINT', async () => {
console.log('\n\nStopping...');
shutdown.abort();
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is my understanding correct that shutdown and session.stop do the same thing but shutdown is similar to session.stop(hard=true) so we avoid draining the queue? Should we use the same API to represent that, or do we need separate handlers for shutdown vs session.stop()?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They're not the same thing — they're orthogonal:

  • shutdown.abort() cancels in-flight calls (e.g., append() blocked on backpressure, getTranscriptionStream() waiting for
    results).
  • session.stop() signals end-of-audio to native core, drains the queue, and releases the native handle.

Comment thread samples/js/live-audio-transcription/app.js Outdated
// Use it to retry quietly on transient blips instead of dying on the
// first hiccup. Without CoreError the only signal would be err.message.
if (err instanceof CoreError) {
if (err.isTransient) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it expected for FoundryLocalCore to throw transient errors? What do transient errors mean? If we detect transient errors, and there is no action that the application/user should take during such errors, can we catch them before they reach the application layer and continue as necessary?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right the transient branch was speculative (no documented list of transient codes from FLC core today, no SDK-internal retry policy yet). Pushed dfc9dee that drops the transient handling from both the JS and Python samples.

What changed:

  • Removed the if (err.isTransient) ... branch from both samples
  • Added a NOTE comment in each sample saying SDK-internal retry on transient errors is future work
  • Samples still demonstrate the value of the structured error type customers can log/route on LiveAudioStreamError.code (JS) or CoreErrorResponse.try_parse(...).code (Python) instead of regex-matching err.message

What stayed:

  • LiveAudioStreamError.isTransient and CoreErrorResponse.is_transient are still on the SDK API surface (no breaking change). Advanced users can still inspect them. We just don't lead the typical sample reader to expect recovery behavior the SDK doesn't deliver.

When the SDK gets a real retry policy + a documented set of transient codes, we can re-introduce internal handling and the field becomes meaningful end-to-end. Happy to file a follow-up issue tracking that work let me know.

Comment thread samples/python/live-audio-transcription/src/app.py Outdated
Comment thread sdk/js/src/openai/liveAudioTranscriptionTypes.ts Outdated
Comment thread sdk/python/src/openai/live_audio_transcription_client.py Outdated
@kunal-vaishnavi kunal-vaishnavi dismissed their stale review May 4, 2026 17:59

We have decided that we don't need to merge this PR fix for the FL 1.1 release. We can wait to get this PR right before merging.

Reviewer feedback (baijumeswani): the `CoreError` name implied a
generic infrastructure error, but it's only thrown by the live audio
streaming path (chat / audio-file transcription / responses still throw
plain `Error`). Rename to be honest about scope.

Bulk-rename across 5 files:
- `CoreError` -> `LiveAudioStreamError`
- `wrapCoreError` -> `wrapAsLiveAudioStreamError`
- `CoreErrorResponse` (the wire-format struct) is unchanged.

The structured `code` / `isTransient` fields and behavior are
preserved  this is a pure rename. Generalizing to a common error
type across all SDK APIs is left as a follow-up.

Tests: 19/19 still pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Reviewer feedback (baijumeswani): the previous polling-based
cancellation added unnecessary complexity. Replace with an
event-driven implementation backed by `threading.Condition`.

Implementation
- New `_CancellableQueue` class: bounded FIFO using a single
  `threading.Condition`. Blocked `put` / `get` calls wake
  instantly when `cancel()` fires `notify_all()`  zero polling.
- One watcher thread per session waits on the user's `cancel_event`
  and calls `cancel()` on both queues when it fires.
- Removed `_CANCEL_POLL_INTERVAL` and all `timeout=...` polling
  loops in `append` / `get_transcription_stream` / `stop`.

Watcher lifecycle
- The watcher is a daemon. If `cancel_event` is never set (normal
  `stop()` path), it remains blocked on `Event.wait()` for the
  lifetime of the user's event object. Documented in the docstring.
- Bounded leak: 1 thread per session, daemon (no process-exit block),
  ~8KB stack. Acceptable trade-off for zero polling overhead.
- Python's stdlib has no clean wait-on-either-event primitive; the
  alternatives (asyncio bridge, helper-thread mirror) carry equal or
  worse complexity.

Tests
- 26/26 pass (was 25; +1 new test `test_cancellation_unblocks_instantly_no_polling`
  asserts cancel latency < 50 ms  the old polling impl would take up
  to 100 ms, so this locks in the no-polling guarantee as a regression
  test).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…dio-cancellation-and-coreerror

# Conflicts:
#	samples/python/live-audio-transcription/src/app.py
#	sdk/js/src/index.ts
#	sdk/js/src/openai/liveAudioSession.ts
#	sdk/js/test/openai/liveAudioTranscription.test.ts
#	sdk/python/src/openai/live_audio_session.py
Audit after merging origin/main into branch caught 8 stale references
to the pre-rename API. Main renamed:
- `getTranscriptionStream` -> `getStream` (JS)
- `get_transcription_stream` -> `get_stream` (Python)
- `live_audio_transcription_types` -> `live_audio_types` (Python module)
- `liveAudioTranscriptionTypes` -> `liveAudioTypes` (JS module)

Issues found and fixed:

1. samples/python/live-audio-transcription/src/app.py  REAL BUG
   The `CoreErrorResponse` import pointed at the deleted module
   `foundry_local_sdk.openai.live_audio_transcription_types`. Sample
   would have crashed on startup with ModuleNotFoundError. Fixed import
   path + the get_transcription_stream() call further down.

2-7. Stale docstring/comment references to the old method names in:
   - samples/js/live-audio-transcription/app.js (2 comments)
   - sdk/js/src/openai/audioClient.ts (2 JSDoc lines)
   - sdk/python/src/openai/audio_client.py (1 docstring line)
   Cosmetic only  code worked but docs lied.

Verified clean: `git grep` for any old API name returns zero matches.

Tests
- Python: 26/26 pass
- JS: 19/19 pass
- Sample imports verified at AST level + module-resolve level

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Reviewer feedback (baijumeswani): if transient errors require no user
action, they should be handled internally rather than surfaced to the
application layer. Until the SDK has an internal retry policy AND a
documented list of transient codes, the sample's transient branch is
speculative  it pretends to handle errors that may never actually
fire with isTransient=true.

Removed the transient-vs-terminal distinction from both samples:
- samples/js/live-audio-transcription/app.js
- samples/python/live-audio-transcription/src/app.py

Both samples still demonstrate the value of the structured error type
(LiveAudioStreamError.code in JS, CoreErrorResponse.try_parse(...).code
in Python)  customers can log/route on a machine-readable code rather
than regex-matching err.message. The samples just don't pretend to
recover from "transient" errors anymore.

Added a NOTE comment in each sample explaining that internal retry on
transient errors is future work.

The SDK API itself (LiveAudioStreamError.isTransient and
CoreErrorResponse.is_transient) is unchanged  advanced users can still
inspect those fields. We just don't lead customers to expect recovery
behavior the SDK doesn't deliver.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants