Skip to content

Serialize event dispatch in .NET and Go SDKs#791

Open
stephentoub wants to merge 1 commit intomainfrom
serial-event-dispatch
Open

Serialize event dispatch in .NET and Go SDKs#791
stephentoub wants to merge 1 commit intomainfrom
serial-event-dispatch

Conversation

@stephentoub
Copy link
Collaborator

@stephentoub stephentoub commented Mar 11, 2026

Summary

Decouple user event handler dispatch from the transport in both .NET and Go SDKs. Events are routed through a channel, consumed by a single background goroutine/task that invokes user handlers serially in FIFO order. Broadcast handlers (tool calls, permission requests) are fired as fire-and-forget directly from the dispatch entry point, outside the channel.

This brings .NET and Go in line with the Node.js and Python SDKs, which get natural serialization from their single-threaded event loops.

Problem

In .NET, StreamJsonRpc dispatches notifications concurrently on the ThreadPool, which meant:

  • Event handlers could run concurrently with themselves (two events arriving at once)
  • Event handlers could run out of order

In Go, the JSON-RPC read loop was single-threaded so user handlers were already serialized, but broadcast handlers (tool calls, permission requests) ran inline, which meant:

  • A broadcast handler issuing an RPC request would deadlock — the response can only be read by the read loop, which is blocked waiting for the handler to return
  • A slow or stalled broadcast handler would block all event delivery

Design

Both SDKs now use the same pattern, adapted to their runtime:

User handlers Broadcast handlers Blocks reader?
Node.js Sync inline (event loop) void fire-and-forget No
Python Sync inline (with lock) asyncio.ensure_future No
Go Channel → consumer goroutine go goroutine No
.NET Channel → consumer task _ = fire-and-forget task No

Key properties:

  • Serial dispatch: user handlers see events one at a time, never concurrently
  • FIFO ordering: events delivered in arrival order
  • No transport blocking: the reader thread/goroutine returns immediately after enqueueing
  • No deadlocks: broadcast handlers run in their own goroutine/task, decoupled from the reader and the consumer

.NET Changes

  • Unbounded Channel<SessionEvent> with single-reader consumer task (ProcessEventsAsync) that invokes user handlers serially
  • DispatchEvent enqueues to channel and fires broadcast handler as _ = HandleBroadcastEventAsync(...) (fire-and-forget)
  • Per-handler error catching via ImmutableArray<SessionEventHandler> iteration with cached snapshot
  • HandleBroadcastEventAsync wraps broadcast work in try/catch directly (no separate Safe wrapper)
  • Channel completed in DisposeAsync; consumer task exits naturally
  • ILogger threaded from CopilotClient to CopilotSession for diagnostics

Go Changes

  • eventCh channel added to Session; processEvents consumer goroutine started in newSession
  • dispatchEvent enqueues to channel and fires go s.handleBroadcastEvent(event)
  • Channel closed in Disconnect to stop the consumer
  • Panic recovery per user handler to prevent one handler from halting delivery

Tests

SDK Test What it validates
.NET Should_Receive_Session_Events (modified) Serial dispatch — concurrent count never exceeds 1; awaits async delivery
.NET Handler_Exception_Does_Not_Halt_Event_Delivery (new) Consumer loop survives handler exceptions
.NET DisposeAsync_From_Handler_Does_Not_Deadlock (new) No deadlock when handler calls DisposeAsync
Go events are dispatched serially (new) Serial dispatch — max concurrent count is 1
Go handler panic does not halt delivery (new) Panic recovery keeps dispatch alive
Go should_receive_session_events (modified) Awaits async delivery of session.start

@stephentoub stephentoub requested a review from a team as a code owner March 11, 2026 19:01
Copilot AI review requested due to automatic review settings March 11, 2026 19:01
@stephentoub stephentoub enabled auto-merge March 11, 2026 19:03
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to enforce serialized (in-order, non-concurrent) dispatch of session events and broadcast work (tool/permission handling) in the Go and .NET SDKs to prevent out-of-order delivery, concurrent handler execution, and untracked broadcast work.

Changes:

  • Go: run broadcast tool/permission handling inline (remove untracked goroutines) and update Disconnect docs.
  • .NET: enqueue incoming events into a single-reader channel and process them in a dedicated consumer loop that awaits broadcast handling before invoking user handlers.
  • Add/adjust tests to validate serial dispatch, handler exception resilience, and DisposeAsync-from-handler behavior.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
go/session.go Removes goroutine spawning for broadcast work to attempt fully serialized handling.
go/session_test.go Adds tests intended to validate serial dispatch and panic resilience.
dotnet/src/Session.cs Introduces Channel-based serialized event dispatch loop and logging for exceptions/dropped events.
dotnet/src/Client.cs Threads ILogger into CopilotSession construction.
dotnet/test/SessionTests.cs Adds/updates tests for serial dispatch, exception resilience, and dispose deadlock avoidance.
Comments suppressed due to low confidence (1)

dotnet/test/SessionTests.cs:516

  • TaskCompletionSource is generic in the target frameworks; new TaskCompletionSource() will not compile. Use a typed TaskCompletionSource (e.g., TaskCompletionSource<bool> or TaskCompletionSource<object?>) and set an explicit result when completing it.
        var session = await CreateSessionAsync();
        var disposed = new TaskCompletionSource();

        session.On(evt =>
        {
            if (evt is UserMessageEvent)
            {
                // Call DisposeAsync from within a handler — must not deadlock.
                session.DisposeAsync().AsTask().ContinueWith(_ => disposed.TrySetResult());
            }
        });

You can also share your feedback on Copilot code review. Take the survey.

@github-actions
Copy link
Contributor

Cross-SDK Consistency Review

This PR improves event dispatch serialization in .NET and Go SDKs, which is excellent for correctness. However, it introduces a behavioral inconsistency with Node.js and Python SDKs regarding broadcast event handling.

Current State After This PR

SDK Event Dispatch Broadcast Handlers (tools/permissions)
.NET ✅ Serialized (Channel-based queue) Awaited inline before user handlers
Go ✅ Serialized (removed go keyword) Awaited inline before user handlers
Node.js ✅ Serialized (single event loop) Fire-and-forget (void this._execute...)
Python ✅ Serialized (single asyncio loop) Fire-and-forget (asyncio.ensure_future())

The Inconsistency

In .NET/Go (after this PR):

// dotnet/src/Session.cs:315-319
await HandleBroadcastEventAsync(sessionEvent);  // ⬅️ Awaited inline
// ...
EventHandlers?.Invoke(sessionEvent);  // User handlers run AFTER tool completes

In Node.js:

// nodejs/src/session.ts:338
void this._executeToolAndRespond(...);  // ⬅️ Fire-and-forget
// Falls through immediately to user handler dispatch

In Python:

# python/copilot/session.py:292-296
asyncio.ensure_future(
    self._execute_tool_and_respond(...)  # ⬅️ Fire-and-forget
)
# Falls through immediately to user handler dispatch

Why This Matters

  1. Timing guarantees differ: In .NET/Go, when a user handler sees external_tool.requested, the tool has already been invoked and responded. In Node.js/Python, the tool call might still be in flight.

  2. Error visibility: .NET/Go catch tool handler exceptions via LogBroadcastHandlerError. Node.js/Python let them become unhandled promise rejections (per PR description: "rejections propagate as unhandled promise rejections").

  3. Concurrency semantics: The PR description states "Node.js and Python are single-threaded (event loop / asyncio) and do not have these issues" — but this is only partially true. While event handler dispatch is serial, broadcast handlers run concurrently with subsequent user handlers in Node.js/Python but not in .NET/Go.

Recommendation

Option A (Preferred): Update Node.js and Python to match .NET/Go behavior:

  • Change void this._executeToolAndRespond()await this._executeToolAndRespond() (Node.js)
  • Change asyncio.ensure_future()await (Python)
  • This ensures all four SDKs have identical serial semantics

Option B: Document the difference as intentional:

  • Update SDK documentation to note that Node.js/Python use concurrent broadcast handlers while .NET/Go serialize them
  • Explain when this behavioral difference might be observable to users

Option C: Keep as-is if this was a deliberate choice:

  • If the fire-and-forget model in Node.js/Python is intentional (for performance or other reasons), please document why the SDKs differ in this regard

Related Code Locations

  • Node.js broadcast handlers: nodejs/src/session.ts:325-349
  • Python broadcast handlers: python/copilot/session.py:274-311
  • .NET broadcast handlers (after PR): dotnet/src/Session.cs:280-332
  • Go broadcast handlers (after PR): go/session.go:436-501

Generated by SDK Consistency Review Agent for issue #791 ·

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generated by SDK Consistency Review Agent for issue #791

@github-actions
Copy link
Contributor

SDK Consistency Review: Serial Event Dispatch

I've reviewed this PR for cross-SDK consistency. The changes correctly address a concurrency issue specific to Go and .NET that doesn't affect Node.js and Python.

✅ Design Consistency Confirmed

The PR correctly identifies and addresses architectural differences:

Go & .NET (multi-threaded runtimes):

  • Before: Event handlers could run concurrently, out of order, and broadcast work (tools/permissions) was fire-and-forget
  • After: Events are serialized via a channel (Go) or Channel(T) (.NET) with a single consumer loop

Node.js & Python (single-threaded event loops):

  • Inherently serial: Event loop guarantees handlers run one at a time
  • No changes needed: The PR description correctly states "Node.js and Python are single-threaded (event loop / asyncio) and do not have these issues"

I verified both Node.js and Python dispatch events synchronously on their event loops:

  • Node.js: _dispatchEvent() calls handlers synchronously in a for-loop
  • Python: _dispatch_event() calls handlers synchronously in a for-loop
  • Both use asyncio.ensure_future() / void for broadcast work, which is safe on single-threaded event loops

✅ API Surface Consistency

Public API is unchanged across all SDKs:

  • session.on() / session.On() still registers handlers
  • Handler signatures remain the same
  • No breaking changes to user-facing behavior

Implementation details appropriately diverge based on runtime characteristics:

  • Go/C#: Explicit serialization via channels
  • Node/Python: Implicit serialization via event loop

✅ Documentation Consistency

The PR adds consistent caller-ensures-idle contracts to both affected SDKs:

.NET DisposeAsync documentation:

The caller should ensure the session is idle (e.g., SendAndWait has returned) before disposing.

Go Disconnect documentation:

The caller should ensure the session is idle (e.g., Session.SendAndWait has returned) before disconnecting.

Recommendation: Consider adding similar guidance to Node.js/Python, even though deadlock isn't possible there. It would establish a consistent best practice across all SDKs.

✅ Test Coverage Consistency

New tests are appropriate and SDK-specific:

SDK New Tests Purpose
.NET Should_Receive_Session_Events (modified) Validates serial dispatch (concurrent count ≤ 1)
.NET Handler_Exception_Does_Not_Halt_Event_Delivery Exception resilience
.NET DisposeAsync_From_Handler_Does_Not_Deadlock Dispose safety
Go events are dispatched serially Validates serial dispatch (max concurrent = 1)
Go handler panic does not halt delivery Panic recovery

Node.js and Python don't need equivalent tests because concurrency isn't possible on single-threaded event loops.


Summary

No consistency issues found. This PR correctly implements platform-appropriate solutions to a concurrency problem that only affects multi-threaded SDKs. The behavioral guarantees (serial dispatch, handler exceptions don't halt delivery) are now consistent across all four SDKs, even though the implementation mechanisms differ based on runtime architecture.

The PR is well-designed and maintains the project's cross-language consistency goals. ✅

Generated by SDK Consistency Review Agent for issue #791 ·

@github-actions
Copy link
Contributor

Cross-SDK Consistency Review ✅

I've reviewed PR #791 for consistency across all four SDK implementations. Overall: This PR successfully brings .NET and Go into alignment with the existing behavior in Node.js and Python.

Summary of Changes

This PR addresses a runtime-specific concurrency issue in .NET and Go:

  • .NET: Added Channel(SessionEvent) with single-reader consumer loop to serialize event dispatch
  • Go: Added event channel with single goroutine consumer to serialize event dispatch
  • Node.js: ✅ Already serial (JavaScript event loop is single-threaded)
  • Python: ✅ Already serial (asyncio event loop is single-threaded)

The implementations now provide consistent event ordering guarantees across all languages.

✅ What's Consistent

  1. Event dispatch serialization: All SDKs now guarantee handlers see events one at a time, in arrival order
  2. Fire-and-forget broadcast handling: All SDKs handle tool/permission requests asynchronously without blocking event delivery
  3. Panic/exception recovery: All SDKs recover from handler failures and continue processing events
  4. Caller-ensures-idle contract: Both .NET and Go now document that callers should wait for sessions to be idle before disposing/disconnecting

📝 Documentation Opportunity (Optional Enhancement)

The .NET SDK now explicitly documents the serial dispatch guarantee:

"Handlers are invoked serially in event-arrival order on a background thread. A handler will never be called concurrently with itself or with other handlers on the same session."

Suggestion: Consider adding similar documentation to the Node.js and Python SDKs' on() method JSDoc/docstrings. While the behavior is inherent to their single-threaded runtimes, explicitly documenting it would:

  • Help developers understand the consistency guarantee across all SDKs
  • Make the behavior contract explicit rather than implicit
  • Match the level of detail now present in .NET/Go

Similarly, the "caller-ensures-idle" contract added to .NET DisposeAsync and Go Disconnect docs could be added to Node.js disconnect() and Python disconnect() for consistency.

This is purely a documentation enhancement and not a blocker—the actual behavior is already correct and consistent.


Verdict: ✅ This PR maintains excellent cross-SDK consistency. The changes are appropriate and well-tested. Nice work!

Generated by SDK Consistency Review Agent for issue #791 ·

@github-actions
Copy link
Contributor

Cross-SDK Consistency Review ✅

I've reviewed this PR for consistency across all four SDK implementations (Node.js/TypeScript, Python, Go, and .NET).

Summary

This PR introduces serial event dispatch to the .NET and Go SDKs to fix concurrency issues. The changes are consistent with the existing architecture of Node.js and Python, which are inherently single-threaded and don't experience these problems.

What Changed

Go SDK:

  • Removed go keyword from executeToolAndRespond and executePermissionAndRespond calls in handleBroadcastEvent
  • Broadcast handlers now run inline on the calling goroutine (moved the go keyword up one level to handleBroadcastEvent itself)
  • User event handlers remain synchronous on the JSON-RPC read loop, maintaining serial delivery guarantees
  • Added tests: "events are dispatched serially" and "handler panic does not halt delivery"
  • Updated Disconnect docs with caller-ensures-idle contract

.NET SDK:

  • Introduced a Channel(SessionEvent) with single-reader consumer loop (ProcessEventsAsync)
  • Changed HandleBroadcastEventAsync from async void to async Task so work can be awaited
  • Broadcast work is fire-and-forget via HandleBroadcastEventSafe (preserves existing behavior for multi-client scenarios)
  • User event handlers invoked serially by the consumer loop
  • Added ILogger support with structured logging via [LoggerMessage] attributes
  • Added tests: concurrent count tracking in existing test, "Handler_Exception_Does_Not_Halt_Event_Delivery", "DisposeAsync_From_Handler_Does_Not_Deadlock"
  • Updated DisposeAsync docs with caller-ensures-idle contract

Cross-SDK Consistency Analysis

Node.js & Python — No changes needed:

  • Both are single-threaded (event loop/asyncio) and naturally serialize event dispatch
  • Node.js: _dispatchEvent synchronously iterates handlers
  • Python: _dispatch_event synchronously iterates handlers with a lock
  • Broadcast handlers use fire-and-forget (asyncio.ensure_future in Python, implicit promise in Node.js)

Go & .NET now match Node.js/Python behavior:

  • Serial user event handlers: All four SDKs now guarantee handlers see events one at a time, in order
  • Fire-and-forget broadcast work: Tool/permission handlers execute asynchronously without blocking event delivery (important for multi-client scenarios)
  • Panic/exception recovery: All SDKs recover from handler errors and continue processing subsequent events
  • Dispose/disconnect semantics: All SDKs now document the caller-ensures-idle contract

Design Decisions Confirmed Across All SDKs

  1. Full serialization — User event handlers are never concurrent with themselves
  2. Broadcast handlers don't block delivery — Tool/permission work is fire-and-forget to support multi-client scenarios where one handler may intentionally stall
  3. Handler exceptions are isolated — One handler's failure doesn't prevent others from seeing the event
  4. Caller-ensures-idle contract — All SDKs now document that dispose/disconnect should be called when the session is idle

Recommendation

No cross-SDK changes needed. This PR appropriately brings .NET and Go into alignment with the existing event delivery semantics of Node.js and Python. The implementation approaches differ due to language idioms (.NET uses Channels, Go uses goroutines, Node.js/Python use event loops), but the public API guarantees are now consistent across all four SDKs.

The PR description explicitly acknowledges this: "Node.js and Python are single-threaded (event loop / asyncio) and do not have these issues."

Generated by SDK Consistency Review Agent for issue #791 ·

@github-actions
Copy link
Contributor

Cross-SDK Consistency Review ✅

I've reviewed this PR for consistency across all four SDK implementations. No consistency issues found — the changes appropriately address concurrency concerns that only exist in .NET and Go.

Summary

This PR serializes event dispatch in the .NET and Go SDKs to prevent concurrent handler execution. The changes are well-justified and SDK-appropriate:

✅ .NET Changes: Uses a _dispatchLock to serialize user event handler invocations, preventing concurrent execution that could occur due to StreamJsonRpc's ThreadPool dispatch without SynchronizationContext.

✅ Go Changes: Broadcast work (handleBroadcastEvent) now runs in a separate goroutine to avoid deadlocking the JSON-RPC read loop, while broadcast handlers (executeToolAndRespond, executePermissionAndRespond) run inline to maintain serial execution.

✅ Node.js & Python: Correctly not modified — both are single-threaded (event loop / asyncio) and already guarantee serial event dispatch by design. Making similar changes would be unnecessary and potentially harmful to their performance characteristics.

Why This is Consistent

The four SDKs handle concurrency differently based on their language/runtime:

SDK Event Dispatch Model Concurrency Model
Node.js Single-threaded event loop Inherently serial
Python Single-threaded asyncio Inherently serial
Go goroutines + JSON-RPC read loop Requires explicit serialization
.NET ThreadPool dispatch (no SyncContext) Requires explicit serialization

The PR correctly recognizes that behavioral consistency (serial event dispatch) requires different implementation approaches in different languages. Node.js and Python get serial dispatch "for free" from their single-threaded runtimes, while Go and .NET need explicit synchronization mechanisms.

Testing Consistency

All SDKs that needed changes include appropriate tests:

  • .NET: Should_Receive_Session_Events (modified to verify serial dispatch), Handler_Exception_Does_Not_Halt_Event_Delivery, DisposeAsync_From_Handler_Does_Not_Deadlock
  • Go: events are dispatched serially, handler panic does not halt delivery
  • Node.js/Python: No new tests needed (already single-threaded)

Documentation Consistency

The PR adds "caller-ensures-idle" documentation to both .NET (DisposeAsync) and Go (Disconnect) methods. This is appropriate since only these SDKs have concurrency concerns. Node.js and Python dispose methods already have simpler guarantees due to their single-threaded nature.


Conclusion: This PR maintains cross-SDK behavioral consistency (serial event dispatch with resilient error handling) while respecting each language's concurrency model. The changes are appropriate and well-tested. ✨

Generated by SDK Consistency Review Agent for issue #791 ·

@stephentoub stephentoub marked this pull request as draft March 11, 2026 22:16
auto-merge was automatically disabled March 11, 2026 22:16

Pull request was converted to draft

@github-actions
Copy link
Contributor

Cross-SDK Consistency Review: Event Serialization

I've reviewed this PR for consistency across all four SDK implementations. Here are my findings:

Correctly Addresses: .NET and Go

The PR properly fixes concurrency issues in .NET and Go where event handlers could run:

  • Concurrently with themselves
  • Out of order
  • With untracked broadcast work

⚠️ Potential Inconsistency: Node.js SDK

The PR description states:

"Node.js and Python are single-threaded (event loop / asyncio) and do not have these issues."

This is only partially correct:

Python SDK: ✅ Safe

  • Handlers are typed as Callable[[SessionEvent], None] (strictly synchronous)
  • Cannot use async functions as handlers without explicit awaiting
  • Events truly dispatch serially with no interleaving

Node.js SDK: ⚠️ Has Similar Issues

While Node.js runs on a single-threaded event loop, async handlers create the same concurrency problems this PR fixes:

Current behavior (nodejs/src/session.ts:293-317):

_dispatchEvent(event: SessionEvent): void {
    // Fire-and-forget - async handlers are NOT awaited
    for (const handler of this.eventHandlers) {
        try {
            handler(event);  // If handler is async, creates unresolved Promise
        } catch (_error) { }
    }
}

Handler type (nodejs/src/types.ts):

export type SessionEventHandler = (event: SessionEvent) => void;
// This accepts: async (event) => { ... }
// Which returns Promise(void), discarded by the caller

Problems:

  1. Concurrent execution: If users write session.on(async (event) => { await doWork(); }), multiple handlers can run concurrently when events arrive quickly
  2. Out-of-order completion: Async handlers complete in non-deterministic order
  3. Fire-and-forget broadcast: Same issue as .NET's old async void HandleBroadcastEventAsync—broadcast work (_handleBroadcastEvent) runs untracked

Example failure scenario:

let count = 0;
session.on(async (event) => {
    const current = count++;
    await someAsyncWork();  // Yields to event loop
    count = current + 1;    // Race condition!
});

📋 Recommendation

For full cross-SDK consistency, consider addressing Node.js in a follow-up PR:

  1. Serial async handler dispatch: Use a queue-drain pattern similar to .NET's DrainEventQueue, or sequential await for async handlers
  2. Document handler semantics: Clarify whether handlers should be sync-only (like Python) or support async with serial dispatch (like .NET/Go after this PR)
  3. Await broadcast work: Change _handleBroadcastEvent to return Promise(void) and await it before invoking user handlers

🎯 Decision Points

  • If Node.js handlers should remain fire-and-forget: Document this behavior explicitly and note the difference from .NET/Go
  • If Node.js should match .NET/Go: Apply similar serialization fixes

Overall assessment: This PR is excellent and correctly fixes real concurrency bugs in .NET and Go. The Node.js SDK has a similar latent issue that should be evaluated for consistency.

Generated by SDK Consistency Review Agent for issue #791 ·

@github-actions
Copy link
Contributor

SDK Consistency Review: Serial Event Dispatch

I've reviewed PR #791 for cross-SDK consistency. Here are my findings:

✅ Changes are Appropriately Scoped

This PR correctly targets only .NET and Go SDKs to fix concurrency issues that are specific to multi-threaded runtimes. The PR description explicitly notes:

Node.js and Python are single-threaded (event loop / asyncio) and do not have these issues.

Analysis by SDK

.NET Changes ✅

  • Implements a Channel(SessionEvent) with single-reader consumer loop (ProcessEventsAsync)
  • Ensures serial event handler dispatch (never concurrent)
  • Adds logging for dropped events and handler exceptions
  • Tests validate:
    • Serial dispatch (max concurrent = 1)
    • Handler exceptions don't halt delivery
    • No deadlock when DisposeAsync called from handler

Go Changes ✅

  • Removed go keyword from executeToolAndRespond and executePermissionAndRespond
  • Broadcast work runs inline on read loop goroutine (still in separate goroutine from user handlers)
  • User event handlers remain synchronous on calling goroutine
  • Tests validate:
    • Serial dispatch (max concurrent = 1)
    • Handler panic recovery

Node.js & Python: Unchanged ✅

Both SDKs are inherently serial due to single-threaded event loop semantics:

  • Node.js: JavaScript event loop guarantees handlers run to completion before next task
  • Python: asyncio event loop + GIL provides similar guarantees

Their _dispatchEvent methods iterate synchronously through handlers, which is sufficient for single-threaded runtimes.

Semantic Consistency Across All SDKs

After these changes, all four SDKs now guarantee:

  1. ✅ Event handlers are invoked serially (never concurrent)
  2. ✅ Event handlers see events in arrival order
  3. ✅ Handler exceptions/panics are caught and logged (don't crash dispatcher)
  4. ✅ Broadcast work (tool/permission handlers) is fire-and-forget but tracked appropriately

Documentation Consistency

The PR adds caller-ensures-idle contract documentation to .NET (DisposeAsync) and Go (Disconnect). Consider adding equivalent notes to:

  • Node.js disconnect() method
  • Python disconnect() method

This would ensure users of all SDKs understand the same contract about session lifecycle.

Recommendation

Approve — The changes maintain appropriate cross-SDK consistency while respecting the idioms of each runtime. The serial dispatch guarantee is now explicit in all four implementations, achieved through different mechanisms appropriate to each platform:

  • .NET: Channel-based async queue
  • Go: Synchronous dispatch on single goroutine
  • Node.js: Event loop serialization
  • Python: Event loop serialization

The only minor suggestion is to sync the "caller-ensures-idle" documentation across all SDKs for complete consistency.

Generated by SDK Consistency Review Agent for issue #791 ·

@stephentoub stephentoub force-pushed the serial-event-dispatch branch from 1657f9c to 7ac0b8a Compare March 12, 2026 02:03
@stephentoub stephentoub force-pushed the serial-event-dispatch branch from 7ac0b8a to 819a771 Compare March 12, 2026 02:05
@github-actions
Copy link
Contributor

Cross-SDK Consistency Review: Event Dispatch Serialization

I've reviewed PR #791 for cross-SDK consistency. This PR fixes concurrent event handler execution issues in .NET and Go, but there's an important inconsistency with the Python SDK that should be addressed.


✅ What This PR Fixes

.NET Changes:

  • Introduces a Channel(SessionEvent) with a single-reader consumer loop (ProcessEventsAsync) to serialize all event dispatch
  • Changes HandleBroadcastEventAsync from async void (fire-and-forget) to async Task (awaited inline)
  • Ensures broadcast work (tool/permission handlers) completes before user handlers see events
  • Adds comprehensive tests for serial dispatch and exception handling

Go Changes:

  • Removes go keyword from executeToolAndRespond and executePermissionAndRespond calls
  • Broadcast work now runs inline on the read loop goroutine (serialized)
  • Adds similar tests for serial dispatch and panic recovery

⚠️ Consistency Issue: Python SDK

The PR description states:

"Node.js and Python are single-threaded (event loop / asyncio) and do not have these issues."

This is only partially correct. While Node.js is truly serialized due to its single-threaded event loop, Python has the same concurrency issue that Go had before this PR.

Evidence from Python SDK (python/copilot/session.py:292-296, 309-311):

# In _handle_broadcast_event:
asyncio.ensure_future(
    self._execute_tool_and_respond(
        request_id, tool_name, tool_call_id, arguments, handler
    )
)

# And for permissions:
asyncio.ensure_future(
    self._execute_permission_and_respond(request_id, permission_request, perm_handler)
)

asyncio.ensure_future() schedules coroutines to run concurrently - it's fire-and-forget task scheduling, exactly like Go's go keyword was before this PR. This means:

  1. Multiple tool handlers can execute concurrently if multiple external_tool.requested events arrive
  2. Permission handlers can run concurrently with tool handlers
  3. User event handlers see events before broadcast work completes (same race condition .NET/Go had)
  4. No ordering guarantee for completion of broadcast work

📊 Current State Comparison

SDK User Event Handlers Broadcast Handlers (tools/permissions) Serial Guarantee
Node.js ✅ Serial (sync loop) ✅ Serial (fire-and-forget async, but event loop serializes) ✅ Yes
.NET (after PR) ✅ Serial (Channel consumer) ✅ Serial (awaited inline by consumer) ✅ Yes
Go (after PR) ✅ Serial (event channel + processEvents) ✅ Serial (inline execution) ✅ Yes
Python (current) ✅ Serial (sync loop) Concurrent (asyncio.ensure_future) No

💡 Recommendation

To maintain feature parity and consistent behavior across all SDKs, the Python SDK should adopt the same serialization approach:

Option 1: Inline await (like .NET post-PR)

async def _handle_broadcast_event(self, event: SessionEvent) -> None:
    if event.type == SessionEventType.EXTERNAL_TOOL_REQUESTED:
        # ... validation ...
        await self._execute_tool_and_respond(...)  # Await inline instead of ensure_future

Option 2: Event queue with single consumer (like Go post-PR)

  • Use asyncio.Queue to serialize all handler dispatch
  • Single consumer task processes events in FIFO order
  • Guarantees serial execution and ordering

Suggested approach: Option 1 is simpler and matches .NET's design. The Python SDK should:

  1. Change _handle_broadcast_event to async
  2. Replace asyncio.ensure_future() with await
  3. Make _dispatch_event async and await _handle_broadcast_event
  4. Add tests similar to .NET/Go for serial dispatch validation

🎯 Summary

  • This PR correctly fixes .NET and Go
  • Node.js doesn't need changes (inherently serial) ✅
  • Python needs the same fix to maintain consistency ⚠️

The serialization guarantee is an important semantic property that users will rely on - having different behavior in Python creates a cross-platform consistency issue and potential bugs when code is ported between SDKs.

Would you like me to create a follow-up issue to track the Python serialization work, or should it be addressed in this PR?

Generated by SDK Consistency Review Agent for issue #791 ·

@stephentoub
Copy link
Collaborator Author

Re: Cross-SDK Consistency Review

This response was generated by GitHub Copilot CLI.

The consistency review above analyzed a stale version of the PR. The current implementation fires broadcast handlers as fire-and-forget in all four SDKs, which is the correct design. The review's recommendation to \�wait\ broadcast handlers in Python would reintroduce the deadlock/blocking problems this PR fixes.

Actual state after this PR

SDK User handlers Broadcast handlers
Node.js Serial (sync loop) Fire-and-forget (\�oid this._executeToolAndRespond())
Python Serial (sync loop) Fire-and-forget (\�syncio.ensure_future())
Go Serial (channel consumer) Fire-and-forget (\go s.handleBroadcastEvent())
.NET Serial (channel consumer) Fire-and-forget (_ = HandleBroadcastEventAsync())

All four SDKs are consistent. Broadcast handlers must run concurrently/fire-and-forget because:

  1. Deadlock prevention — in Go, the read loop goroutine calls \dispatchEvent. If a broadcast handler issues an RPC request (e.g., \HandlePendingToolCall), the response can only be read by the read loop — but it's blocked waiting for the handler. Inline execution deadlocks.

  2. Multi-client scenarios — in multi-client tests, one client's permission handler intentionally never completes (\select {}\ in Go,
    ew TaskCompletionSource().Task\ in .NET). If broadcast handlers block event delivery, the second client never sees events.

The review's table incorrectly shows .NET and Go as having serial/inline broadcast handlers. That was an earlier iteration that was reverted for exactly these reasons.

@stephentoub stephentoub force-pushed the serial-event-dispatch branch from 819a771 to 4d1d57c Compare March 12, 2026 02:22
@github-actions
Copy link
Contributor

✅ Cross-SDK Consistency Review: APPROVED

This PR successfully brings .NET and Go event dispatch in line with Node.js and Python. No consistency issues found.

Core Achievement

All four SDKs now guarantee:

  • Serial event handler execution (handlers never run concurrently)
  • FIFO ordering (events delivered in arrival order)
  • Fire-and-forget broadcast handlers (tool calls, permission requests)
  • Deadlock prevention (broadcast handlers can issue RPCs safely)
  • Non-blocking transport (reader never waits for user code)

Implementation Comparison

SDK User Event Handlers Broadcast Handlers
Node.js Inline sync (event loop serializes) void fire-and-forget
Python Inline sync (with lock) asyncio.ensure_future
Go (this PR) Channel → goroutine consumer go goroutine
.NET (this PR) Channel → task consumer _ = fire-and-forget task

Language-Appropriate Adaptations

The following differences are idiomatic and appropriate for each language:

  1. Error Logging:

    • Node.js: Silent catch blocks
    • Python: print() to stderr
    • Go: fmt.Printf() to stdout
    • .NET: ILogger structured logging ✅
  2. Channel Sizing:

    • Go: Buffered (128 capacity, prevents unbounded growth)
    • .NET: Unbounded (simpler, relies on CLI backpressure)

Test Coverage

Both .NET and Go added comprehensive tests:

  • ✅ Serial dispatch verification
  • ✅ Handler exception/panic recovery
  • ✅ Deadlock prevention
  • ✅ FIFO ordering

Public API Impact

None. All changes are internal implementation details. The public API surface remains consistent across all four SDKs.


Recommendation: Approve and merge. This PR successfully maintains cross-SDK consistency while implementing critical fixes for .NET and Go concurrency issues.

Generated by SDK Consistency Review Agent for issue #791 ·

@stephentoub stephentoub force-pushed the serial-event-dispatch branch from 4d1d57c to 74f0a0d Compare March 12, 2026 03:02
In .NET, StreamJsonRpc dispatches notifications concurrently on the
thread pool. The old code invoked user event handlers inline from
DispatchEvent, which meant handlers could run concurrently and out of
order.

In Go, the JSON-RPC read loop is single-threaded, so user handlers were
already serialized. However, broadcast handlers (tool calls, permission
requests) ran inline on the read loop, which deadlocked when a handler
issued an RPC request back through the same connection.

This PR decouples user handler dispatch from the transport by routing
events through a channel (Go) / Channel<T> (.NET). A single consumer
goroutine/task drains the channel and invokes user handlers serially, in
FIFO order. This matches the guarantees provided by the Node.js and
Python SDKs (which get natural serialization from their single-threaded
event loops) while fitting Go's and .NET's multi-threaded runtimes.

Broadcast handlers (tool calls, permission requests) are fired as
fire-and-forget directly from the dispatch entry point, outside the
channel, so a stalled handler cannot block event delivery. This matches
the existing Node.js (void this._executeToolAndRespond()) and Python
(asyncio.ensure_future()) behavior.

Go changes:
- Add eventCh channel to Session; start processEvents consumer goroutine
- dispatchEvent enqueues to channel and fires broadcast handler goroutine
- Close channel on Disconnect to stop the consumer
- Update unit tests and E2E tests for async delivery

.NET changes:
- Add unbounded Channel<SessionEvent> to CopilotSession; start
  ProcessEventsAsync consumer task in constructor
- DispatchEvent enqueues to channel and fires broadcast handler task
- Complete channel on DisposeAsync
- Per-handler error catching via ImmutableArray iteration
- Cache handler array snapshot to avoid repeated allocation
- Inline broadcast error handling into HandleBroadcastEventAsync
- Update Should_Receive_Session_Events test to await async delivery
- Add Handler_Exception_Does_Not_Halt_Event_Delivery test
- Add DisposeAsync_From_Handler_Does_Not_Deadlock test

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@stephentoub stephentoub force-pushed the serial-event-dispatch branch from 74f0a0d to e320690 Compare March 12, 2026 03:14
@stephentoub stephentoub marked this pull request as ready for review March 12, 2026 03:20
@stephentoub stephentoub enabled auto-merge March 12, 2026 03:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants