
@ajbmachon
Problem

When launching multiple subagents simultaneously (a common pattern in PAI workflows), each agent sends a voice notification as it starts. Since these notifications arrive at the VoiceServer nearly simultaneously, multiple afplay processes spawn concurrently, resulting in garbled, overlapping audio output.

Solution

This PR adds a server-side FIFO message queue that serializes voice playback:

  1. Sequence numbers captured before any async operations guarantee arrival order
  2. Single-threaded processing ensures only one message plays at a time
  3. Insertion sort maintains order when messages arrive during playback
  4. Queue overflow protection (max 100 messages) prevents memory exhaustion
  5. try/finally guard ensures processing flag resets even on errors
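The five points above fit together in a small amount of state. Here is a minimal sketch of the mechanism; the `QueuedMessage` name and the 100-message cap follow the PR description, while `play()`, the field names, and the timing are illustrative stand-ins rather than the actual implementation:

```typescript
interface QueuedMessage {
  seq: number;  // captured synchronously, before any await
  text: string;
}

const MAX_QUEUE_SIZE = 100;
const queue: QueuedMessage[] = [];
const played: string[] = [];
let nextSeq = 0;
let processing = false;

// Placeholder for real playback (e.g. spawning afplay and awaiting exit).
async function play(text: string): Promise<void> {
  await new Promise((resolve) => setTimeout(resolve, 1)); // simulate audio duration
  played.push(text);
}

function enqueue(text: string): number | null {
  if (queue.length >= MAX_QUEUE_SIZE) return null; // overflow protection
  // Sequence number is assigned before any async work, fixing arrival order.
  const msg: QueuedMessage = { seq: nextSeq++, text };
  // Insertion sort: walk back from the end so the queue stays ordered by seq
  // even if a message arrives while an earlier one is still playing.
  let i = queue.length;
  while (i > 0 && queue[i - 1].seq > msg.seq) i--;
  queue.splice(i, 0, msg);
  void processQueue();
  return msg.seq;
}

async function processQueue(): Promise<void> {
  if (processing) return; // single drain loop: only one message plays at a time
  processing = true;
  try {
    while (queue.length > 0) {
      const msg = queue.shift()!;
      await play(msg.text);
    }
  } finally {
    processing = false; // reset even if playback throws
  }
}
```

The key detail is that `nextSeq++` runs before any `await`: two near-simultaneous requests may interleave once async work begins, but their sequence numbers already reflect the order in which the handlers fired.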

Changes

  • Added QueuedMessage interface and queue state management
  • Added drainQueueInOrder() and processQueue() functions
  • Updated /notify and /pai endpoints to enqueue instead of play directly
  • API response now includes queue_position and queue_depth
  • /health endpoint shows queue status
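The new response fields can be sketched as a plain shape. The `queue_position` and `queue_depth` names come from the PR; the `status` value, the 0-based position, and the helper name are assumptions for illustration:

```typescript
interface NotifyResponse {
  status: string;
  queue_position: number; // position of this message in the queue (assumed 0-based)
  queue_depth: number;    // total messages currently queued
}

// Hypothetical helper building the enqueue response for /notify and /pai.
function buildQueueResponse(position: number, depth: number): NotifyResponse {
  return { status: "queued", queue_position: position, queue_depth: depth };
}
```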

Testing

Tested with concurrent curl requests: messages now play sequentially, in arrival order.
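A concurrent test along those lines might look like this; the port and payload shape are illustrative, not the actual VoiceServer configuration:

```shell
# Fire several notifications at once; the server should play them
# one at a time, in the order the requests arrived.
for i in 1 2 3; do
  curl -s -m 2 -X POST "http://localhost:8888/notify" \
    -H 'Content-Type: application/json' \
    -d "{\"message\": \"agent $i starting\"}" &
done
wait  # all background requests are reaped before the script exits
```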


Thank you for PAI - it's been incredibly useful. Happy to adjust anything based on your feedback.

ajbmachon and others added 2 commits January 25, 2026 11:49
When multiple subagents start tasks simultaneously, they all send voice
notifications at once, resulting in garbled overlapping audio output.

This adds a server-side FIFO message queue that:
- Captures sequence numbers before async operations to guarantee ordering
- Processes messages one at a time (sequential playback)
- Uses insertion sort to maintain order
- Includes queue overflow protection (max 100 messages)
- Uses try/finally to ensure processing flag is always reset
- Exposes queue status in /health endpoint

API response now includes queue_position and queue_depth for visibility.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Prevents a memory leak by removing stale rate-limit entries every 5 minutes.
Without this, the requestCounts map grows indefinitely in long-running servers.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
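The cleanup in that second commit could be sketched as follows. The `requestCounts` name and the 5-minute cadence come from the commit message; the entry shape, window length, and timer wiring are assumptions:

```typescript
interface RateEntry {
  count: number;
  windowStart: number; // epoch ms when this client's window began
}

const requestCounts = new Map<string, RateEntry>();
const WINDOW_MS = 60_000;               // assumed rate-limit window
const CLEANUP_INTERVAL_MS = 5 * 60_000; // "every 5 minutes" per the commit

// Removes entries whose window has fully expired; returns how many were dropped.
function cleanupStaleEntries(now: number = Date.now()): number {
  let removed = 0;
  for (const [ip, entry] of requestCounts) {
    if (now - entry.windowStart > WINDOW_MS) {
      requestCounts.delete(ip); // safe: Map iteration tolerates deletion
      removed++;
    }
  }
  return removed;
}

// In the server, this would run on a timer:
// setInterval(cleanupStaleEntries, CLEANUP_INTERVAL_MS);
```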