@isabelle-cedar
Contributor

No description provided.

@isabelle-cedar isabelle-cedar marked this pull request as draft August 23, 2025 21:11
Contributor

@greptile-apps greptile-apps bot left a comment

Greptile Summary

This PR implements voice streaming functionality for Cedar OS, specifically targeting Mastra backend providers. The implementation adds the capability to process audio responses in real-time as they arrive from the backend, rather than waiting for complete responses.

The core changes introduce a new voiceStreamLLM method to the agent connection architecture, following the established streaming pattern used for text responses. The implementation adds new types (VoiceStreamEvent, VoiceStreamHandler) to handle various voice-specific events including transcription updates, audio chunks, and structured objects. The voice slice is enhanced with a new stream boolean configuration setting that determines whether to use streaming or traditional voice processing.

Key architectural additions include:

  • Provider abstraction: The ProviderImplementation interface gains an optional voiceStreamLLM method, maintaining backward compatibility
  • Event handling: Processing for each voice stream event type (transcription, audio, chunk, object, done, error)
  • Dual-path implementation: The agent connection slice detects whether the provider supports streaming and falls back to non-streaming voice processing when it does not
  • Mastra integration: Specific implementation for Mastra providers with proper URL construction, header management, and event transformation
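
As a rough illustration of the provider abstraction and dual-path behavior listed above, the sketch below models an optional voiceStreamLLM method with a fallback to a non-streaming call. Only the names VoiceStreamEvent, VoiceStreamHandler, ProviderImplementation, voiceStreamLLM, and the six event type values come from the summary. The event payload fields, VoiceParams, VoiceResponse, voiceLLM, and runVoice are assumptions made for the example, and the calls are synchronous for brevity where the real methods are presumably asynchronous.

```typescript
// Hypothetical event shapes: only the six `type` values come from the review summary.
type VoiceStreamEvent =
  | { type: 'transcription'; transcription: string }
  | { type: 'audio'; audioData: string; audioFormat?: string }
  | { type: 'chunk'; content: string }
  | { type: 'object'; object: unknown }
  | { type: 'done' }
  | { type: 'error'; error: Error };

type VoiceStreamHandler = (event: VoiceStreamEvent) => void;

// Hypothetical request/response shapes for the non-streaming path.
interface VoiceParams {
  audioData: string;
}
interface VoiceResponse {
  transcription?: string;
  content?: string;
}

interface ProviderImplementation {
  // Assumed name for the existing non-streaming voice call.
  voiceLLM(params: VoiceParams): VoiceResponse;
  // Optional, so providers without streaming support still satisfy the interface.
  voiceStreamLLM?(params: VoiceParams, handler: VoiceStreamHandler): void;
}

// Hypothetical dispatcher: prefer streaming, otherwise replay the full
// response through the same handler so callers see one event model.
function runVoice(
  provider: ProviderImplementation,
  params: VoiceParams,
  handler: VoiceStreamHandler,
): void {
  if (provider.voiceStreamLLM) {
    provider.voiceStreamLLM(params, handler);
    return;
  }
  const response = provider.voiceLLM(params);
  if (response.transcription) {
    handler({ type: 'transcription', transcription: response.transcription });
  }
  if (response.content) {
    handler({ type: 'chunk', content: response.content });
  }
  handler({ type: 'done' });
}
```

Routing the fallback through the same handler means the caller does not need to know which path the provider took.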

The implementation maintains consistency with the existing Cedar OS streaming architecture while extending it to support voice use cases. Helper functions were extracted in the Mastra provider to promote code reuse between streaming and non-streaming voice methods. The voice slice introduces a handled flag mechanism to prevent duplicate processing when responses contain multiple data types.

Confidence score: 4/5

  • This PR introduces complex streaming logic but follows established patterns and includes comprehensive error handling
  • Score reflects well-structured implementation with proper fallback mechanisms and backward compatibility
  • Pay close attention to the event handling logic in voiceStreamLLM and the dual-path processing in the voice slice

4 files reviewed, 3 comments

Comment on lines +544 to +549
if (response.audioData || response.audioUrl) {
  wrappedHandler({
    type: 'audio',
    audioData: response.audioData || response.audioUrl || '',
    audioFormat: response.audioFormat,
  });

logic: The audio data fallback uses audioData || audioUrl || '', which could produce an empty string as the audio data if both are undefined.
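
One possible way to address this is to resolve the payload once and skip the event when nothing is available, so no empty-string fallback is needed. The sketch below is an assumption-laden illustration: toAudioEvent and the AudioEvent and VoiceResponse shapes are hypothetical, and only the field names audioData, audioUrl, and audioFormat come from the quoted diff.

```typescript
// Hypothetical response shape; only the three field names appear in the quoted diff.
interface VoiceResponse {
  audioData?: string;
  audioUrl?: string;
  audioFormat?: string;
}

// Hypothetical event shape mirroring the object passed to wrappedHandler.
interface AudioEvent {
  type: 'audio';
  audioData: string;
  audioFormat?: string;
}

// Returns an audio event only when a non-empty payload exists, so the
// caller never forwards an empty string as audio data.
function toAudioEvent(response: VoiceResponse): AudioEvent | undefined {
  const payload = response.audioData || response.audioUrl;
  if (!payload) {
    return undefined;
  }
  return {
    type: 'audio',
    audioData: payload,
    audioFormat: response.audioFormat,
  };
}
```

A caller would then emit the event only when toAudioEvent returns a value, which keeps the guard and the payload selection in one place.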

try {
  const headers = createVoiceHeaders(config);
  const baseUrl = resolveVoiceEndpoint(params.voiceSettings, config);
  const streamUrl = `${baseUrl}/stream`;

logic: Appending '/stream' to baseUrl could create malformed URLs if baseUrl already ends with '/stream' or contains query parameters.
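
A minimal sketch of one way to handle both cases is to edit the path through the standard URL class instead of concatenating strings. buildStreamUrl is a hypothetical helper, not part of the PR, and it assumes baseUrl is an absolute URL; a relative endpoint would need a base argument passed to the URL constructor.

```typescript
// Hypothetical helper: appends a '/stream' path segment exactly once and
// leaves any query string untouched. Assumes `baseUrl` is absolute.
function buildStreamUrl(baseUrl: string): string {
  const url = new URL(baseUrl);
  // Drop trailing slashes so '/voice/' and '/voice' behave the same.
  const path = url.pathname.replace(/\/+$/, '');
  url.pathname = path.endsWith('/stream') ? path : `${path}/stream`;
  return url.toString();
}
```

Because only pathname is modified, a base such as https://example.com/voice?lang=en keeps its query parameters after the path segment rather than before it.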

Comment on lines +288 to +289
// Voice processing completed successfully (streaming or non-streaming)
get().setIsProcessing(false);

logic: Processing state is cleared after streaming completion, but error handling at line 296 also clears it. Consider moving the success case inside a try block to ensure consistent state management.
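
One way to make the state handling consistent is sketched below. It uses a finally block, which is a variant of the suggestion above rather than exactly what the comment proposes, so the flag is reset in one place on both success and failure. The VoiceSlice shape, createVoiceSlice, processVoice, and setVoiceError are hypothetical; only setIsProcessing appears in the quoted diff, and the call is synchronous for brevity.

```typescript
// Hypothetical slice shape; only setIsProcessing appears in the quoted diff.
interface VoiceSlice {
  isProcessing: boolean;
  voiceError: string | null;
  setIsProcessing(value: boolean): void;
  setVoiceError(message: string | null): void;
}

// Plain-object stand-in for the store, just enough to run the example.
function createVoiceSlice(): VoiceSlice {
  const slice: VoiceSlice = {
    isProcessing: false,
    voiceError: null,
    setIsProcessing(value) {
      slice.isProcessing = value;
    },
    setVoiceError(message) {
      slice.voiceError = message;
    },
  };
  return slice;
}

// Runs `send` (streaming or non-streaming) and clears the processing
// flag exactly once, whether `send` returns or throws.
function processVoice(slice: VoiceSlice, send: () => void): void {
  slice.setIsProcessing(true);
  slice.setVoiceError(null);
  try {
    send();
  } catch (err) {
    slice.setVoiceError(err instanceof Error ? err.message : String(err));
  } finally {
    slice.setIsProcessing(false);
  }
}
```

With an asynchronous send, the same structure applies with await inside the try block.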

2 participants