Opt in voice mode support with /multimodal endpoint by pranavjoshi001 · Pull Request #445 · microsoft/BotFramework-DirectLineJS

pranavjoshi001 · 2025-11-27T11:39:23Z

Description

This change adds voice mode support to DirectLineJS with a client opt-in mechanism via the enableVoiceMode option. When enabled, it allows audio streaming through WebSocket connections using the /stream/multimodal endpoint.

Background

DirectLineJS currently sends all outgoing activities over HTTP POST rather than WebSocket because of a limitation on the ABS (Azure Bot Service) side.
ABS does not process incoming WebSocket traffic; it uses the WebSocket only for server-to-client push.
Voice traffic cannot be sent over HTTP POST, so it must go through the WebSocket.
This PR introduces an opt-in voice mode that connects to the /stream/multimodal endpoint and, when enabled, routes all traffic through the WebSocket.
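
As a rough illustration of the routing described above, the decision can be sketched as a single pure function. This is not the code in src/directLine.ts: the names RoutingPlan and planRouting are hypothetical, and only the two stream paths and the two transports are taken from this description.

```typescript
// Hypothetical sketch; planRouting and RoutingPlan are illustrative names,
// not identifiers from DirectLineJS.
type Transport = "websocket" | "http-post";

interface RoutingPlan {
  // Path the client connects to for the streaming connection.
  streamPath: "/stream" | "/stream/multimodal";
  // Transport used for activities sent from the client to the service.
  outgoingTransport: Transport;
}

function planRouting(voiceModeEnabled: boolean): RoutingPlan {
  // Voice mode: multimodal endpoint, everything goes over the WebSocket.
  // Standard mode: existing endpoint, outgoing activities use HTTP POST.
  return voiceModeEnabled
    ? { streamPath: "/stream/multimodal", outgoingTransport: "websocket" }
    : { streamPath: "/stream", outgoingTransport: "http-post" };
}
```

Keeping the decision in one place makes it easy to verify that standard mode is untouched when voice mode is off.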

Changes in this PR

Added enableVoiceMode option:
- true → Enables voice mode
- false → Disables voice mode
- undefined → Auto-detects: voice mode is enabled only when running in an iframe that has microphone permission
Enhanced stream URL:
- Voice mode → /stream/multimodal
- Standard mode → /stream
Modified activity routing:
- Voice mode → Sends all activities (text + voice) via WebSocket
- Standard mode → Uses HTTP POST
Added server capabilities handling:
- Parses agent.capabilities event with modalities object to detect audio support
Added new public methods:
- getIsVoiceModeEnabled()
- getVoiceConfiguration()
- addEventListener()
- removeEventListener()
Added test coverage:
- Explicit enableVoiceMode: true/false
- Auto-detect in iframe
- WebSocket vs HTTP routing verification
- Reconnect behavior
- 403 retry handling
- agent.capabilities event handling
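
For reviewers unfamiliar with the capabilities handling, here is a minimal sketch of how audio support might be detected from an agent.capabilities event. The payload shape is an assumption: this description only says the event carries a modalities object, so the location of that object (activity.value.modalities) and the audio key are guesses, and detectAudioSupport is a hypothetical helper rather than a function in this PR.

```typescript
// Hypothetical sketch; the payload layout below is ASSUMED, not confirmed by the PR.
interface ActivityLike {
  type: string;
  name?: string;
  value?: unknown;
}

function isRecord(x: unknown): x is Record<string, unknown> {
  return typeof x === "object" && x !== null;
}

// Returns undefined for activities that are not agent.capabilities events,
// otherwise true/false depending on whether an audio modality is present.
function detectAudioSupport(activity: ActivityLike): boolean | undefined {
  if (activity.type !== "event" || activity.name !== "agent.capabilities") {
    return undefined;
  }
  const value = activity.value;
  if (!isRecord(value)) {
    return false;
  }
  // Assumption: modalities lives directly under the event value.
  const modalities = value["modalities"];
  if (!isRecord(modalities)) {
    return false;
  }
  // Assumption: a truthy "audio" entry signals audio support.
  return Boolean(modalities["audio"]);
}
```

Returning undefined for unrelated activities lets a caller distinguish "no capabilities announced yet" from "capabilities announced without audio".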

Backward Compatibility

No breaking changes:
- Default behavior (enableVoiceMode: undefined in non-iframe context) maintains existing HTTP POST flow
Opt-in only:
- Voice mode must be explicitly enabled or auto-detected in iframe with microphone permission
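
The tri-state behavior of enableVoiceMode can be summarized with a small sketch. resolveVoiceMode and Environment are hypothetical names used only for illustration, and the environment is passed in as plain data because this description does not say how the iframe and microphone checks are actually performed.

```typescript
// Hypothetical sketch; not the implementation in src/directLine.ts.
interface Environment {
  // True when the page is running inside an iframe.
  inIframe: boolean;
  // True when the iframe has been granted microphone permission.
  microphoneAllowed: boolean;
}

function resolveVoiceMode(
  enableVoiceMode: boolean | undefined,
  env: Environment
): boolean {
  // An explicit true or false always wins.
  if (enableVoiceMode !== undefined) {
    return enableVoiceMode;
  }
  // undefined: auto-detect, on only in an iframe with microphone permission.
  return env.inIframe && env.microphoneAllowed;
}
```

With this shape, an existing caller that never sets the option and does not run in an iframe resolves to false and keeps the HTTP POST flow.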

src/directLine.test.ts

src/directLine.ts

compulim

Commented.

post voice traffic only to socket

f7a7d21