Feat/server side wakeupword #66

Merged: 74th merged 12 commits into develop from feat/server-side-wakeupword on May 10, 2026

74th (Owner) commented May 9, 2026

This pull request adds server-side wake word detection to the StackChan system. It adds new protocol messages, updates the state machine and display logic to handle a new ServerWwd state, and adds environment variables and documentation for configuring server-side wake word detection with Whisper Server. With these changes, StackChan can stream microphone audio specifically for wake word detection, and the firmware and server can negotiate and handle this mode.

Server-side Wake Word Detection Support

  • Added new message kind MESSAGE_KIND_SERVER_WWD_PCM for dedicated PCM uplink streams used in server-side wake word detection (firmware/lib/generated_protobuf/websocket-message.pb.h, docs/websocket_protocols_ja.md).
  • Introduced new state ServerWwd in the state machine, including protocol, display, and documentation updates to handle this state and its transitions (firmware/include/state_machine.hpp, firmware/lib/generated_protobuf/websocket-message.pb.h, firmware/src/display.cpp, AGENTS.md, docs/websocket_protocols_ja.md).
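
As a rough illustration of how a ServerWwd state might sit in a state machine, the sketch below models a small transition table. Only the name ServerWwd comes from this pull request; the other states, the events, and the transitions themselves are assumptions made for illustration and are not taken from firmware/include/state_machine.hpp.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical state set. Only ServerWwd is named in this pull request.
enum class State : std::uint8_t { Idle, ServerWwd, Listening };

// Hypothetical events that drive the transitions in this sketch.
enum class Event : std::uint8_t {
  ServerWwdStart,    // server-side wake word mode was negotiated
  WakeWordDetected,  // server reported a wake word hit
  ListeningEnded,    // the speech session finished
  Disconnected       // the WebSocket connection was lost
};

// Returns the next state, or std::nullopt when the event is not
// valid in the current state (the caller keeps the current state).
inline std::optional<State> nextState(State s, Event e) {
  switch (s) {
    case State::Idle:
      if (e == Event::ServerWwdStart) return State::ServerWwd;
      break;
    case State::ServerWwd:
      if (e == Event::WakeWordDetected) return State::Listening;
      if (e == Event::Disconnected) return State::Idle;
      break;
    case State::Listening:
      if (e == Event::ListeningEnded) return State::ServerWwd;
      if (e == Event::Disconnected) return State::Idle;
      break;
  }
  return std::nullopt;
}
```

Returning an empty optional for unexpected events keeps invalid transitions explicit, so a caller can log them rather than silently changing state.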

Firmware Enhancements

  • Updated Listening class to support two session modes (Speech and WakeWord), with new methods for starting/stopping wake word streaming and correct handling of silence auto-stop logic (firmware/include/listening.hpp, firmware/src/listening.cpp).
  • Added shouldUseServerWakeWord utility and updated metadata handling to support negotiation of wake word detection capability between server and client (firmware/include/metadata.hpp).
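
The two session modes and the capability check could look roughly like the following. This is a minimal sketch, not the code in listening.hpp or metadata.hpp: the class name ListeningSketch, the struct WakeWordCapability and its fields, and the function shouldUseServerWakeWordSketch are all hypothetical. It also assumes that silence auto-stop applies only to Speech sessions and that server-side detection is used only when both sides opt in; the pull request does not spell out either rule.

```cpp
#include <cstddef>

// Session modes named in the pull request description.
enum class SessionMode { Speech, WakeWord };

// Hypothetical capability flags exchanged as metadata.
struct WakeWordCapability {
  bool serverSupports;  // assumed: advertised by the server
  bool clientRequests;  // assumed: configured on the device
};

// Assumed policy: use server-side detection only when both sides agree.
inline bool shouldUseServerWakeWordSketch(const WakeWordCapability& c) {
  return c.serverSupports && c.clientRequests;
}

class ListeningSketch {
 public:
  explicit ListeningSketch(std::size_t silenceLimitFrames)
      : silenceLimit_(silenceLimitFrames) {}

  void startSpeech() { start(SessionMode::Speech); }
  void startWakeWord() { start(SessionMode::WakeWord); }
  void stop() { active_ = false; }

  // Feeds one audio frame. Returns true while the session stays active.
  bool onFrame(bool isSilent) {
    if (!active_) return false;
    // Assumption: wake word streaming never auto-stops on silence.
    if (mode_ == SessionMode::WakeWord) return true;
    silentFrames_ = isSilent ? silentFrames_ + 1 : 0;
    if (silentFrames_ >= silenceLimit_) active_ = false;
    return active_;
  }

 private:
  void start(SessionMode m) {
    mode_ = m;
    active_ = true;
    silentFrames_ = 0;
  }

  SessionMode mode_ = SessionMode::Speech;
  bool active_ = false;
  std::size_t silentFrames_ = 0;
  std::size_t silenceLimit_;
};
```

Keeping the mode inside the session object means the silence counter is checked in one place, so a wake word stream cannot be ended by the speech timeout by accident.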

Configuration and Documentation Updates

  • Added new environment variables for configuring both general and wake word-specific Whisper Server endpoints, models, languages, and prompts in .env.template and documented their use in docs/server_ja.md.
  • Expanded protocol and agent documentation to describe the new message types, states, metadata exchange, and the server-side wake word detection flow (docs/websocket_protocols_ja.md, AGENTS.md).

Together, these changes let StackChan use either device-side or server-side wake word detection, so wake word detection can be centralized on the server when that is preferred.

74th added 12 commits May 9, 2026 14:27
- Introduced `ServerWwdPcm` message kind for server-side wakeword PCM stream.
- Updated WebSocket message protocol to include `MESSAGE_KIND_SERVER_WWD_PCM`.
- Implemented `WhisperServerWakeWordDetector` for handling server-side wakeword detection.
- Refactored `WsProxy` to manage server-side wakeword PCM messages.
- Removed deprecated server-side wakeword detection API endpoint.
- Enhanced documentation for new wakeword detection flow and message types.
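
The routing described in the commit message above (a proxy that separates wake word PCM from ordinary speech PCM by message kind) can be sketched as follows. The types here are hypothetical stand-ins: MessageKind mirrors only the idea of MESSAGE_KIND_SERVER_WWD_PCM, and ProxySketch is not the actual WsProxy, whose implementation and language are not shown in this pull request.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical message kinds. ServerWwdPcm stands in for
// MESSAGE_KIND_SERVER_WWD_PCM; SpeechPcm is an assumed counterpart.
enum class MessageKind : std::uint8_t { SpeechPcm, ServerWwdPcm };

struct Message {
  MessageKind kind;
  std::vector<std::int16_t> pcm;  // 16-bit PCM samples (assumed format)
};

// Counts samples per destination instead of calling real detectors,
// so the routing decision can be observed in isolation.
struct ProxySketch {
  std::size_t speechSamples = 0;
  std::size_t wakeWordSamples = 0;

  void handle(const Message& m) {
    switch (m.kind) {
      case MessageKind::SpeechPcm:
        speechSamples += m.pcm.size();  // would go to speech recognition
        break;
      case MessageKind::ServerWwdPcm:
        wakeWordSamples += m.pcm.size();  // would go to the wake word detector
        break;
    }
  }
};
```

Tagging the stream with its own message kind lets the receiver route audio without tracking which mode the device believes it is in.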
74th marked this pull request as ready for review May 10, 2026 06:45
74th changed the base branch from main to develop May 10, 2026 06:52
74th merged commit caf9001 into develop May 10, 2026
3 checks passed
74th deleted the feat/server-side-wakeupword branch May 10, 2026 07:07