Merged
20 commits
a716f38
Merge pull request #64 from 74th/main
74th May 9, 2026
ac83a0a
feat: ServerSideWWD
74th May 9, 2026
930fdff
feat: Stop wakeword streaming on WebSocket disconnection
74th May 9, 2026
03d0bab
feat: Add server-side wakeword detection support
74th May 9, 2026
82c09a9
feat: Add language and prompt configuration for Whisper Server
74th May 9, 2026
f318d97
feat: Add WakeWordDetectionTimeout for improved error handling in wak…
74th May 9, 2026
40543d0
feat: Enhance display logic to track server wake word idle state
74th May 9, 2026
7235abf
feat: Update StateCmd handling for server-side wakeword detection
74th May 9, 2026
97ba831
feat: Update ServerWwd state representation to "Idle(Server-WWD)" in …
74th May 9, 2026
5dd2c34
feat: Refactor server wakeword detection methods for consistency and …
74th May 9, 2026
b0e7b70
feat: Implement ServerWwdController for server-side wakeword detectio…
74th May 9, 2026
2fc4c11
feat: Remove has_server_wake_word dependency and streamline auto dete…
74th May 9, 2026
a29aad2
feat: Remove has_pending_wakeword parameter and add suppress_restart …
74th May 9, 2026
caf9001
Merge pull request #66 from 74th/feat/server-side-wakeupword
74th May 10, 2026
8f7e40e
feat: Refactor ServerWwdController initialization and state managemen…
74th May 9, 2026
cc2c84a
Merge pull request #67 from 74th/feat/server-side-wakeupword
74th May 10, 2026
d0ad278
feat: Add workflows for version bump check and release management
74th May 10, 2026
eaa9537
Merge pull request #69 from 74th/feat/release-action
74th May 10, 2026
da6e0a7
bump version
74th May 10, 2026
35dbf5a
Merge pull request #71 from 74th/bump-version
74th May 10, 2026
10 changes: 10 additions & 0 deletions .env.template
@@ -21,6 +21,8 @@ STACKCHAN_GOOGLE_CLOUD_STT_LANGUAGE_CODE="ja-JP"
# STACKCHAN_USE_WHISPER_SERVER=1
# STACKCHAN_WHISPER_SERVER_URL="http://127.0.0.1:8080/inference"
# STACKCHAN_WHISPER_SERVER_MODEL=
# STACKCHAN_WHISPER_SERVER_PROMPT=
# STACKCHAN_WHISPER_SERVER_LANGUAGE="ja"

# -- Speech Synthesis --
# Google Cloud TTS
@@ -34,6 +36,14 @@ STACKCHAN_GOOGLE_CLOUD_TTS_VOICE_NAME="Despina"
STACKCHAN_VOICEVOX_URL="http://localhost:50021"
STACKCHAN_VOICEVOX_SPEAKER=1

# -- Server-side Wakeup Word Detection --
# Whisper Server
# STACKCHAN_USE_WWD_WHISPER_SERVER=1
# STACKCHAN_WWD_WHISPER_SERVER_URL="http://127.0.0.1:8080/inference"
# STACKCHAN_WWD_WHISPER_SERVER_MODEL=
# STACKCHAN_WWD_WHISPER_SERVER_LANGUAGE="ja"
# STACKCHAN_WWD_WHISPER_SERVER_PROMPT="日本語で、スタックチャンという名前で、話しかけられるので、話しかけられたことを検出してください"

# -- Claude Agent SDK --
# using Google Cloud Vertex AI
CLAUDE_CODE_USE_VERTEX=1
75 changes: 75 additions & 0 deletions .github/workflows/check-version-bump-for-main-pr.yml
@@ -0,0 +1,75 @@
name: Check Version Bump For Main Or Test PR

on:
  pull_request:
    branches:
      - main
      - test
    types:
      - opened
      - synchronize
      - reopened
      - edited
      - ready_for_review

permissions:
  contents: read

jobs:
  check-version-bump:
    if: github.head_ref == 'develop'
    runs-on: ubuntu-latest

    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.13"

      - name: Compare pyproject.toml versions
        env:
          BASE_REF: ${{ github.base_ref }}
          HEAD_REF: ${{ github.head_ref }}
        run: |
          set -euo pipefail
          git fetch origin "${BASE_REF}" --depth=1
          python - <<'PY'
          import os
          import subprocess
          import tomllib
          from pathlib import Path

          def read_version(pyproject_text: str) -> str:
              data = tomllib.loads(pyproject_text)
              version = data.get("project", {}).get("version")
              if not version:
                  raise SystemExit("project.version not found in pyproject.toml")
              return version

          base_ref = os.environ["BASE_REF"]
          head_ref = os.environ["HEAD_REF"]
          base_pyproject = subprocess.check_output(
              ["git", "show", f"origin/{base_ref}:pyproject.toml"],
              text=True,
          )
          head_pyproject = Path("pyproject.toml").read_text()

          base_version = read_version(base_pyproject)
          head_version = read_version(head_pyproject)

          print(f"Base branch version ({base_ref}): {base_version}")
          print(f"PR branch version ({head_ref}): {head_version}")

          if base_version == head_version:
              print(
                  f"::error file=pyproject.toml,title=Version bump required::"
                  f"develop -> {base_ref} PR must update [project].version in pyproject.toml. "
                  f"{base_ref} is {base_version} and this PR is still {head_version}."
              )
              raise SystemExit(1)
          PY
145 changes: 145 additions & 0 deletions .github/workflows/release-on-main.yml
@@ -0,0 +1,145 @@
name: Release On Main Or Test

on:
  push:
    branches:
      - main
      - test

permissions:
  contents: write
  pull-requests: read

concurrency:
  group: release-${{ github.ref }}
  cancel-in-progress: false

jobs:
  release:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.13"

      - name: Read release version from pyproject.toml
        id: version
        run: |
          python - <<'PY'
          import os
          import tomllib
          from pathlib import Path

          data = tomllib.loads(Path("pyproject.toml").read_text())
          version = data["project"]["version"]

          with open(os.environ["GITHUB_OUTPUT"], "a", encoding="utf-8") as fh:
              print(f"tag={version}", file=fh)

          print(f"Release version: {version}")
          PY

      - name: Find previous version tag
        id: previous
        shell: bash
        run: |
          set -euo pipefail
          current_tag="${{ steps.version.outputs.tag }}"
          previous_tag="$(git tag --list '[0-9]*' --sort=-version:refname | grep -Fxv "${current_tag}" | head -n 1 || true)"
          echo "previous_tag=${previous_tag}" >> "$GITHUB_OUTPUT"
          if [ -n "${previous_tag}" ]; then
            echo "Previous tag: ${previous_tag}"
          else
            echo "Previous tag: none"
          fi

      - name: Create tag if needed
        uses: actions/github-script@v7
        env:
          TAG: ${{ steps.version.outputs.tag }}
          TARGET_SHA: ${{ github.sha }}
        with:
          script: |
            const { owner, repo } = context.repo;
            const tag = process.env.TAG;
            const targetSha = process.env.TARGET_SHA;

            try {
              await github.rest.git.getRef({
                owner,
                repo,
                ref: `tags/${tag}`,
              });
              core.notice(`Tag ${tag} already exists.`);
            } catch (error) {
              if (error.status !== 404) {
                throw error;
              }

              await github.rest.git.createRef({
                owner,
                repo,
                ref: `refs/tags/${tag}`,
                sha: targetSha,
              });
              core.notice(`Created tag ${tag} at ${targetSha}.`);
            }

      - name: Create GitHub Release if needed
        uses: actions/github-script@v7
        env:
          TAG: ${{ steps.version.outputs.tag }}
          PREVIOUS_TAG: ${{ steps.previous.outputs.previous_tag }}
          TARGET_SHA: ${{ github.sha }}
        with:
          script: |
            const { owner, repo } = context.repo;
            const tag = process.env.TAG;
            const previousTag = process.env.PREVIOUS_TAG;
            const targetSha = process.env.TARGET_SHA;

            try {
              const existing = await github.rest.repos.getReleaseByTag({
                owner,
                repo,
                tag,
              });
              core.notice(`Release for ${tag} already exists: ${existing.data.html_url}`);
              return;
            } catch (error) {
              if (error.status !== 404) {
                throw error;
              }
            }

            const notes = await github.request(
              "POST /repos/{owner}/{repo}/releases/generate-notes",
              {
                owner,
                repo,
                tag_name: tag,
                target_commitish: targetSha,
                ...(previousTag ? { previous_tag_name: previousTag } : {}),
              },
            );

            const release = await github.rest.repos.createRelease({
              owner,
              repo,
              tag_name: tag,
              target_commitish: targetSha,
              name: tag,
              body: notes.data.body,
              draft: false,
              prerelease: false,
              generate_release_notes: false,
            });

            core.notice(`Created release ${release.data.html_url}`);
7 changes: 4 additions & 3 deletions AGENTS.md
@@ -12,8 +12,8 @@

## Key points of state transitions

- Firmware states: `Idle`, `Listening`, `Thinking`, `Speaking`, `Disconnected`
- States the server can command via `StateCmd`: `Idle` / `Listening` / `Thinking` / `Speaking`
- Firmware states: `Idle`, `Listening`, `Thinking`, `Speaking`, `ServerWwd`, `Disconnected`
- States the server can command via `StateCmd`: `Idle` / `Listening` / `Thinking` / `Speaking` / `ServerWwd`
- `Disconnected` is a firmware-internal state, entered when the WebSocket disconnects
- A talk session starts when `WakeWordEvt` is received or when a wakeword is simulated through the REST API
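
The state list above can be sketched as a small enum, assuming the updated set that includes `ServerWwd`. This is only an illustration: `FirmwareState`, `SERVER_COMMANDABLE`, and `can_command` are hypothetical names, not identifiers from the repository.

```python
from enum import Enum


class FirmwareState(Enum):
    """Firmware states named in AGENTS.md (this enum itself is hypothetical)."""

    IDLE = "Idle"
    LISTENING = "Listening"
    THINKING = "Thinking"
    SPEAKING = "Speaking"
    SERVER_WWD = "ServerWwd"
    DISCONNECTED = "Disconnected"


# Disconnected is firmware-internal, so the server never requests it via StateCmd.
SERVER_COMMANDABLE = frozenset(FirmwareState) - {FirmwareState.DISCONNECTED}


def can_command(state: FirmwareState) -> bool:
    """Return True if the server may request this state through StateCmd."""
    return state in SERVER_COMMANDABLE
```

Keeping the commandable set derived from the enum means a newly added state is commandable by default unless it is explicitly excluded, which matches how `ServerWwd` was added here.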

@@ -75,6 +75,7 @@
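- Uses `websocket.client.host` as the StackChan identifier
- A reconnection from the same IP replaces the existing connection
- `listen()` commands `Listening`, then waits for the audio uplink to complete
- Commands `ServerWwd` while server-side wakeword detection is running
- `speak()` sends the TTS downlink, then waits for `SpeakDoneEvt`
- Exposes `move_servo()` / `wait_servo_complete()`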

@@ -106,7 +107,7 @@
- Processes `MoveX`, `MoveY`, and `Sleep` sequentially
- Emits `ServoDoneEvt` on completion
- `src/display.cpp`
- `Idle=dark gray`, `Listening=blue`, `Thinking=orange`, `Speaking=green`, `Disconnected=red`
- `Idle=dark gray`, `Listening=blue`, `Thinking=orange`, `Speaking=green`, `ServerWwd=Idle(Server-WWD)`, `Disconnected=red`

## Sample apps

26 changes: 26 additions & 0 deletions docs/server_ja.md
@@ -65,6 +65,8 @@ STACKCHAN_WHISPER_CLI_VAD_MODEL_PATH="/path/to/whisper.cpp/ggml-silero-v5.1.2.bi

Set `STACKCHAN_WHISPER_SERVER_URL` to the full URL of the Whisper Server inference endpoint.
If it is unset, `http://127.0.0.1:8080/inference` is used.
If `STACKCHAN_WHISPER_SERVER_LANGUAGE` is set, its value is included in each request as the `language` parameter. If it is unset or an empty string, `language` is not sent.
Likewise, if `STACKCHAN_WHISPER_SERVER_PROMPT` is set, it is sent as the `prompt` field in each whisper-server request.
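
The optional `language` and `prompt` handling described above can be sketched as follows. `build_whisper_form_fields` is a hypothetical helper, not the repository's function. Omitting an empty `prompt` is also an assumption, since the text only states the empty-string rule for `language`.

```python
import os
from collections.abc import Mapping


def build_whisper_form_fields(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Collect the optional form fields for one Whisper Server request."""
    fields: dict[str, str] = {}

    # Documented behaviour: unset or empty means `language` is not sent.
    language = env.get("STACKCHAN_WHISPER_SERVER_LANGUAGE", "")
    if language:
        fields["language"] = language

    # Assumption: an empty prompt is treated the same way as an unset one.
    prompt = env.get("STACKCHAN_WHISPER_SERVER_PROMPT", "")
    if prompt:
        fields["prompt"] = prompt

    return fields
```

Passing the environment as a mapping keeps the helper testable without touching the real process environment.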

#### Example: Configuring whisper.cpp's whisper-server

@@ -74,6 +76,8 @@ whisper.cpp/examples/server: https://github.com/ggml-org/whisper.cpp/tree/master
STACKCHAN_USE_WHISPER_SERVER=1
STACKCHAN_WHISPER_SERVER_URL="http://127.0.0.1:8080/inference"
STACKCHAN_WHISPER_SERVER_MODEL=
STACKCHAN_WHISPER_SERVER_LANGUAGE="ja"
STACKCHAN_WHISPER_SERVER_PROMPT=""
```

#### Example: Using [Lemonade](https://lemonade-server.ai/)
@@ -84,6 +88,28 @@ Lemonade: https://lemonade-server.ai/
STACKCHAN_USE_WHISPER_SERVER=1
STACKCHAN_WHISPER_SERVER_URL=http://localhost:13305/api/v1/audio/transcriptions
STACKCHAN_WHISPER_SERVER_MODEL=Whisper-Large-v3-Turbo
STACKCHAN_WHISPER_SERVER_LANGUAGE="ja"
STACKCHAN_WHISPER_SERVER_PROMPT=""
```

### (Optional) Configuring Whisper Server for server-side wakeword detection

To enable server-side wakeword detection, set the following:

- `STACKCHAN_USE_WWD_WHISPER_SERVER`: `1`
- `STACKCHAN_WWD_WHISPER_SERVER_URL`: inference endpoint URL of the Whisper Server dedicated to wakeword detection
- `STACKCHAN_WWD_WHISPER_SERVER_MODEL`: model name used only for wakeword detection
- `STACKCHAN_WWD_WHISPER_SERVER_LANGUAGE`: `language` passed to wakeword-detection Whisper Server requests
- `STACKCHAN_WWD_WHISPER_SERVER_PROMPT`: `prompt` passed to wakeword-detection Whisper Server requests

These settings are separate from `STACKCHAN_WHISPER_SERVER_URL` / `STACKCHAN_WHISPER_SERVER_MODEL`, which are used for regular speech recognition.

```
STACKCHAN_USE_WWD_WHISPER_SERVER=1
STACKCHAN_WWD_WHISPER_SERVER_URL="http://127.0.0.1:8080/inference"
STACKCHAN_WWD_WHISPER_SERVER_MODEL=
STACKCHAN_WWD_WHISPER_SERVER_LANGUAGE="ja"
STACKCHAN_WWD_WHISPER_SERVER_PROMPT="日本語で、スタックチャンという名前で、話しかけられるので、話しかけられたことを検出してください"
```
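
Because the wakeword settings are independent of the speech-recognition ones, one way to picture the split is a single loader parameterised by variable prefix. This is a sketch only: `WhisperServerConfig` and `load_whisper_config` are hypothetical names, and no fallback URL is applied here because the default for the wakeword server is not stated.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class WhisperServerConfig:
    """Hypothetical container for one Whisper Server endpoint's settings."""

    enabled: bool
    url: str
    model: str
    language: str
    prompt: str


def load_whisper_config(
    use_key: str, prefix: str, env: Mapping[str, str] = os.environ
) -> WhisperServerConfig:
    """Read one set of settings, e.g. prefix 'STACKCHAN_WWD_WHISPER_SERVER'."""
    return WhisperServerConfig(
        enabled=env.get(use_key, "") == "1",
        url=env.get(f"{prefix}_URL", ""),  # no default applied in this sketch
        model=env.get(f"{prefix}_MODEL", ""),
        language=env.get(f"{prefix}_LANGUAGE", ""),
        prompt=env.get(f"{prefix}_PROMPT", ""),
    )
```

For example, `load_whisper_config("STACKCHAN_USE_WHISPER_SERVER", "STACKCHAN_WHISPER_SERVER")` and `load_whisper_config("STACKCHAN_USE_WWD_WHISPER_SERVER", "STACKCHAN_WWD_WHISPER_SERVER")` would yield two independent configurations from the same environment.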

## Speech synthesis settings