feat/kani-tts-2 support#12

Open
soyboyscout wants to merge 2 commits into main from claude/add-kani-tts-support-QsdcU

@soyboyscout soyboyscout commented Feb 16, 2026

Summary by cubic

Add Kani TTS 2 as a third TTS engine with voice cloning and HuggingFace auto-download. Adds temperature/top_p/repetition_penalty controls and updates the MCP tool schema and validations.

  • New Features

    • New “kani” engine with speaker embedding voice cloning and GPU-friendly HF model auto-download.
    • Python runner (scripts/kani_runner.py) with automatic pip dependency install on first use.
    • New Kani options in tool schema: temperature, top_p, repetition_penalty, model; input validation added.
    • New env config (.env.example): KANI_MODEL, KANI_REFERENCE_AUDIO, KANI_MAX_CHARACTERS, KANI_OUTPUT_DIR, KANI_TEMPERATURE, KANI_TOP_P, KANI_REPETITION_PENALTY, KANI_LANGUAGE_TAG.
    • Tests added for engine selection, capabilities, and option validation.
  • Migration

    • Set TTS_ENGINE=kani to enable.
    • Ensure Python and pip are available on PATH (or set PYTHON_PATH). First run installs deps and downloads the model.
    • Optional manual setup: pip install -r scripts/requirements-kani.txt.
    • Provide 3–30 seconds of reference audio via KANI_REFERENCE_AUDIO or tool input for voice cloning; adjust temperature/top_p/repetition_penalty as needed.

Written for commit d31b169. Summary will update on new commits.


EntelligenceAI PR Summary

Version bump from 0.1.0 to 0.3.0 with no dependency or functional changes.

  • Updated top-level version field in package-lock.json
  • Updated package version in the packages section
  • No dependency modifications or other changes included
  • Pure version release update

Summary by CodeRabbit

Release Notes

  • New Features

    • Integrated Kani TTS 2 as a new text-to-speech engine option with advanced synthesis controls including temperature, top_p, and repetition_penalty parameters for fine-tuning audio generation.
    • Added voice cloning capabilities for Kani TTS 2 using reference audio files.
  • Tests

    • Added comprehensive test coverage for Kani engine selection, configuration validation, and voice cloning functionality.

Integrate Kani TTS 2 (nineninesix/kani-tts-2-en) as a new engine alongside
Chatterbox and Kokoro. Kani is a 400M-parameter LLM-based TTS model with
voice cloning via speaker embeddings and HuggingFace auto-download support.

New files:
- src/core/kani.engine.ts: TypeScript engine implementing ITTSEngine
- scripts/kani_runner.py: Python CLI runner with voice cloning support
- scripts/requirements-kani.txt: Python dependencies

Modified files:
- tts-constants.ts: Add KANI_DEFAULTS and KANI_LIMITS
- tts-engine.interface.ts: Add temperature, top_p, repetition_penalty, model to TTSOptions
- tts-service.factory.ts: Register "kani" engine type
- tools.ts: Add Kani params to MCP tool schema and pass-through
- .env.example: Add Kani configuration section
- tests: Add engine selection, capabilities, and validation tests

Set TTS_ENGINE=kani to use. Models auto-download from HuggingFace on first run.

https://claude.ai/code/session_01KNVsaVefDTsFt8Do5NeCu4
Copilot AI review requested due to automatic review settings February 16, 2026 04:16
@qodo-code-review

Review Summary by Qodo

Add Kani TTS 2 engine support with voice cloning and HuggingFace integration

✨ Enhancement

Walkthroughs

Description
• Add Kani TTS 2 as third TTS engine option with voice cloning support
• Implement KaniEngine class with Python runner for HuggingFace model integration
• Extend TTSOptions interface with temperature, top_p, repetition_penalty, model parameters
• Register Kani engine in factory and update MCP tool schema with new parameters
• Add comprehensive validation tests for Kani-specific options and capabilities
Diagram
flowchart LR
  A["TTSService Factory"] -->|"registers"| B["KaniEngine"]
  B -->|"spawns"| C["kani_runner.py"]
  C -->|"uses"| D["Kani TTS 2 Model"]
  D -->|"auto-downloads from"| E["HuggingFace"]
  B -->|"validates"| F["TTSOptions<br/>temperature, top_p,<br/>repetition_penalty"]
  G["MCP Tools"] -->|"passes parameters"| B
  H["Voice Cloning"] -->|"via speaker embeddings"| B

File Changes

1. src/core/kani.engine.ts ✨ Enhancement +376/-0
   New Kani TTS 2 engine implementation with voice cloning

2. scripts/kani_runner.py ✨ Enhancement +157/-0
   Python CLI runner for Kani TTS 2 synthesis

3. scripts/requirements-kani.txt Dependencies +4/-0
   Python dependencies for Kani TTS 2

4. src/core/tts-constants.ts ⚙️ Configuration changes +18/-0
   Add Kani defaults and parameter limits

5. src/core/tts-engine.interface.ts ✨ Enhancement +6/-0
   Extend TTSOptions with Kani-specific parameters

6. src/core/tts-service.factory.ts ✨ Enhancement +7/-1
   Register Kani engine in TTS service factory

7. src/mcp/tools.ts ✨ Enhancement +49/-2
   Add Kani parameters to MCP synthesize tool schema

8. tests/core/tts-service.factory.test.ts 🧪 Tests +73/-1
   Add Kani engine selection and validation tests

9. .env.example 📝 Documentation +26/-1
   Add Kani TTS 2 configuration section


@coderabbitai

coderabbitai Bot commented Feb 16, 2026

📝 Walkthrough

Walkthrough

This pull request introduces support for Kani TTS 2, a new text-to-speech engine. It includes environment configuration, a Python CLI orchestrator script, TypeScript engine wrapper, TTS system integration across constants, interfaces, and factory, MCP tools support, and comprehensive unit tests.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Environment Configuration: .env.example | Extends TTS_ENGINE options to include "kani" and adds 8 new KANI_* environment variables for model selection, audio reference, generation parameters, output directory, and language configuration. |
| Python Implementation: scripts/kani_runner.py, scripts/requirements-kani.txt | Introduces CLI script for Kani TTS 2 synthesis with argument validation, dependency management, speaker embedding extraction, and WAV output generation; adds Python package dependencies (kani-tts-2, transformers, soundfile, numpy). |
| Core Engine: src/core/kani.engine.ts | New KaniEngine class wrapping Kani TTS 2 with Python path resolution, dependency verification/installation, readiness handshake, synthesis orchestration, status reporting, and option validation. |
| TTS System Integration: src/core/tts-constants.ts, src/core/tts-engine.interface.ts, src/core/tts-service.factory.ts | Adds KANI_DEFAULTS and KANI_LIMITS constants; extends TTSOptions interface with temperature, top_p, repetition_penalty, and model fields; updates factory to instantiate KaniEngine when TTS_ENGINE is "kani". |
| MCP Tools: src/mcp/tools.ts | Threads new TTS parameters (temperature, top_p, repetition_penalty, model) through synthesizeText method signature, tool schema, and validation; updates tool description to mention Kani TTS 2 and expanded voice cloning. |
| Tests: tests/core/tts-service.factory.test.ts | Adds engine selection, status/interface, and option validation tests for Kani engine, including case-insensitive handling and Kani-specific capability assertions. |

Sequence Diagram

sequenceDiagram
    participant Client
    participant KaniEngine as KaniEngine<br/>(TypeScript)
    participant PythonRunner as kani_runner.py<br/>(Python CLI)
    participant KaniTTS as Kani TTS 2<br/>Library
    participant FileSystem as Output<br/>WAV File

    Client->>KaniEngine: ensureReady()
    KaniEngine->>KaniEngine: Verify Python path & script
    KaniEngine->>PythonRunner: Check dependencies
    PythonRunner-->>KaniEngine: Status
    Note over KaniEngine: Install deps if needed

    Client->>KaniEngine: synthesize(text, options)
    KaniEngine->>KaniEngine: Validate options (temp, top_p, etc.)
    KaniEngine->>PythonRunner: Invoke with args + model config
    PythonRunner->>KaniTTS: Initialize KaniTTS(model, reference_audio)
    KaniTTS->>KaniTTS: Extract speaker embedding
    PythonRunner->>KaniTTS: Generate audio(text, kwargs)
    KaniTTS-->>PythonRunner: Audio buffer
    PythonRunner->>FileSystem: Write WAV
    PythonRunner-->>KaniEngine: stdout/stderr + exit code
    KaniEngine-->>Client: Resolved path to WAV file

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested labels

Review effort 3/5

Poem

🐰 Hoppy news, the Kani hops in,
New voices singing, where can we begin?
Temperature tuned, embeddings aligned,
A Python-TypeScript romance intertwined,
More engines, more choices, the chorus expands! 🎵

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: adding support for Kani TTS 2 as a new text-to-speech engine. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Merge Conflict Detection | ✅ Passed | ✅ No merge conflicts detected when merging into main |

@socket-security

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

| Diff | Package | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
|---|---|---|---|---|---|---|
| Added | pypi/transformers@4.56.0 | 74 | 100 | 100 | 100 | 100 |
| Added | pypi/kani-tts-2@0.0.5 | 100 | 100 | 100 | 100 | 100 |

View full report

@gemini-code-assist

Summary of Changes

Hello @soyboyscout, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces comprehensive support for Kani TTS 2, a new text-to-speech engine, into the system. This integration provides users with an additional powerful option for speech synthesis, featuring capabilities like voice cloning via speaker embeddings and the use of HuggingFace-hosted models. The changes involve adding a Python runner script and its dependencies, updating configuration examples, implementing a new TypeScript engine, and extending the TTS service factory and tool definitions to seamlessly incorporate Kani TTS 2.

Highlights

  • Kani TTS 2 Integration: Introduced Kani TTS 2 as a new text-to-speech engine, expanding the available options alongside Chatterbox and Kokoro.
  • Python Runner Script: Added a dedicated Python script (kani_runner.py) to interface with the Kani TTS 2 library, handling text input, speech generation, and voice cloning.
  • Configuration Options: Updated the .env.example file to include environment variables for configuring Kani TTS 2, such as model selection, reference audio, and generation parameters.
  • TypeScript Engine Implementation: Implemented KaniEngine in TypeScript to manage the Kani TTS 2 lifecycle, including dependency verification, automatic installation, and speech synthesis.
  • Tool Schema Updates: Modified the synthesize_text tool to support Kani-specific parameters like temperature, top_p, repetition_penalty, and model selection, enhancing its flexibility.
Changelog
  • .env.example
    • Updated the TTS_ENGINE options to include "kani"
    • Added new configuration variables for Kani TTS 2, including KANI_MODEL, KANI_REFERENCE_AUDIO, KANI_MAX_CHARACTERS, KANI_OUTPUT_DIR, KANI_TEMPERATURE, KANI_TOP_P, KANI_REPETITION_PENALTY, and KANI_LANGUAGE_TAG
  • package-lock.json
    • Updated the project version from 0.1.0 to 0.3.0
  • scripts/kani_runner.py
    • Added a new Python script to serve as the command-line interface for Kani TTS 2, handling text-to-speech generation and voice cloning
  • scripts/requirements-kani.txt
    • Added a new file specifying Python dependencies required for Kani TTS 2, including kani-tts-2, transformers, soundfile, and numpy
  • src/core/kani.engine.ts
    • Added a new TypeScript file implementing the ITTSEngine interface for Kani TTS 2, managing Python script execution, dependency checks, and speech synthesis
  • src/core/tts-constants.ts
    • Added KANI_DEFAULTS and KANI_LIMITS constants for Kani TTS 2 configuration parameters
  • src/core/tts-engine.interface.ts
    • Extended the TTSOptions interface to include Kani TTS 2-specific parameters such as temperature, top_p, repetition_penalty, and model
  • src/core/tts-service.factory.ts
    • Imported KaniEngine
    • Updated TTSEngineType to include "kani"
    • Modified the TTSService constructor to recognize and instantiate KaniEngine when TTS_ENGINE is set to "kani"
  • src/mcp/tools.ts
    • Updated the synthesize_text tool's input schema to include Kani TTS 2-specific parameters (temperature, top_p, repetition_penalty, model)
    • Modified the description of synthesize_text to mention Kani TTS 2 support for voice cloning
    • Updated TTSToolSchemas.synthesizeText to include validation for Kani-specific parameters
  • tests/core/tts-service.factory.test.ts
    • Added new test cases to verify that the TTSService correctly selects the Kani engine when configured
    • Added tests to ensure Kani engine capabilities are correctly reported
    • Added tests for validating Kani-specific options (temperature, top_p, repetition_penalty)
Activity
  • No specific activity (comments, reviews, or progress updates) was provided in the context.

@matterai-app

matterai-app Bot commented Feb 16, 2026

Summary By MatterAI

🔄 What Changed

This PR introduces full support for the Kani TTS 2 engine. It adds a new Python runner script (kani_runner.py), a TypeScript engine implementation (KaniEngine), and integrates Kani-specific parameters (temperature, top_p, repetition penalty, and model selection) into the MCP tool schema and TTS service factory.

🔍 Impact of the Change

Users can now leverage Kani TTS 2 for high-quality, local speech synthesis with advanced voice cloning capabilities. The system automatically handles Python dependency installation and model downloading from HuggingFace, providing a seamless experience for generating expressive audio via the synthesize_text tool.

📁 Total Files Changed

| File | ChangeLog |
|---|---|
| Config Update: .env.example | Added environment variables for Kani model, limits, and generation parameters. |
| Python Runner: scripts/kani_runner.py | New CLI script to interface with the Kani TTS 2 library and handle audio generation. |
| Dependency List: scripts/requirements-kani.txt | Defined required Python packages: kani-tts-2, transformers, soundfile, and numpy. |
| Engine Logic: src/core/kani.engine.ts | Implemented KaniEngine to manage Python process spawning and environment setup. |
| Constants Definition: src/core/tts-constants.ts | Added default values and validation limits for Kani TTS 2. |
| Interface Extension: src/core/tts-engine.interface.ts | Updated TTSOptions to include Kani-specific generation parameters. |
| Factory Integration: src/core/tts-service.factory.ts | Registered KaniEngine in the factory and added engine type validation. |
| Tool Schema: src/mcp/tools.ts | Updated MCP tool definitions to expose Kani parameters to LLMs. |
| Unit Tests: tests/core/tts-service.factory.test.ts | Added test cases for Kani engine selection, capabilities, and option validation. |

🧪 Test Added/Recommended

Added

  • tests/core/tts-service.factory.test.ts: Validates that the TTSService correctly instantiates KaniEngine based on environment variables and correctly validates Kani-specific options (temperature, top_p, etc.).

Recommended

  • Integration Test: Add a test that mocks the Python process execution to ensure KaniEngine correctly parses stdout/stderr and handles non-zero exit codes.
  • File Cleanup Test: Verify that temporary audio files are correctly managed or cleaned up after synthesis.
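
One way to make the recommended integration test cheap is to separate "interpret what the child process returned" from the spawning itself, so the interpretation can be tested without Python. The sketch below is an assumption about how that seam could look: `RunnerResult` and `interpretRunnerResult` are hypothetical names, not part of this PR, and the real KaniEngine may structure this differently.

```typescript
// Hypothetical shape for what the engine collects once the child process ends.
interface RunnerResult {
  exitCode: number | null; // null when the process was terminated by a signal
  stderr: string;
  outputFile: string;
  outputExists: boolean; // result of checking the file system after exit
}

// Maps a finished runner invocation to the WAV path, or throws with context.
function interpretRunnerResult(result: RunnerResult): string {
  if (result.exitCode !== 0) {
    const detail = result.stderr.trim() || "no stderr output";
    throw new Error(
      `kani_runner.py exited with code ${result.exitCode}: ${detail}`,
    );
  }
  if (!result.outputExists) {
    throw new Error(
      `kani_runner.py reported success but ${result.outputFile} is missing`,
    );
  }
  return result.outputFile;
}
```

A test can then feed fabricated results (non-zero exit code, signal termination, missing output file) into the function and assert on the thrown message, leaving only a thin spawn wrapper to cover with a mocked process.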

🔒 Security Vulnerabilities

  • Command Injection: The KaniEngine uses spawn with arguments derived from environment variables and user input. While spawn is safer than exec, ensure this.pythonPath and this.scriptPath are strictly validated to prevent execution of malicious binaries.
  • Path Traversal: KANI_OUTPUT_DIR and referenceAudio paths should be sanitized to ensure they don't point outside intended directories. 🛡️

@entelligence-ai-pr-reviews

Walkthrough

This pull request contains a version bump for the project, incrementing from version 0.1.0 to 0.3.0. This represents a minor version update following semantic versioning conventions. The change is isolated to the package-lock.json file, updating both the top-level version field and the corresponding package version entry within the packages section. No dependencies have been added, removed, or updated, and no other configuration changes are present. This is a straightforward release version update with no functional code changes.

Changes

File(s) Summary
package-lock.json Version bumped from 0.1.0 to 0.3.0 in both the top-level version field and the packages section. No dependency changes included.

Comment thread src/core/kani.engine.ts
Comment on lines +253 to +265
if (referenceAudio) {
  args.push("--reference_audio", referenceAudio);
}

if (languageTag) {
  args.push("--language_tag", languageTag);
}

logger.log("[KaniEngine] Python path:", this.pythonPath);
logger.log("[KaniEngine] Script path:", this.scriptPath);
logger.log("[KaniEngine] Arguments:", args);

return new Promise<string>((resolve, reject) => {

Path Traversal (🔒 Security, 🔴 High) - The code uses external input "referenceAudio" to build a file path without validating it, allowing attackers to access files outside the allowed directory.

🎟️Issue Explanation:

- "referenceAudio" can include path traversal characters like "../", letting attackers reach unauthorized files.
- The check only pushes this input to "args" without sanitizing or restricting the path location.
- This can expose sensitive files if the path resolves outside the restricted directory, risking data leaks or unauthorized access.

🪄Fix Explanation: The fix mitigates path traversal by resolving the input against a safe base directory, validating it stays within that directory, and confirming the target file exists and is a file before use.
The code sets "baseDir" to a fixed safe path ("os.tmpdir()"), ensuring resolution is relative to this directory.
It resolves "referenceAudio" via "path.resolve(baseDir, referenceAudio)", normalizing the path.
The relative path is computed with "path.relative(baseDir, resolvedRef)", and traversal attempts are detected by checking if it starts with ".." or is absolute.
Existence and file-type checks use "fs.existsSync(resolvedRef)" and "fs.statSync(resolvedRef).isFile()" to prevent invalid or malicious inputs.
Only after passing these validations is the resolved, safe path appended to "args", preventing injection of arbitrary file paths.

💡Important Instructions: Ensure referenceAudio inputs come from trusted sources or include user input validation upstream to avoid legitimate path rejection.
Suggested change
- if (referenceAudio) {
-   args.push("--reference_audio", referenceAudio);
- }
- if (languageTag) {
-   args.push("--language_tag", languageTag);
- }
- logger.log("[KaniEngine] Python path:", this.pythonPath);
- logger.log("[KaniEngine] Script path:", this.scriptPath);
- logger.log("[KaniEngine] Arguments:", args);
- return new Promise<string>((resolve, reject) => {
+ if (referenceAudio) {
+   const baseDir = path.resolve(os.tmpdir());
+   const resolvedRef = path.resolve(baseDir, referenceAudio);
+   const relativeRef = path.relative(baseDir, resolvedRef);
+   if (relativeRef.startsWith("..") || path.isAbsolute(relativeRef)) {
+     throw new Error("Invalid referenceAudio path: path traversal detected");
+   }
+   if (!fs.existsSync(resolvedRef) || !fs.statSync(resolvedRef).isFile()) {
+     throw new Error("Invalid referenceAudio path: file does not exist or is not a file");
+   }
+   args.push("--reference_audio", resolvedRef);
+ }

@corgea

corgea Bot commented Feb 16, 2026

🐕 Corgea found the following new SCA issues in the codebase:

| Package | CVE | Severity | Version | Fixed Version | Ecosystem | Summary |
|---|---|---|---|---|---|---|
| @modelcontextprotocol/sdk | CVE-2026-0621 | HIGH | 1.12.1 | 1.25.2 | npm | Anthropic's MCP TypeScript SDK has a ReDoS vulnerability |
| axios | CVE-2026-25639 | HIGH | 1.9.0 | 1.13.5 | npm | Axios is Vulnerable to Denial of Service via __proto__ Key in mergeConfig |
| qs | CVE-2026-2391 | LOW | 6.14.0 | 6.14.2 | npm | qs's arrayLimit bypass in comma parsing allows denial of service |
| brace-expansion | CVE-2025-5889 | LOW | 1.1.11 | 1.1.12 | npm | brace-expansion Regular Expression Denial of Service vulnerability |
| diff | CVE-2026-24001 | LOW | 4.0.2 | 4.0.4 | npm | jsdiff has a Denial of Service vulnerability in parsePatch and applyPatch |

Showing 5 out of 8 findings. See full results

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces support for the Kani TTS 2 engine, a significant feature addition. While the implementation includes a new Python runner, a TypeScript engine class, configuration updates, and tests, a critical security vulnerability exists regarding input validation for the new engine. Specifically, the referenceAudio and model parameters are passed to the underlying Python script without proper sanitization and path validation, potentially allowing an attacker to probe the server's filesystem. It is recommended to implement strict path validation and parameter sanitization in the KaniEngine class before merging. Additionally, for better maintainability, consider referencing the requirements-kani.txt file instead of hardcoding dependency lists in error messages.

Comment thread src/core/kani.engine.ts
Comment on lines +223 to +224
const referenceAudio =
  options?.referenceAudio || process.env.KANI_REFERENCE_AUDIO || "";

security-medium medium

The KaniEngine implementation fails to validate the referenceAudio path provided in the TTSOptions. Unlike the ChatterboxEngine, which performs rigorous checks to prevent path traversal and restrict access to allowed directories, KaniEngine passes the user-supplied path directly to the underlying Python script. In src/mcp/tools.ts, the validation is explicitly skipped for this engine (line 126). This allows an attacker to specify arbitrary file paths on the server, potentially leading to information disclosure or unauthorized file access if the kani-tts-2 library or the runner script processes the file.
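
As a rough illustration of the containment check this comment asks for, the sketch below resolves a candidate path and accepts it only if it falls inside one of a set of allowed directories. `resolveWithinAllowedDirs` and the allowed-directory list are hypothetical; the ChatterboxEngine checks mentioned above are not shown in this PR and may work differently.

```typescript
import * as path from "node:path";

// Hypothetical helper: returns the normalized absolute path when `candidate`
// resolves inside one of `allowedDirs`, otherwise throws.
function resolveWithinAllowedDirs(
  candidate: string,
  allowedDirs: readonly string[],
): string {
  const resolved = path.resolve(candidate);
  for (const dir of allowedDirs) {
    const base = path.resolve(dir);
    const rel = path.relative(base, resolved);
    // Inside `base` when the relative path is non-empty, does not climb out
    // through a leading ".." segment, and is not absolute (e.g. another
    // drive on Windows).
    const inside =
      rel !== "" &&
      rel !== ".." &&
      !rel.startsWith(".." + path.sep) &&
      !path.isAbsolute(rel);
    if (inside) return resolved;
  }
  throw new Error("referenceAudio must be inside an allowed directory");
}
```

This check is purely lexical. It does not follow symlinks, so a stricter version would likely compare real paths (for example via `fs.realpathSync`) and confirm the target is a regular file before passing it to the runner.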

Comment thread src/core/kani.engine.ts
Comment on lines +206 to +208
const model =
  options?.model ??
  (process.env.KANI_MODEL || KANI_DEFAULTS.MODEL);

security-medium medium

The model parameter is taken directly from user input without any validation or sanitization. This parameter is passed as a command-line argument to the Python runner script. An attacker could provide a malicious model name or a local file path, which might lead to SSRF (if the library attempts to download a model from an untrusted source) or unauthorized local file access.
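
One possible mitigation is to accept only model identifiers from an explicit allowlist, after a shape check that rules out local paths, URLs, and option-like strings. The following is a sketch under assumptions: `ALLOWED_KANI_MODELS`, `MODEL_ID_PATTERN`, and `assertAllowedModel` are hypothetical names, and only the default id `nineninesix/kani-tts-2-en` comes from this PR's description.

```typescript
// Hypothetical allowlist; extend it with any models the deployment trusts.
const ALLOWED_KANI_MODELS: ReadonlySet<string> = new Set([
  "nineninesix/kani-tts-2-en",
]);

// "owner/name" built from conservative characters. Values that start with
// "/", ".", or "-" (absolute paths, relative paths, CLI flags) do not match.
const MODEL_ID_PATTERN =
  /^[A-Za-z0-9][A-Za-z0-9._-]*\/[A-Za-z0-9][A-Za-z0-9._-]*$/;

function assertAllowedModel(model: string): string {
  if (!MODEL_ID_PATTERN.test(model) || model.includes("..")) {
    throw new Error(`Invalid model id: ${JSON.stringify(model)}`);
  }
  if (!ALLOWED_KANI_MODELS.has(model)) {
    throw new Error(`Model is not in the allowlist: ${model}`);
  }
  return model;
}
```

Whether an allowlist is acceptable depends on how much flexibility tool callers are meant to have. If arbitrary HuggingFace models must remain selectable, the shape check alone still blocks local file paths and flag injection.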

Comment thread scripts/kani_runner.py
Comment on lines +25 to +26
print("Please install required packages:", file=sys.stderr)
print(" pip install kani-tts-2 transformers==4.56.0 soundfile numpy", file=sys.stderr)

medium

For better maintainability, instead of hardcoding the installation command, it would be better to refer to the scripts/requirements-kani.txt file. This ensures that if the dependencies change in the requirements file, this error message remains accurate.

Suggested change
- print("Please install required packages:", file=sys.stderr)
- print(" pip install kani-tts-2 transformers==4.56.0 soundfile numpy", file=sys.stderr)
+ print("Please install required packages from 'scripts/requirements-kani.txt':", file=sys.stderr)
+ print(" pip install -r scripts/requirements-kani.txt", file=sys.stderr)

Comment thread src/core/kani.engine.ts
Comment on lines +167 to +169
`Failed to install Kani dependencies automatically.\n` +
`Please install manually:\n` +
` pip install kani-tts-2 transformers==4.56.0 soundfile numpy\n\n` +

medium

To improve maintainability, it would be better to reference the scripts/requirements-kani.txt file in this error message instead of hardcoding the package list. This ensures the instructions remain correct if the requirements file is updated.

Suggested change
- `Failed to install Kani dependencies automatically.\n` +
- `Please install manually:\n` +
- ` pip install kani-tts-2 transformers==4.56.0 soundfile numpy\n\n` +
+ `Failed to install Kani dependencies automatically.\n` +
+ `Please install manually from 'scripts/requirements-kani.txt':\n` +
+ ` pip install -r scripts/requirements-kani.txt\n\n` +

Comment thread src/core/kani.engine.ts
Comment on lines +179 to +181
`Failed to run pip installer: ${error.message}\n` +
`Please install dependencies manually:\n` +
` pip install kani-tts-2 transformers==4.56.0 soundfile numpy`

medium

For consistency and maintainability, let's also refer to the scripts/requirements-kani.txt file here instead of hardcoding the dependency list.

Suggested change
- `Failed to run pip installer: ${error.message}\n` +
- `Please install dependencies manually:\n` +
- ` pip install kani-tts-2 transformers==4.56.0 soundfile numpy`
+ `Failed to run pip installer: ${error.message}\n` +
+ `Please install dependencies manually from 'scripts/requirements-kani.txt':\n` +
+ ` pip install -r scripts/requirements-kani.txt`

@matterai-app matterai-app Bot left a comment

🧪 PR Review is completed: Review of Kani TTS 2 integration. Identified critical robustness issues in configuration parsing (NaN handling), reference audio logic, and filename generation.

Skipped files
  • package-lock.json: Skipped file pattern
⬇️ Low Priority Suggestions (1)
scripts/kani_runner.py (1 suggestion)

Location: scripts/kani_runner.py (Line 34)

🟡 Configuration

Issue: The sample rate is hardcoded to 22050Hz. If Kani TTS 2 uses a different sample rate (e.g., 24kHz), the output audio will have incorrect pitch and speed.

Fix: Verify the model's sample rate and update the constant if necessary. Adding a TODO for verification.

Impact: Audio quality assurance.

-  SAMPLE_RATE = 22050
+  SAMPLE_RATE = 22050  # TODO: Verify if Kani TTS 2 uses 24kHz or 22050Hz

Comment thread src/core/kani.engine.ts
Comment on lines +197 to +235
const maxCharacters = parseInt(
process.env.KANI_MAX_CHARACTERS || String(KANI_DEFAULTS.MAX_CHARACTERS)
);
if (text.length > maxCharacters) {
throw new Error(
`Text exceeds maximum character limit of ${maxCharacters} characters. Current length: ${text.length}`
);
}

const model =
options?.model ??
(process.env.KANI_MODEL || KANI_DEFAULTS.MODEL);
const temperature =
options?.temperature ??
parseFloat(
process.env.KANI_TEMPERATURE || String(KANI_DEFAULTS.TEMPERATURE)
);
const topP =
options?.top_p ??
parseFloat(process.env.KANI_TOP_P || String(KANI_DEFAULTS.TOP_P));
const repetitionPenalty =
options?.repetition_penalty ??
parseFloat(
process.env.KANI_REPETITION_PENALTY ||
String(KANI_DEFAULTS.REPETITION_PENALTY)
);
const referenceAudio =
options?.referenceAudio || process.env.KANI_REFERENCE_AUDIO || "";
const languageTag =
options?.language ?? (process.env.KANI_LANGUAGE_TAG || "");

const outputDir =
process.env.KANI_OUTPUT_DIR ||
path.join(os.tmpdir(), COMMON_CONSTANTS.TEMP_DIR_NAME);
if (!fs.existsSync(outputDir)) {
fs.mkdirSync(outputDir, { recursive: true });
}

const outputFile = path.join(outputDir, `kani-tts-${Date.now()}.wav`);

🟠 Logic Error & Robustness

Issue:

  1. parseFloat/parseInt on invalid environment variables returns NaN, which propagates to the Python script and bypasses validation (since NaN < 0 is false).
  2. Using || for referenceAudio prevents the user from explicitly disabling it (passing "") if the environment variable is set.
  3. Date.now() for filename generation can cause collisions in high-concurrency scenarios.
  4. maxCharacters fallback logic fails for invalid non-empty strings.

Fix:

  1. Check for isNaN when parsing environment variables.
  2. Use nullish coalescing (??) for optional parameters.
  3. Add a random suffix to the output filename.

Impact: Prevents runtime errors, ensures configuration is respected, and avoids file overwrites.

Suggested change
- const maxCharacters = parseInt(
- process.env.KANI_MAX_CHARACTERS || String(KANI_DEFAULTS.MAX_CHARACTERS)
- );
- if (text.length > maxCharacters) {
- throw new Error(
- `Text exceeds maximum character limit of ${maxCharacters} characters. Current length: ${text.length}`
- );
- }
- const model =
- options?.model ??
- (process.env.KANI_MODEL || KANI_DEFAULTS.MODEL);
- const temperature =
- options?.temperature ??
- parseFloat(
- process.env.KANI_TEMPERATURE || String(KANI_DEFAULTS.TEMPERATURE)
- );
- const topP =
- options?.top_p ??
- parseFloat(process.env.KANI_TOP_P || String(KANI_DEFAULTS.TOP_P));
- const repetitionPenalty =
- options?.repetition_penalty ??
- parseFloat(
- process.env.KANI_REPETITION_PENALTY ||
- String(KANI_DEFAULTS.REPETITION_PENALTY)
- );
- const referenceAudio =
- options?.referenceAudio || process.env.KANI_REFERENCE_AUDIO || "";
- const languageTag =
- options?.language ?? (process.env.KANI_LANGUAGE_TAG || "");
- const outputDir =
- process.env.KANI_OUTPUT_DIR ||
- path.join(os.tmpdir(), COMMON_CONSTANTS.TEMP_DIR_NAME);
- if (!fs.existsSync(outputDir)) {
- fs.mkdirSync(outputDir, { recursive: true });
- }
- const outputFile = path.join(outputDir, `kani-tts-${Date.now()}.wav`);
+ const maxCharacters = parseInt(process.env.KANI_MAX_CHARACTERS || "") || KANI_DEFAULTS.MAX_CHARACTERS;
+ if (text.length > maxCharacters) {
+ throw new Error(
+ `Text exceeds maximum character limit of ${maxCharacters} characters. Current length: ${text.length}`
+ );
+ }
+ const model =
+ options?.model ??
+ (process.env.KANI_MODEL || KANI_DEFAULTS.MODEL);
+ const envTemp = parseFloat(process.env.KANI_TEMPERATURE || "");
+ const temperature =
+ options?.temperature ??
+ (isNaN(envTemp) ? KANI_DEFAULTS.TEMPERATURE : envTemp);
+ const envTopP = parseFloat(process.env.KANI_TOP_P || "");
+ const topP =
+ options?.top_p ??
+ (isNaN(envTopP) ? KANI_DEFAULTS.TOP_P : envTopP);
+ const envRepPenalty = parseFloat(process.env.KANI_REPETITION_PENALTY || "");
+ const repetitionPenalty =
+ options?.repetition_penalty ??
+ (isNaN(envRepPenalty) ? KANI_DEFAULTS.REPETITION_PENALTY : envRepPenalty);
+ const referenceAudio =
+ options?.referenceAudio ?? process.env.KANI_REFERENCE_AUDIO ?? "";
+ const languageTag =
+ options?.language ?? (process.env.KANI_LANGUAGE_TAG || "");
+ const outputDir =
+ process.env.KANI_OUTPUT_DIR ||
+ path.join(os.tmpdir(), COMMON_CONSTANTS.TEMP_DIR_NAME);
+ if (!fs.existsSync(outputDir)) {
+ fs.mkdirSync(outputDir, { recursive: true });
+ }
+ const outputFile = path.join(outputDir, `kani-tts-${Date.now()}-${Math.random().toString(36).slice(2)}.wav`);

@cubic-dev-ai cubic-dev-ai Bot left a comment

3 issues found across 10 files

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/core/tts-service.factory.ts">

<violation number="1" location="src/core/tts-service.factory.ts:28">
P2: Engine selection only accepts "kani"; the README documents "kani-mlx" as the valid env value, so setting the documented value will default to chatterbox instead of selecting Kani.</violation>
</file>

<file name="src/core/kani.engine.ts">

<violation number="1" location="src/core/kani.engine.ts:224">
P1: Missing path validation for `referenceAudio`. The chatterbox engine validates reference audio paths with `validateReferenceAudioPath()` to prevent directory traversal, but the kani engine passes user-supplied paths directly to the subprocess. Add path validation (e.g., checking allowed extensions, ensuring the path is within expected directories, and verifying the file exists) before passing it to the Python script.</violation>

<violation number="2" location="src/core/kani.engine.ts:235">
P2: `Date.now()` alone is insufficient for unique filenames under concurrent requests. Two requests arriving in the same millisecond will collide. Use `crypto.randomUUID()` or append a random suffix (e.g., `crypto.randomBytes(4).toString('hex')`) to ensure uniqueness.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread src/core/kani.engine.ts
String(KANI_DEFAULTS.REPETITION_PENALTY)
);
const referenceAudio =
options?.referenceAudio || process.env.KANI_REFERENCE_AUDIO || "";

@cubic-dev-ai cubic-dev-ai Bot Feb 16, 2026

P1: Missing path validation for referenceAudio. The chatterbox engine validates reference audio paths with validateReferenceAudioPath() to prevent directory traversal, but the kani engine passes user-supplied paths directly to the subprocess. Add path validation (e.g., checking allowed extensions, ensuring the path is within expected directories, and verifying the file exists) before passing it to the Python script.
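
A minimal sketch of the kind of check described above, offered as an illustration rather than the project's actual `validateReferenceAudioPath()`. The function name `validateKaniReferenceAudio`, the extension allowlist, and the `baseDir` parameter are all assumptions made for this example.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Assumed allowlist; the project's real validator may accept other formats.
const ALLOWED_AUDIO_EXTENSIONS = new Set([".wav", ".mp3", ".flac"]);

// Hypothetical validator: resolve the input against baseDir, require the
// result to stay under baseDir, have an allowed extension, and be a real file.
function validateKaniReferenceAudio(input: string, baseDir: string): string {
  const base = path.resolve(baseDir);
  const resolved = path.resolve(base, input);
  const relative = path.relative(base, resolved);

  // A relative path that climbs out of base (or is absolute, e.g. another
  // drive on Windows) means the file is outside the allowed directory.
  const escapes =
    relative === "" ||
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative);
  if (escapes) {
    throw new Error("Reference audio must be inside the allowed directory");
  }

  if (!ALLOWED_AUDIO_EXTENSIONS.has(path.extname(resolved).toLowerCase())) {
    throw new Error("Unsupported reference audio extension");
  }

  if (!fs.existsSync(resolved) || !fs.statSync(resolved).isFile()) {
    throw new Error("Reference audio file not found");
  }

  return resolved;
}
```

This sketch does not resolve symlinks; if that matters, comparing `fs.realpathSync()` results for both paths would be a stricter variant.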

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/core/kani.engine.ts, line 224:

<comment>Missing path validation for `referenceAudio`. The chatterbox engine validates reference audio paths with `validateReferenceAudioPath()` to prevent directory traversal, but the kani engine passes user-supplied paths directly to the subprocess. Add path validation (e.g., checking allowed extensions, ensuring the path is within expected directories, and verifying the file exists) before passing it to the Python script.</comment>

<file context>
@@ -0,0 +1,376 @@
+          String(KANI_DEFAULTS.REPETITION_PENALTY)
+      );
+    const referenceAudio =
+      options?.referenceAudio || process.env.KANI_REFERENCE_AUDIO || "";
+    const languageTag =
+      options?.language ?? (process.env.KANI_LANGUAGE_TAG || "");
</file context>

// Validate and set engine type
if (envEngine === "kokoro") {
this.engineType = "kokoro";
} else if (envEngine === "kani") {

@cubic-dev-ai cubic-dev-ai Bot Feb 16, 2026

P2: Engine selection only accepts "kani"; the README documents "kani-mlx" as the valid env value, so setting the documented value will default to chatterbox instead of selecting Kani.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/core/tts-service.factory.ts, line 28:

<comment>Engine selection only accepts "kani"; the README documents "kani-mlx" as the valid env value, so setting the documented value will default to chatterbox instead of selecting Kani.</comment>

<file context>
@@ -24,6 +25,8 @@ export class TTSService {
     // Validate and set engine type
     if (envEngine === "kokoro") {
       this.engineType = "kokoro";
+    } else if (envEngine === "kani") {
+      this.engineType = "kani";
     } else if (envEngine && envEngine !== "chatterbox") {
</file context>
Suggested change
- } else if (envEngine === "kani") {
+ } else if (envEngine === "kani" || envEngine === "kani-mlx") {

Comment thread src/core/kani.engine.ts
fs.mkdirSync(outputDir, { recursive: true });
}

const outputFile = path.join(outputDir, `kani-tts-${Date.now()}.wav`);

@cubic-dev-ai cubic-dev-ai Bot Feb 16, 2026

P2: Date.now() alone is insufficient for unique filenames under concurrent requests. Two requests arriving in the same millisecond will collide. Use crypto.randomUUID() or append a random suffix (e.g., crypto.randomBytes(4).toString('hex')) to ensure uniqueness.
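
One way the suggestion could look, as a sketch only. The helper name `buildOutputFile` is hypothetical; `randomUUID` is the standard Node.js `crypto` function mentioned in the comment.

```typescript
import { randomUUID } from "node:crypto";
import * as path from "node:path";

// Hypothetical helper: keep the timestamp for readability and sorting, and
// add a UUID so two requests in the same millisecond get different names.
function buildOutputFile(outputDir: string): string {
  return path.join(outputDir, `kani-tts-${Date.now()}-${randomUUID()}.wav`);
}
```

Keeping `Date.now()` in the name is optional; the UUID alone is what provides uniqueness.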

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At src/core/kani.engine.ts, line 235:

<comment>`Date.now()` alone is insufficient for unique filenames under concurrent requests. Two requests arriving in the same millisecond will collide. Use `crypto.randomUUID()` or append a random suffix (e.g., `crypto.randomBytes(4).toString('hex')`) to ensure uniqueness.</comment>

<file context>
@@ -0,0 +1,376 @@
+      fs.mkdirSync(outputDir, { recursive: true });
+    }
+
+    const outputFile = path.join(outputDir, `kani-tts-${Date.now()}.wav`);
+
+    const args = [
</file context>

@qodo-code-review

Code Review by Qodo

🐞 Bugs (2) 📘 Rule violations (5) 📎 Requirement gaps (0)

Action required

1. Logs include raw --text 📘 Rule violation ⛨ Security
Description
The Kani engine logs the full spawned argument list, which includes the user-provided TTS text and
potentially sensitive file paths. This can leak sensitive data into logs and violates secure logging
requirements.
Code

src/core/kani.engine.ts[R261-264]

+    logger.log("[KaniEngine] Python path:", this.pythonPath);
+    logger.log("[KaniEngine] Script path:", this.scriptPath);
+    logger.log("[KaniEngine] Arguments:", args);
+
Evidence
Secure logging requires that no sensitive data is present in logs; the logged args array includes
--text with the full input string and may include --reference_audio paths.

Rule 5: Generic: Secure Logging Practices
src/core/kani.engine.ts[261-264]
src/core/kani.engine.ts[237-251]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`KaniEngine` logs the full spawned argument list, which includes the raw user-provided `text` (and possibly reference audio paths). This can leak sensitive information into logs.

## Issue Context
Secure logging requires avoiding sensitive data in logs. TTS input text may contain PII/secrets, and file paths can also be sensitive.

## Fix Focus Areas
- src/core/kani.engine.ts[237-264]
- src/core/kani.engine.ts[271-285]
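
A possible shape for the logging fix, sketched under assumptions: `redactArgs` and `SENSITIVE_FLAGS` are hypothetical names, and the flag list is taken from the argument array shown in this review (`--text`, `--output`, `--reference_audio`).

```typescript
// Flags whose values should not appear in logs (assumed list for this sketch).
const SENSITIVE_FLAGS = new Set(["--text", "--reference_audio", "--output"]);

// Hypothetical helper: copy the argument list, replacing the value that
// follows each sensitive flag with a placeholder that records only its length.
function redactArgs(args: readonly string[]): string[] {
  const redacted: string[] = [];
  for (let i = 0; i < args.length; i++) {
    redacted.push(args[i]);
    if (SENSITIVE_FLAGS.has(args[i]) && i + 1 < args.length) {
      redacted.push(`<redacted, ${args[i + 1].length} chars>`);
      i++; // skip the value that was just replaced
    }
  }
  return redacted;
}
```

The engine could then log `redactArgs(args)` instead of `args`, which keeps the sampling parameters visible for debugging while hiding the input text and paths.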



2. traceback.print_exc exposed to user 📘 Rule violation ⛨ Security
Description
The CLI runner prints a full stack trace to stderr on any exception, exposing internal
implementation details to the end user. This violates secure error handling requirements.
Code

scripts/kani_runner.py[R150-153]

+    except Exception as e:
+        print(f"Error: {e}", file=sys.stderr)
+        traceback.print_exc(file=sys.stderr)
+        sys.exit(1)
Evidence
The secure error handling rule prohibits exposing stack traces or internal details to user-facing
outputs; the script prints the exception and stack trace to stderr.

Rule 4: Generic: Secure Error Handling
scripts/kani_runner.py[150-153]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The runner prints a full stack trace (`traceback.print_exc`) to stderr, which is user-facing for a CLI and leaks internal implementation details.

## Issue Context
Compliance requires generic user-facing errors and detailed stack traces only in secure/internal logs.

## Fix Focus Areas
- scripts/kani_runner.py[150-153]



3. Unvalidated env numeric parsing 📘 Rule violation ⛯ Reliability
Description
Numeric values parsed from environment variables can become NaN and bypass length/range checks or
be forwarded to the Python runner. This creates unreliable behavior and weak edge-case handling for
external inputs.
Code

src/core/kani.engine.ts[R197-223]

+    const maxCharacters = parseInt(
+      process.env.KANI_MAX_CHARACTERS || String(KANI_DEFAULTS.MAX_CHARACTERS)
+    );
+    if (text.length > maxCharacters) {
+      throw new Error(
+        `Text exceeds maximum character limit of ${maxCharacters} characters. Current length: ${text.length}`
+      );
+    }
+
+    const model =
+      options?.model ??
+      (process.env.KANI_MODEL || KANI_DEFAULTS.MODEL);
+    const temperature =
+      options?.temperature ??
+      parseFloat(
+        process.env.KANI_TEMPERATURE || String(KANI_DEFAULTS.TEMPERATURE)
+      );
+    const topP =
+      options?.top_p ??
+      parseFloat(process.env.KANI_TOP_P || String(KANI_DEFAULTS.TOP_P));
+    const repetitionPenalty =
+      options?.repetition_penalty ??
+      parseFloat(
+        process.env.KANI_REPETITION_PENALTY ||
+          String(KANI_DEFAULTS.REPETITION_PENALTY)
+      );
+    const referenceAudio =
Evidence
Robust error handling and input validation require explicit handling of null/empty/boundary cases;
here parseInt/parseFloat results are not checked for NaN or validity before use.

Rule 3: Generic: Robust Error Handling and Edge Case Management
Rule 6: Generic: Security-First Input Validation and Data Handling
src/core/kani.engine.ts[197-203]
src/core/kani.engine.ts[209-223]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`parseInt`/`parseFloat` results from env vars are not validated. If an env var is non-numeric, values can become `NaN`, causing length/range checks to fail open or invalid values to be passed downstream.

## Issue Context
These values are external inputs (env) and should be validated with explicit edge-case handling.

## Fix Focus Areas
- src/core/kani.engine.ts[197-223]
- src/core/kani.engine.ts[235-251]
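
A small sketch of explicit validation for numeric environment values. The helper `parseEnvNumber` and the example ranges are assumptions for illustration; the real defaults live in `KANI_DEFAULTS`, whose values are not shown in this review.

```typescript
// Hypothetical helper: parse a numeric env value and fall back when it is
// missing, blank, non-numeric, or outside the allowed [min, max] range.
function parseEnvNumber(
  raw: string | undefined,
  fallback: number,
  min: number,
  max: number
): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isFinite(value) || value < min || value > max) return fallback;
  return value;
}

// Example calls with assumed ranges (not taken from the project).
const validTopP = parseEnvNumber("0.9", 0.95, 0, 1); // parsed value is used
const invalidTopP = parseEnvNumber("abc", 0.95, 0, 1); // falls back to 0.95
```

Silently falling back is one design choice; throwing or logging a warning on an invalid value would make misconfiguration easier to notice.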



4. Kani limit mismatch 🐞 Bug ✓ Correctness
Description
When TTS_ENGINE=kani, the HTTP server and MCP tool still enforce CHATTERBOX_MAX_CHARACTERS (or
Kokoro’s), ignoring KANI_MAX_CHARACTERS introduced by this PR. This makes the Kani config
misleading and can reject valid requests unexpectedly.
Code

src/core/tts-service.factory.ts[R25-32]

    // Validate and set engine type
    if (envEngine === "kokoro") {
      this.engineType = "kokoro";
+    } else if (envEngine === "kani") {
+      this.engineType = "kani";
    } else if (envEngine && envEngine !== "chatterbox") {
      logger.warn(
        `Unknown TTS engine: ${envEngine}. Defaulting to chatterbox.`
Evidence
This PR enables selecting the new kani engine, but the request-layer character-limit logic only
branches for Kokoro and otherwise defaults to Chatterbox. Meanwhile, KaniEngine itself enforces
KANI_MAX_CHARACTERS and .env.example documents it—creating inconsistent and surprising behavior
across layers.

src/core/tts-service.factory.ts[21-49]
src/server.ts[105-114]
src/mcp/tools.ts[100-113]
src/core/kani.engine.ts[197-204]
.env.example[53-57]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
When `TTS_ENGINE=kani`, the request-layer (HTTP and MCP) max-length checks still default to Chatterbox’s env var. This makes `KANI_MAX_CHARACTERS` ineffective and creates inconsistent behavior (request rejected before reaching `KaniEngine`, or enforced by a different limit than the engine).

### Issue Context
- `KaniEngine` already enforces `KANI_MAX_CHARACTERS`.
- `.env.example` documents `KANI_MAX_CHARACTERS`.
- The HTTP and MCP layers should enforce the same engine-specific limit to provide consistent UX and predictable configuration.

### Fix Focus Areas
- src/server.ts[105-125]
- src/mcp/tools.ts[100-122]
- src/core/kani.engine.ts[197-204]
- src/core/tts-service.factory.ts[21-49]
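
One way the request layers could share an engine-specific limit, sketched under assumptions. `resolveMaxCharacters` and `MAX_CHAR_ENV` are hypothetical names, and `KOKORO_MAX_CHARACTERS` is an assumed variable name, since this review only names the Chatterbox and Kani variables.

```typescript
type EngineType = "chatterbox" | "kokoro" | "kani";

// Env var that holds the character limit for each engine.
const MAX_CHAR_ENV: Record<EngineType, string> = {
  chatterbox: "CHATTERBOX_MAX_CHARACTERS",
  kokoro: "KOKORO_MAX_CHARACTERS", // assumed name, not confirmed by the PR
  kani: "KANI_MAX_CHARACTERS",
};

// Hypothetical helper: look up the limit for the active engine and fall back
// when the variable is missing or not a positive integer.
function resolveMaxCharacters(
  engine: EngineType,
  env: Record<string, string | undefined>,
  fallback: number
): number {
  const parsed = Number.parseInt(env[MAX_CHAR_ENV[engine]] ?? "", 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}
```

Calling the same helper from the HTTP handler, the MCP tool, and the engine would keep all three layers on one limit.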



5. Output dir override breaks 🐞 Bug ⛯ Reliability
Description
KANI_OUTPUT_DIR is supported and documented, but the HTTP and MCP layers enforce a strict security
boundary requiring outputs under os.tmpdir()/local-voice-mcp. Setting KANI_OUTPUT_DIR outside
that directory will cause synthesis to fail at the boundary check.
Code

src/core/kani.engine.ts[R228-236]

+    const outputDir =
+      process.env.KANI_OUTPUT_DIR ||
+      path.join(os.tmpdir(), COMMON_CONSTANTS.TEMP_DIR_NAME);
+    if (!fs.existsSync(outputDir)) {
+      fs.mkdirSync(outputDir, { recursive: true });
+    }
+
+    const outputFile = path.join(outputDir, `kani-tts-${Date.now()}.wav`);
+
Evidence
KaniEngine can write outputs to any directory via KANI_OUTPUT_DIR, but both MCP and HTTP
handlers reject returned paths that are not under the fixed temp dir
(os.tmpdir()/local-voice-mcp). .env.example introduces KANI_OUTPUT_DIR without warning about
this constraint, so using it as advertised will produce errors.

src/core/kani.engine.ts[228-236]
src/mcp/tools.ts[144-154]
src/server.ts[134-143]
.env.example[56-57]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

### Issue description
`KANI_OUTPUT_DIR` allows outputs outside the temp directory, but both HTTP and MCP layers reject such paths. This makes the config option effectively broken (or produces confusing runtime failures).

### Issue Context
The project intentionally restricts playable/servable audio files to a temp directory boundary for security. Any engine-level output directory override must adhere to that boundary.

### Fix Focus Areas
- src/core/kani.engine.ts[228-236]
- src/mcp/tools.ts[144-154]
- src/server.ts[134-143]
- .env.example[56-57]
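
A sketch of one option: accept `KANI_OUTPUT_DIR` only when it resolves inside the temp boundary. The helper name `resolveKaniOutputDir` is hypothetical, and falling back to the boundary (instead of throwing) is an assumption about the desired behavior.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: return the requested directory only if it sits at or
// below os.tmpdir()/<tempDirName>; otherwise return the boundary itself.
function resolveKaniOutputDir(
  requested: string | undefined,
  tempDirName: string
): string {
  const boundary = path.resolve(os.tmpdir(), tempDirName);
  if (!requested) return boundary;

  const resolved = path.resolve(requested);
  const relative = path.relative(boundary, resolved);
  const inside =
    relative === "" ||
    (relative !== ".." &&
      !relative.startsWith(".." + path.sep) &&
      !path.isAbsolute(relative));

  return inside ? resolved : boundary;
}
```

If silently ignoring an out-of-bounds value is undesirable, the same check could log a warning or throw at startup so the misconfiguration is visible.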




Remediation recommended

6. Generic Kani synthesis failed error 📘 Rule violation ⛯ Reliability
Description
When the Python process fails, the returned error is a generic message that drops actionable context
like exit code and failure reason. This hinders debugging and violates robust error handling
expectations for meaningful context.
Code

src/core/kani.engine.ts[R309-315]

+          logger.error(
+            `[KaniEngine] Synthesis failed after ${duration}ms with code ${code}`
+          );
+          logger.error("[KaniEngine] Error output (stderr):", stderrData);
+          logger.error("[KaniEngine] Standard output (stdout):", stdoutData);
+          reject(new Error("Kani synthesis failed"));
+        }
Evidence
The robust error handling rule requires errors to include actionable context; here the rejection
uses a generic message even though exit code and stderr/stdout are available.

Rule 3: Generic: Robust Error Handling and Edge Case Management
src/core/kani.engine.ts[309-315]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The promise rejects with `new Error("Kani synthesis failed")`, which loses actionable context (exit code, reason). This makes failures harder to diagnose.

## Issue Context
Error messages should provide meaningful context while remaining safe (avoid including raw user text or sensitive paths).

## Fix Focus Areas
- src/core/kani.engine.ts[297-315]
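
A sketch of an error that carries the exit code and a short reason. `buildSynthesisError` is a hypothetical helper; using the last stderr line as the reason assumes the runner ends its output with the exception message, and that line should be checked for user text before it is surfaced.

```typescript
// Hypothetical helper: build an error that includes the exit code and the
// last line of stderr, truncated so a long message cannot flood the output.
function buildSynthesisError(
  code: number | null,
  stderr: string,
  maxLen = 500
): Error {
  const lastLine = stderr.trim().split("\n").pop() ?? "";
  const reason =
    lastLine.length > maxLen ? `${lastLine.slice(0, maxLen)}...` : lastLine;
  const suffix = reason ? `: ${reason}` : "";
  return new Error(
    `Kani synthesis failed (exit code ${code ?? "unknown"})${suffix}`
  );
}
```

In the `close` handler this would replace the generic rejection, for example `reject(buildSynthesisError(code, stderrData))`.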



7. referenceAudio path not constrained 📘 Rule violation ⛨ Security
Description
The engine forwards referenceAudio and KANI_OUTPUT_DIR directly to filesystem operations and the
Python process without constraining them to an allowed base directory. If these values are
user-influenced upstream, this can enable unintended filesystem access.
Code

src/core/kani.engine.ts[R223-259]

+    const referenceAudio =
+      options?.referenceAudio || process.env.KANI_REFERENCE_AUDIO || "";
+    const languageTag =
+      options?.language ?? (process.env.KANI_LANGUAGE_TAG || "");
+
+    const outputDir =
+      process.env.KANI_OUTPUT_DIR ||
+      path.join(os.tmpdir(), COMMON_CONSTANTS.TEMP_DIR_NAME);
+    if (!fs.existsSync(outputDir)) {
+      fs.mkdirSync(outputDir, { recursive: true });
+    }
+
+    const outputFile = path.join(outputDir, `kani-tts-${Date.now()}.wav`);
+
+    const args = [
+      this.scriptPath,
+      "--text",
+      text,
+      "--output",
+      outputFile,
+      "--model",
+      model,
+      "--temperature",
+      String(temperature),
+      "--top_p",
+      String(topP),
+      "--repetition_penalty",
+      String(repetitionPenalty),
+    ];
+
+    if (referenceAudio) {
+      args.push("--reference_audio", referenceAudio);
+    }
+
+    if (languageTag) {
+      args.push("--language_tag", languageTag);
+    }
Evidence
Security-first input validation requires validating/sanitizing external inputs; these paths are
taken from options/env and used to read/write without any allowlist or base-dir enforcement.

Rule 6: Generic: Security-First Input Validation and Data Handling
src/core/kani.engine.ts[223-233]
src/core/kani.engine.ts[253-259]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`referenceAudio` and `KANI_OUTPUT_DIR` are used without base-directory constraints. If these values can be influenced by users upstream, this may allow reading arbitrary files as "audio" or writing output outside expected directories.

## Issue Context
Compliance requires validation and sanitization of external inputs, including paths.

## Fix Focus Areas
- src/core/kani.engine.ts[223-233]
- src/core/kani.engine.ts[235-255]



Comment thread src/core/kani.engine.ts
Comment on lines +261 to +264
logger.log("[KaniEngine] Python path:", this.pythonPath);
logger.log("[KaniEngine] Script path:", this.scriptPath);
logger.log("[KaniEngine] Arguments:", args);

Action required

1. Logs include raw --text 📘 Rule violation ⛨ Security

The Kani engine logs the full spawned argument list, which includes the user-provided TTS text and
potentially sensitive file paths. This can leak sensitive data into logs and violates secure logging
requirements.
Agent Prompt
## Issue description
`KaniEngine` logs the full spawned argument list, which includes the raw user-provided `text` (and possibly reference audio paths). This can leak sensitive information into logs.

## Issue Context
Secure logging requires avoiding sensitive data in logs. TTS input text may contain PII/secrets, and file paths can also be sensitive.

## Fix Focus Areas
- src/core/kani.engine.ts[237-264]
- src/core/kani.engine.ts[271-285]


Comment thread scripts/kani_runner.py
Comment on lines +150 to +153
except Exception as e:
print(f"Error: {e}", file=sys.stderr)
traceback.print_exc(file=sys.stderr)
sys.exit(1)

Action required

2. traceback.print_exc exposed to user 📘 Rule violation ⛨ Security

The CLI runner prints a full stack trace to stderr on any exception, exposing internal
implementation details to the end user. This violates secure error handling requirements.
Agent Prompt
## Issue description
The runner prints a full stack trace (`traceback.print_exc`) to stderr, which is user-facing for a CLI and leaks internal implementation details.

## Issue Context
Compliance requires generic user-facing errors and detailed stack traces only in secure/internal logs.

## Fix Focus Areas
- scripts/kani_runner.py[150-153]


Comment thread src/core/kani.engine.ts
Comment on lines +197 to +223
const maxCharacters = parseInt(
process.env.KANI_MAX_CHARACTERS || String(KANI_DEFAULTS.MAX_CHARACTERS)
);
if (text.length > maxCharacters) {
throw new Error(
`Text exceeds maximum character limit of ${maxCharacters} characters. Current length: ${text.length}`
);
}

const model =
options?.model ??
(process.env.KANI_MODEL || KANI_DEFAULTS.MODEL);
const temperature =
options?.temperature ??
parseFloat(
process.env.KANI_TEMPERATURE || String(KANI_DEFAULTS.TEMPERATURE)
);
const topP =
options?.top_p ??
parseFloat(process.env.KANI_TOP_P || String(KANI_DEFAULTS.TOP_P));
const repetitionPenalty =
options?.repetition_penalty ??
parseFloat(
process.env.KANI_REPETITION_PENALTY ||
String(KANI_DEFAULTS.REPETITION_PENALTY)
);
const referenceAudio =

Action required

3. Unvalidated env numeric parsing 📘 Rule violation ⛯ Reliability

Numeric values parsed from environment variables can become NaN and bypass length/range checks or
be forwarded to the Python runner. This creates unreliable behavior and weak edge-case handling for
external inputs.
Agent Prompt
## Issue description
`parseInt`/`parseFloat` results from env vars are not validated. If an env var is non-numeric, values can become `NaN`, causing length/range checks to fail open or invalid values to be passed downstream.

## Issue Context
These values are external inputs (env) and should be validated with explicit edge-case handling.

## Fix Focus Areas
- src/core/kani.engine.ts[197-223]
- src/core/kani.engine.ts[235-251]


Comment on lines 25 to 32
// Validate and set engine type
if (envEngine === "kokoro") {
this.engineType = "kokoro";
} else if (envEngine === "kani") {
this.engineType = "kani";
} else if (envEngine && envEngine !== "chatterbox") {
logger.warn(
`Unknown TTS engine: ${envEngine}. Defaulting to chatterbox.`

Action required

4. Kani limit mismatch 🐞 Bug ✓ Correctness

When TTS_ENGINE=kani, the HTTP server and MCP tool still enforce CHATTERBOX_MAX_CHARACTERS (or
Kokoro’s), ignoring KANI_MAX_CHARACTERS introduced by this PR. This makes the Kani config
misleading and can reject valid requests unexpectedly.
Agent Prompt
### Issue description
When `TTS_ENGINE=kani`, the request-layer (HTTP and MCP) max-length checks still default to Chatterbox’s env var. This makes `KANI_MAX_CHARACTERS` ineffective and creates inconsistent behavior (request rejected before reaching `KaniEngine`, or enforced by a different limit than the engine).

### Issue Context
- `KaniEngine` already enforces `KANI_MAX_CHARACTERS`.
- `.env.example` documents `KANI_MAX_CHARACTERS`.
- The HTTP and MCP layers should enforce the same engine-specific limit to provide consistent UX and predictable configuration.

### Fix Focus Areas
- src/server.ts[105-125]
- src/mcp/tools.ts[100-122]
- src/core/kani.engine.ts[197-204]
- src/core/tts-service.factory.ts[21-49]


Comment thread src/core/kani.engine.ts
Comment on lines +228 to +236
const outputDir =
process.env.KANI_OUTPUT_DIR ||
path.join(os.tmpdir(), COMMON_CONSTANTS.TEMP_DIR_NAME);
if (!fs.existsSync(outputDir)) {
fs.mkdirSync(outputDir, { recursive: true });
}

const outputFile = path.join(outputDir, `kani-tts-${Date.now()}.wav`);

Action required

5. Output dir override breaks 🐞 Bug ⛯ Reliability

KANI_OUTPUT_DIR is supported and documented, but the HTTP and MCP layers enforce a strict security
boundary requiring outputs under os.tmpdir()/local-voice-mcp. Setting KANI_OUTPUT_DIR outside
that directory will cause synthesis to fail at the boundary check.
Agent Prompt
### Issue description
`KANI_OUTPUT_DIR` allows outputs outside the temp directory, but both HTTP and MCP layers reject such paths. This makes the config option effectively broken (or produces confusing runtime failures).

### Issue Context
The project intentionally restricts playable/servable audio files to a temp directory boundary for security. Any engine-level output directory override must adhere to that boundary.

### Fix Focus Areas
- src/core/kani.engine.ts[228-236]
- src/mcp/tools.ts[144-154]
- src/server.ts[134-143]
- .env.example[56-57]


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/mcp/tools.ts (1)

864-874: ⚠️ Potential issue | 🟡 Minor

TTSToolSchemas.synthesizeText is missing Kokoro-specific fields that exist in the method signature and inputSchema.

The schema lacks speed, language, voice, model_path, and voices_path—all of which are defined in the inputSchema and accepted by the synthesizeText method. While the actual tool validation uses inputSchema rather than TTSToolSchemas, this inconsistency should be resolved to keep the schema complete and to avoid confusion if TTSToolSchemas is relied upon elsewhere. Additionally, test coverage for these fields is missing from the TTSToolSchemas test suite.

🤖 Fix all issues with AI agents
In `@src/core/kani.engine.ts`:
- Around line 265-317: Add a configurable timeout to the Promise that spawns the
Python TTS process so the caller cannot hang indefinitely: introduce a timeout
value (e.g., this.synthesisTimeout or a parameter) and start a timer after
spawning the process; if the timer fires before childProcess closes, call
childProcess.kill() (or childProcess.kill('SIGKILL') as appropriate), log the
timeout including stderrData/stdoutData, clear the timer on normal close, and
reject the Promise with a descriptive timeout Error instead of leaving it
unresolved; ensure you reference the existing childProcess variable, spawn call
(this.pythonPath, args), outputFile existence check, and clear the timer in the
childProcess.on("close") and childProcess.on("error") handlers.
🧹 Nitpick comments (6)
scripts/kani_runner.py (3)

100-113: Remove extraneous f prefixes from strings without placeholders.

Lines 102 and 113 use f-strings with no interpolation. As flagged by Ruff (F541), drop the f prefix.

🔧 Proposed fix
-        print(f"[MAIN] Initializing Kani TTS 2...", file=sys.stderr)
+        print("[MAIN] Initializing Kani TTS 2...", file=sys.stderr)
-            print(f"[MAIN] Speaker embedding extracted successfully", file=sys.stderr)
+            print("[MAIN] Speaker embedding extracted successfully", file=sys.stderr)

127-132: Unused variable and misleading truncation suffix.

  1. processed_text is never used — prefix with _ to signal intent.
  2. The log on line 128 always appends ... even when the text is shorter than 100 characters.
🔧 Proposed fix
-        print(f"[MAIN] Generating speech for: '{args.text[:100]}...'", file=sys.stderr)
+        text_preview = args.text[:100] + ("..." if len(args.text) > 100 else "")
+        print(f"[MAIN] Generating speech for: '{text_preview}'", file=sys.stderr)
         print(f"[MAIN] Temperature: {args.temperature}, top_p: {args.top_p}, "
               f"rep_penalty: {args.repetition_penalty}", file=sys.stderr)
 
-        audio, processed_text = model(args.text, **gen_kwargs)
+        audio, _processed_text = model(args.text, **gen_kwargs)

29-34: Hardcoded SAMPLE_RATE duplicates the value in tts-constants.ts (AUDIO_CONFIG.SAMPLE_RATE).

If the Kani model actually outputs at a different sample rate (e.g., 24 kHz or 44.1 kHz), writing the WAV with a 22050 Hz header will produce pitch-shifted audio at the wrong speed (slower and lower-pitched when the true rate is higher). Consider sourcing this from the model output or making it a CLI argument so the TypeScript side can pass it through.

tests/core/tts-service.factory.test.ts (1)

151-184: Consider adding upper-bound validation tests for top_p and repetition_penalty.

The current tests validate the lower-bound violations (top_p: -0.1, repetition_penalty: 0.5) but don't test upper-bound violations (e.g., top_p: 1.5, repetition_penalty: 2.5). Adding these would ensure the upper bounds defined in KANI_LIMITS are also enforced.

🧪 Proposed additions after the existing invalid cases
       // Invalid repetition_penalty should throw
       expect(() => {
         service["engine"].validateOptions({
           repetition_penalty: 0.5, // Out of range (min is 1.0)
         });
       }).toThrow();
+
+      // Invalid top_p upper bound should throw
+      expect(() => {
+        service["engine"].validateOptions({
+          top_p: 1.5, // Out of range (max is 1.0)
+        });
+      }).toThrow();
+
+      // Invalid repetition_penalty upper bound should throw
+      expect(() => {
+        service["engine"].validateOptions({
+          repetition_penalty: 2.5, // Out of range (max is 2.0)
+        });
+      }).toThrow();
     });
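
For context, a range check that would satisfy both the lower-bound and upper-bound cases might look like the sketch below. The limits are assumptions taken only from the comments in the tests above (`top_p` max 1.0, `repetition_penalty` 1.0 to 2.0); the `top_p` minimum of 0.0 is a guess, and `ASSUMED_LIMITS` is a stand-in, not the real `KANI_LIMITS`.

```typescript
interface NumericRange {
  min: number;
  max: number;
}

// Assumed bounds; the actual KANI_LIMITS values may differ.
const ASSUMED_LIMITS: Record<string, NumericRange> = {
  top_p: { min: 0.0, max: 1.0 },
  repetition_penalty: { min: 1.0, max: 2.0 },
};

// Throws RangeError for any provided option outside its inclusive range.
export function validateOptions(options: Record<string, number | undefined>): void {
  for (const [name, range] of Object.entries(ASSUMED_LIMITS)) {
    const value = options[name];
    if (value === undefined) continue; // option not supplied, nothing to check
    if (!Number.isFinite(value) || value < range.min || value > range.max) {
      throw new RangeError(
        `${name} must be between ${range.min} and ${range.max}, got ${value}`
      );
    }
  }
}
```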
src/core/kani.engine.ts (2)

197-199: Pass explicit radix to parseInt.

Without a radix, parseInt infers the base from the string: a 0x prefix is parsed as hexadecimal, and pre-ES5 engines treated a leading 0 as octal. Always specify radix 10 explicitly so the value is read as decimal.

🔧 Proposed fix
-    const maxCharacters = parseInt(
-      process.env.KANI_MAX_CHARACTERS || String(KANI_DEFAULTS.MAX_CHARACTERS)
+    const maxCharacters = parseInt(
+      process.env.KANI_MAX_CHARACTERS || String(KANI_DEFAULTS.MAX_CHARACTERS),
+      10
     );
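
A slightly more defensive variant also guards against non-numeric values, which the `||` fallback alone does not catch. This is a sketch under assumptions: `parseEnvInt` is a hypothetical helper, and treating zero or negative values as invalid is a choice made here, not something the PR specifies.

```typescript
// Hypothetical helper: parse a positive decimal integer from an env value,
// returning the fallback when the value is missing, non-numeric, or not positive.
export function parseEnvInt(raw: string | undefined, fallback: number): number {
  const parsed = Number.parseInt(raw ?? "", 10);
  return Number.isNaN(parsed) || parsed <= 0 ? fallback : parsed;
}
```

With radix 10, a value such as `"0500"` parses as 500, and `"0x10"` parses as 0 (parsing stops at `x`), which this helper then replaces with the fallback.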

371-375: shutdown doesn't kill any in-flight synthesis child process.

If synthesize() is in progress when shutdown() is called, the spawned Python process will continue running. Consider tracking the active child process and killing it on shutdown.
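
A minimal sketch of that tracking, assuming a small helper class that the engine would own; `ChildTracker` and its method names are hypothetical and not part of this PR:

```typescript
import { spawn, ChildProcess } from "node:child_process";

// Hypothetical helper: remembers spawned children so shutdown can terminate them.
export class ChildTracker {
  private readonly active = new Set<ChildProcess>();

  // Spawn a process and keep it registered until it closes or fails to start.
  spawnTracked(command: string, args: string[]): ChildProcess {
    const child = spawn(command, args);
    this.active.add(child);
    const forget = (): void => {
      this.active.delete(child);
    };
    child.once("close", forget);
    child.once("error", forget);
    return child;
  }

  get size(): number {
    return this.active.size;
  }

  // Send a signal to every tracked child; returns how many were signalled.
  killAll(signal: NodeJS.Signals = "SIGTERM"): number {
    let signalled = 0;
    for (const child of this.active) {
      if (child.kill(signal)) signalled++;
    }
    this.active.clear();
    return signalled;
  }
}
```

In this shape, `synthesize()` would call `spawnTracked(this.pythonPath, args)` in place of `spawn`, and `shutdown()` would call `killAll()`.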

Comment thread src/core/kani.engine.ts
Comment on lines +265 to +317
return new Promise<string>((resolve, reject) => {
  const childProcess = spawn(this.pythonPath, args);

  let stderrData = "";
  let stdoutData = "";

  if (childProcess.stderr) {
    childProcess.stderr.on("data", (data) => {
      const chunk = data.toString();
      stderrData += chunk;
      logger.log("[KaniEngine] stderr:", chunk.trim());
    });
  }

  if (childProcess.stdout) {
    childProcess.stdout.on("data", (data) => {
      const chunk = data.toString();
      stdoutData += chunk;
      logger.log("[KaniEngine] stdout:", chunk.trim());
    });
  }

  childProcess.on("error", (error) => {
    logger.error("[KaniEngine] process error:", error);
    reject(error);
  });

  const startTime = Date.now();
  logger.log(
    "[KaniEngine] TTS process started. Waiting for completion..."
  );

  childProcess.on("close", (code) => {
    const duration = Date.now() - startTime;
    logger.log(
      `[KaniEngine] Process closed with code ${code} after ${duration}ms`
    );

    if (code === 0 && fs.existsSync(outputFile)) {
      logger.log(
        `[KaniEngine] Synthesis completed successfully in ${duration}ms`
      );
      resolve(outputFile);
    } else {
      logger.error(
        `[KaniEngine] Synthesis failed after ${duration}ms with code ${code}`
      );
      logger.error("[KaniEngine] Error output (stderr):", stderrData);
      logger.error("[KaniEngine] Standard output (stdout):", stdoutData);
      reject(new Error("Kani synthesis failed"));
    }
  });
});

⚠️ Potential issue | 🟠 Major

No timeout on the synthesis child process — risk of indefinite hang.

If the Python process hangs (OOM, GPU deadlock, model download stall), this promise never settles, blocking the caller forever. Consider adding a configurable timeout that kills the child process.

⏱️ Proposed fix — add a timeout
     return new Promise<string>((resolve, reject) => {
       const childProcess = spawn(this.pythonPath, args);
+      const TIMEOUT_MS = 300_000; // 5 minutes
+      const timer = setTimeout(() => {
+        childProcess.kill("SIGTERM");
+        reject(new Error(`Kani synthesis timed out after ${TIMEOUT_MS / 1000}s`));
+      }, TIMEOUT_MS);

       let stderrData = "";
       let stdoutData = "";
       // ... existing stream handlers ...

       childProcess.on("close", (code) => {
+        clearTimeout(timer);
         const duration = Date.now() - startTime;
         // ... existing close handler ...
       });
     });
🤖 Prompt for AI Agents
In `@src/core/kani.engine.ts` around lines 265 - 317, Add a configurable timeout
to the Promise that spawns the Python TTS process so the caller cannot hang
indefinitely: introduce a timeout value (e.g., this.synthesisTimeout or a
parameter) and start a timer after spawning the process; if the timer fires
before childProcess closes, call childProcess.kill() (or
childProcess.kill('SIGKILL') as appropriate), log the timeout including
stderrData/stdoutData, clear the timer on normal close, and reject the Promise
with a descriptive timeout Error instead of leaving it unresolved; ensure you
reference the existing childProcess variable, spawn call (this.pythonPath,
args), outputFile existence check, and clear the timer in the
childProcess.on("close") and childProcess.on("error") handlers.
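
The prompt above can be sketched as a standalone function. This is a simplified illustration, not the engine's code: `runWithTimeout` is a hypothetical name, output streams are ignored instead of logged, and the promise resolves with the exit code instead of an output file path. The `settled` flag makes sure the timer is cleared on both the `close` and `error` paths and that the promise settles only once.

```typescript
import { spawn } from "node:child_process";

// Hypothetical helper: run a command and reject if it outlives timeoutMs.
export function runWithTimeout(
  command: string,
  args: string[],
  timeoutMs: number
): Promise<number | null> {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args, { stdio: "ignore" });
    let settled = false;

    // Settle exactly once and always clear the timer.
    const settle = (action: () => void): void => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      action();
    };

    const timer = setTimeout(() => {
      child.kill("SIGKILL"); // the process is presumed hung, so do not wait for cleanup
      settle(() => reject(new Error(`Process timed out after ${timeoutMs}ms`)));
    }, timeoutMs);

    child.on("error", (err) => settle(() => reject(err)));
    child.on("close", (code) => settle(() => resolve(code)));
  });
}
```

Whether to send `SIGTERM` first and escalate to `SIGKILL` after a grace period is a design choice; the sketch uses `SIGKILL` directly for brevity.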


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d31b169f9f


Comment on lines +28 to +29
} else if (envEngine === "kani") {
  this.engineType = "kani";

P2 Badge Handle Kani limits in HTTP pre-validation

By adding envEngine === "kani" here, the service now exposes a third engine type, but the /tts HTTP handler still only branches on kokoro vs fallback (src/server.ts lines 109-114), so Kani requests are pre-validated against CHATTERBOX_MAX_CHARACTERS instead of KANI_MAX_CHARACTERS. In deployments where those limits differ, valid Kani input is rejected early (or error semantics become inconsistent), so the server-side limit selection needs a dedicated Kani branch.
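
One way to avoid a missing branch is to key the limit lookup on the engine type. The sketch below is an assumption-laden illustration: `resolveMaxCharacters` is a hypothetical helper, `KOKORO_MAX_CHARACTERS` is an assumed variable name (only the Chatterbox and Kani names appear in this review), and the fallback value is supplied by the caller because the real defaults are not shown here.

```typescript
type EngineType = "chatterbox" | "kokoro" | "kani";

// Record<EngineType, ...> forces a compile error if an engine is added without an entry.
const MAX_CHARACTERS_ENV: Record<EngineType, string> = {
  chatterbox: "CHATTERBOX_MAX_CHARACTERS",
  kokoro: "KOKORO_MAX_CHARACTERS", // assumed name, not confirmed by the source
  kani: "KANI_MAX_CHARACTERS",
};

// Hypothetical helper: pick the character limit for the active engine.
export function resolveMaxCharacters(
  engine: EngineType,
  env: Record<string, string | undefined>,
  fallback: number
): number {
  const parsed = Number.parseInt(env[MAX_CHARACTERS_ENV[engine]] ?? "", 10);
  return Number.isNaN(parsed) ? fallback : parsed;
}
```

The HTTP handler could then call `resolveMaxCharacters(engineType, process.env, defaultLimit)` so that every engine is pre-validated against its own limit.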


Copilot AI left a comment


Pull request overview

This pull request adds support for Kani TTS 2, a new text-to-speech engine that uses HuggingFace-hosted models for voice synthesis with advanced features like voice cloning via speaker embeddings and fine-grained control over generation parameters.

Changes:

  • Added complete Kani TTS 2 engine implementation with automatic dependency installation and HuggingFace model integration
  • Extended the MCP tools API to support Kani-specific parameters (temperature, top_p, repetition_penalty, model)
  • Added comprehensive test coverage for Kani engine selection, capabilities, and parameter validation

Reviewed changes

Copilot reviewed 9 out of 10 changed files in this pull request and generated 3 comments.

Show a summary per file
| File | Description |
| --- | --- |
| src/core/kani.engine.ts | New Kani TTS 2 engine implementation with Python integration, dependency management, and synthesis capabilities |
| src/core/tts-service.factory.ts | Updated factory to include "kani" as a valid engine type and instantiate KaniEngine |
| src/core/tts-engine.interface.ts | Extended TTSOptions interface with Kani-specific parameters (temperature, top_p, repetition_penalty, model) |
| src/core/tts-constants.ts | Added KANI_DEFAULTS and KANI_LIMITS constants for configuration and validation |
| src/mcp/tools.ts | Extended synthesizeText tool to accept and pass through Kani parameters, updated tool description and schema |
| scripts/kani_runner.py | Python script that interfaces with kani-tts library for audio generation with validation and error handling |
| scripts/requirements-kani.txt | Python dependencies for Kani TTS 2 (kani-tts-2, transformers, soundfile, numpy) |
| tests/core/tts-service.factory.test.ts | Added comprehensive test cases for Kani engine selection, capabilities, and validation |
| .env.example | Added Kani TTS 2 configuration section with environment variables and documentation |
| package-lock.json | Version bump to 0.3.0 (should be 0.4.0) |


Comment thread src/core/kani.engine.ts
Comment on lines +223 to +254
const referenceAudio =
  options?.referenceAudio || process.env.KANI_REFERENCE_AUDIO || "";
const languageTag =
  options?.language ?? (process.env.KANI_LANGUAGE_TAG || "");

const outputDir =
  process.env.KANI_OUTPUT_DIR ||
  path.join(os.tmpdir(), COMMON_CONSTANTS.TEMP_DIR_NAME);
if (!fs.existsSync(outputDir)) {
  fs.mkdirSync(outputDir, { recursive: true });
}

const outputFile = path.join(outputDir, `kani-tts-${Date.now()}.wav`);

const args = [
  this.scriptPath,
  "--text",
  text,
  "--output",
  outputFile,
  "--model",
  model,
  "--temperature",
  String(temperature),
  "--top_p",
  String(topP),
  "--repetition_penalty",
  String(repetitionPenalty),
];

if (referenceAudio) {
  args.push("--reference_audio", referenceAudio);

Copilot AI Feb 16, 2026


The reference audio path should be validated before being passed to the Python script, similar to how ChatterboxEngine does it. Without validation, this could allow path traversal attacks, access to non-existent files, or unsupported file formats. Consider implementing a validateReferenceAudioPath method (similar to ChatterboxEngine lines 157-211) and calling it here when referenceAudio is provided.
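
As a rough illustration of the kind of checks meant here, the sketch below validates a path before it is handed to the Python script. It is not a reproduction of ChatterboxEngine's method, whose details are not shown in this review; the allowed extension list is an assumption, and the function does not confine the path to a particular directory.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Assumed set of accepted formats; the real list may differ.
const ASSUMED_AUDIO_EXTENSIONS = new Set([".wav", ".mp3", ".flac"]);

// Hypothetical validator: returns the resolved absolute path or throws.
export function validateReferenceAudioPath(input: string): string {
  if (input.includes("\0")) {
    throw new Error("Reference audio path contains a null byte");
  }
  const resolved = path.resolve(input); // normalizes any ".." segments
  const ext = path.extname(resolved).toLowerCase();
  if (!ASSUMED_AUDIO_EXTENSIONS.has(ext)) {
    throw new Error(`Unsupported reference audio format: ${ext || "(none)"}`);
  }
  const stats = fs.statSync(resolved, { throwIfNoEntry: false });
  if (!stats || !stats.isFile()) {
    throw new Error(`Reference audio not found or not a file: ${resolved}`);
  }
  return resolved;
}
```

Passing the returned absolute path to `--reference_audio` keeps the Python side from interpreting a relative path against a different working directory.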

Comment thread .env.example

# TTS Engine Selection
-# Options: "chatterbox" (default), "kokoro"
+# Options: "chatterbox" (default), "kokoro", "kani"

Copilot AI Feb 16, 2026


The README.md should be updated to document the new Kani TTS 2 engine. Currently it states "Local Voice MCP supports two TTS engines" (line 27) but with this PR there are now three engines. A new section should be added describing Kani TTS 2's features, setup instructions, and configuration options, similar to the sections for Chatterbox and Kokoro engines.

Suggested change
-# Options: "chatterbox" (default), "kokoro", "kani"
+# Options: "chatterbox" (default), "kokoro", "kani" (Kani TTS 2)

Comment thread .env.example

# TTS Engine Selection
-# Options: "chatterbox" (default), "kokoro"
+# Options: "chatterbox" (default), "kokoro", "kani"

Copilot AI Feb 16, 2026


The CHANGELOG.md should be updated to document the addition of Kani TTS 2 support in this release. This is a significant new feature that should be documented in the "Added" section, including key capabilities like voice cloning via speaker embeddings, HuggingFace model integration, and the new parameters (temperature, top_p, repetition_penalty, model).


3 participants