docs: add timestamped TTS stream reference#79

Merged
Kilerd merged 3 commits into main from codex/tts-with-timestamp-docs on May 8, 2026

Conversation

Kilerd (Contributor) commented May 8, 2026

Summary

  • add an API reference page for POST /v1/tts/stream/with-timestamp
  • document SSE audio chunks, timestamp alignment handling, and format guidance
  • link the timestamped streaming endpoint from the core Text to Speech guide

Testing

  • npm run check:openapi
  • npx prettier --check docs.json api-reference/endpoint/openapi-v1/text-to-speech-stream-with-timestamps.mdx
  • git diff --check

Summary by CodeRabbit

  • New Features

    • Streaming text-to-speech endpoint with real-time timestamp alignment data for synchronized audio and text delivery via server-sent events.
    • Added required model parameter to TTS API for model selection.
  • Documentation

    • New comprehensive guides for streaming TTS with timestamps, including format recommendations and implementation examples.

coderabbitai Bot commented May 8, 2026

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5edfd0ca-bbb0-4387-ba99-cd5ad8691011

📥 Commits

Reviewing files that changed from the base of the PR and between fc31983 and 118cd13.

📒 Files selected for processing (4)
  • api-reference/endpoint/openapi-v1/text-to-speech-stream-with-timestamps.mdx
  • api-reference/openapi.json
  • developer-guide/core-features/text-to-speech.mdx
  • docs.json
📝 Walkthrough

This PR introduces comprehensive documentation for a new TTS streaming endpoint that returns server-sent events containing audio chunks with optional text alignment timing. The change includes OpenAPI schema definitions, detailed endpoint reference documentation with code examples, developer guide integration, and navigation updates.

Changes

TTS Streaming with Timestamps Feature Documentation

| Layer / File(s) | Summary |
| --- | --- |
| Data Contract: api-reference/openapi.json | New schemas TTSTimestampAlignment and TTSTimestampSegment define the timing structure. TTSTimestampStreamEvent is updated to reference the alignment type. Required model header parameter is added to POST /v1/tts. |
| API Specification: api-reference/openapi.json | POST /v1/tts/stream/with-timestamp operation documentation includes the required model header, a detailed 200-response description of SSE event handling, and extensive examples showing alignment events followed by audio-only chunks. Code samples added for both endpoints. |
| Endpoint Reference: api-reference/endpoint/openapi-v1/text-to-speech-stream-with-timestamps.mdx | New endpoint reference page documents the SSE response format, alignment structure and timing interpretation, a minimal curl request, SSE parsing implementations in Python and Node.js, and format selection guidance (preferring opus over mp3). |
| Integration: developer-guide/core-features/text-to-speech.mdx, docs.json | New "Streaming with Timestamps" subsection added to the core features guide with format recommendations. Page added to REST API navigation under TTS & ASR (v1). Support contact formatting updated. |
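The SSE handling summarized in the table can be sketched roughly as follows. This is a minimal illustration rather than the parser from the new reference page: the payload field names `audio_base64`, `alignment`, and `segments` are hypothetical placeholders (the PR page does not show the TTSTimestampStreamEvent fields), and the stream is a canned list of lines instead of a live HTTP response.

```python
import base64
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the `data:` payload of each SSE event; a blank line ends an event."""
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format strips one optional leading space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) and ":" comments are ignored here.


def collect(payloads: Iterable[str]) -> tuple[bytes, list[dict]]:
    """Concatenate audio chunks and gather whatever alignment objects appear."""
    audio = bytearray()
    alignments: list[dict] = []
    for payload in payloads:
        event = json.loads(payload)
        # Hypothetical field names; check the real schema before relying on them.
        audio.extend(base64.b64decode(event["audio_base64"]))
        if event.get("alignment") is not None:
            alignments.append(event["alignment"])
    return bytes(audio), alignments


# Canned stream: one event with alignment, then an audio-only chunk.
sample_stream = [
    'data: {"audio_base64": "AAE=", "alignment": '
    '{"segments": [{"text": "Hi", "start": 0.0, "end": 0.4}]}}\n',
    "\n",
    'data: {"audio_base64": "AgM=", "alignment": null}\n',
    "\n",
]

audio, alignments = collect(iter_sse_data(sample_stream))
print(len(audio), len(alignments))
```

Treating the alignment as optional on every event matches the walkthrough's description of alignment events followed by audio-only chunks, so a client can keep appending audio without waiting for timing data.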

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

  • fishaudio/docs#51: Modifies OpenAPI spec's TTS model enum and header handling for POST /v1/tts.
  • fishaudio/docs#55: Updates the required model header and default value to "s2-pro" in TTS OpenAPI spec.
  • fishaudio/docs#65: Updates TTS request/Prosody schemas and sample documentation in OpenAPI specification.

Suggested reviewers

  • twangodev

Poem

🐰 A streaming path for voices clear,
With chunks of audio drawing near,
Each segment's timing marked just right,
The docs now guide both day and night! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'docs: add timestamped TTS stream reference' directly and specifically describes the main change: adding documentation for a timestamped text-to-speech streaming endpoint. It accurately summarizes the primary purpose of the changeset. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Kilerd force-pushed the codex/tts-with-timestamp-docs branch from fc31983 to 18b3e99 on May 8, 2026 06:55

mintlify Bot commented May 8, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

| Project | Status | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| hanabiaiinc | 🟢 Ready | View Preview | May 8, 2026, 6:57 AM |

coderabbitai Bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
api-reference/openapi.json (1)

2625-3579: ⚡ Quick win

Consider deduplicating the response examples block.

The 200 response duplicates the same alignment payload in two places: once inside schema.examples (an array, lines 2629–3083) and again under content["text/event-stream"].examples as a map of named examples (lines 3115–3578). This roughly doubles the size of the response definition. OpenAPI tooling typically consumes the named examples map for rendering; the in-schema examples array is rarely necessary alongside it. Removing one of the two would significantly shrink the spec without losing rendered documentation.
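One way to confirm the duplication before deleting either copy is to compare the two example collections structurally. The sketch below assumes the media type object has been loaded as a Python dict; the helper name `duplicated_examples` and the sample payload fields are illustrative and not taken from the spec.

```python
import json


def duplicated_examples(media_type: dict) -> list[str]:
    """Return names of named examples whose value also appears in schema.examples."""
    schema_examples = media_type.get("schema", {}).get("examples", [])
    # Serialize with sorted keys so key order does not affect the comparison.
    canonical = {json.dumps(example, sort_keys=True) for example in schema_examples}
    named = media_type.get("examples", {})
    return [
        name
        for name, example in named.items()
        if json.dumps(example.get("value"), sort_keys=True) in canonical
    ]


# Illustrative stand-in for content["text/event-stream"] in the 200 response.
media_type = {
    "schema": {"examples": [{"audio_base64": "AAE=", "alignment": None}]},
    "examples": {
        "audio_only": {"value": {"alignment": None, "audio_base64": "AAE="}},
        "with_other_audio": {"value": {"audio_base64": "AgM=", "alignment": None}},
    },
}

print(duplicated_examples(media_type))  # ['audio_only']
```

If every named example is reported as a duplicate, dropping the in-schema array loses nothing; any name that is not reported marks a payload that exists in only one place and should be reviewed first.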

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api-reference/openapi.json` around lines 2625 - 3579, The OpenAPI response
duplicates large example payloads: the TTSTimestampStreamEvent's schema.examples
array and the content["text/event-stream"].examples map both contain the same
alignment/audio examples; remove one to reduce spec size—prefer keeping the
named examples under content["text/event-stream"].examples (used by tooling) and
delete the schema.examples array (or vice versa) so only a single source of
truth remains for TTSTimestampStreamEvent examples.
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 51b4991b-20cc-4007-96ef-4bdb2374101f

📥 Commits

Reviewing files that changed from the base of the PR and between 2569668 and fc31983.

📒 Files selected for processing (4)
  • api-reference/endpoint/openapi-v1/text-to-speech-stream-with-timestamps.mdx
  • api-reference/openapi.json
  • developer-guide/core-features/text-to-speech.mdx
  • docs.json

Comment on lines +2405 to +2422
        "parameters": [
          {
            "in": "header",
            "name": "model",
            "description": "Specify which TTS model to use. We recommend `s2-pro`.",
            "required": true,
            "schema": {
              "default": "s2-pro",
              "enum": [
                "s1",
                "s2-pro"
              ],
              "title": "Model",
              "type": "string"
            },
            "deprecated": false
          }
        ],
⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

OpenAPI 3.1 required parameter with default value semantics

💡 Result:

In OpenAPI 3.1, a parameter marked required cannot also rely on default semantics to make the parameter “optional”. The spec’s intent is: required controls validation/contract that the client must supply the value; default documents server behavior when the value is missing (so the default value is only applicable when the parameter is not required).

1) What “required” means for parameters

  • The OpenAPI Parameter Object defines required as a boolean that determines whether the parameter is mandatory; it defaults to false for non-path parameters. For path parameters, required MUST be true. [1]

2) What “default” means for parameters (and where it must be placed)

  • For request parameters, the default value is specified using the default keyword inside the parameter’s schema (not at the top level of the Parameter Object). Swagger’s parameter documentation states: “Use the default keyword in the parameter schema to specify the default value for an optional parameter… The default value is the one that the server uses if the client does not supply the parameter value.” [2]
  • Swagger also calls out a mistake: “Using default with required parameters… does not make sense – if a value is required, the client must always send it, and the default value is never used.” [2]
  • A concrete example of correct placement (schema default vs. Parameter Object default) is reflected in the OpenAPI-upstream behavior discussed in an issue: putting default at the Parameter Object level doesn’t get mapped; it should be inside schema, e.g. schema: { default: 0 }. [3]

3) Answer to the specific question: “OpenAPI 3.1 required parameter with default value semantics”

  • If a parameter has required: true, then by definition clients must always provide it, so default value semantics are not applicable for validation/absence, because the value will not be missing. This is consistent with Swagger’s guidance that default + required “does not make sense… default value is never used.” [2]

Practical guidance

  • If you want “client may omit the parameter; server uses default”, set required: false (or omit required) and set schema.default to the desired value. [2]
  • If you want “client must provide the parameter”, set required: true and do not depend on default for missing values. [2][1]

Example (optional query param with default)

    parameters:
      - name: page
        in: query
        required: false
        schema:
          type: integer
          default: 0

Example (required param)

    parameters:
      - name: page
        in: query
        required: true
        schema:
          type: integer
          # (no default relied upon for absence)

Citations:


🏁 Script executed:

# Check the git history of the openapi.json file to see if model parameter changed recently
cd api-reference && git log --oneline -20 openapi.json 2>/dev/null || echo "No git history available"

Repository: fishaudio/docs

Length of output: 109


🏁 Script executed:

# Search for the model parameter definition in the file to understand full context
rg -A 20 -B 5 '"name": "model"' api-reference/openapi.json | head -100

Repository: fishaudio/docs

Length of output: 1471


🏁 Script executed:

# Check if there are any other references to the model parameter or TTS endpoint
rg -i "tts|model.*header" api-reference/openapi.json | head -20

Repository: fishaudio/docs

Length of output: 2944


🏁 Script executed:

# Check what changed in the recent commit to understand if required field was modified
git show 18b3e99 --stat

Repository: fishaudio/docs

Length of output: 9593


🏁 Script executed:

# Get more detail on what changed to the model parameter specifically
git show 18b3e99 -- api-reference/openapi.json | grep -A 30 -B 5 '"name": "model"' | head -80

Repository: fishaudio/docs

Length of output: 1999


Fix semantic contradiction: required: true combined with a schema default.

The model header has both "required": true and "default": "s2-pro". The OpenAPI 3.1 specification does not forbid this combination outright, but it is contradictory: a default only takes effect when a parameter is omitted, and a required parameter must always be supplied, so the default is unreachable and can mislead readers and SDK generators.

Either set "required": false (clients may omit; server uses default) or remove the default field entirely (clients must provide a value).

Suggested fix
           {
             "in": "header",
             "name": "model",
             "description": "Specify which TTS model to use. We recommend `s2-pro`.",
-            "required": true,
+            "required": false,
             "schema": {
               "default": "s2-pro",
               "enum": [
                 "s1",
                 "s2-pro"
               ],
               "title": "Model",
               "type": "string"
             },
             "deprecated": false
           }
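A check like this could be automated so the same pattern is caught on other endpoints. The following is a minimal sketch that walks an OpenAPI document loaded as a dict and reports every parameter declared both required and with a schema default; the function name `required_with_default` is hypothetical, and the sample spec only reproduces the model header quoted above plus an invented optional parameter for contrast.

```python
from typing import Iterator

HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def required_with_default(spec: dict) -> Iterator[tuple[str, str, str]]:
    """Yield (path, METHOD, parameter name) for required parameters that set a default."""
    for path, path_item in spec.get("paths", {}).items():
        for method, operation in path_item.items():
            if method not in HTTP_METHODS:
                continue  # skip path-level keys such as "summary" or "parameters"
            for param in operation.get("parameters", []):
                if param.get("required") and "default" in param.get("schema", {}):
                    yield path, method.upper(), param["name"]


sample_spec = {
    "paths": {
        "/v1/tts": {
            "post": {
                "parameters": [
                    {
                        "in": "header",
                        "name": "model",
                        "required": True,
                        "schema": {
                            "default": "s2-pro",
                            "enum": ["s1", "s2-pro"],
                            "type": "string",
                        },
                    },
                    {
                        # Invented optional parameter: a default here is fine.
                        "in": "query",
                        "name": "page",
                        "required": False,
                        "schema": {"type": "integer", "default": 0},
                    },
                ]
            }
        }
    }
}

print(list(required_with_default(sample_spec)))  # [('/v1/tts', 'POST', 'model')]
```

The sketch does not resolve `$ref` parameters or path-level parameter lists, so a real lint step would need to handle those separately.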
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api-reference/openapi.json` around lines 2405 - 2422, The OpenAPI parameter
"model" currently has a semantic contradiction: it is marked required:true while
also providing a default ("s2-pro"); choose one resolution—either remove the
default field from the "model" header parameter so it remains required and
clients must supply it, or set "required": false so the default "s2-pro" is
meaningful; update the "model" parameter entry in the parameters array
accordingly (leave its enum/title/type intact) and ensure only one of required
or default is present.

Comment on lines 2586 to +2598
             "required": true,
             "schema": {
-              "type": "string",
               "default": "s2-pro",
               "enum": [
                 "s1",
                 "s2-pro"
-              ]
+              ],
+              "title": "Model",
+              "type": "string"
             },
             "deprecated": false
           }
         ],
⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Same required/default inconsistency on the timestamp stream endpoint.

The new /v1/tts/stream/with-timestamp parameter declaration mirrors the same pattern: required: true with a default of "s2-pro". Apply the same fix here for consistency and to avoid OpenAPI-3.1 spec ambiguity.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@api-reference/openapi.json` around lines 2586 - 2598, The parameter
declaration for the /v1/tts/stream/with-timestamp endpoint has `required: true`
but also a `default: "s2-pro"`, which is inconsistent; update the parameter in
the OpenAPI fragment that defines the Model (enum ["s1","s2-pro"]) by removing
the `default: "s2-pro"` or alternatively set `required: false`—prefer removing
the default so the parameter remains required and the enum/title/type remain
unchanged.
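The two resolutions described in the review can be expressed as a small transformation on a parameter object. This is only a sketch of the choice being discussed; `resolve_required_default` and its `keep_required` flag are hypothetical names, not part of any OpenAPI tooling.

```python
import copy


def resolve_required_default(param: dict, keep_required: bool = True) -> dict:
    """Return a copy of the parameter with the required/default conflict resolved."""
    fixed = copy.deepcopy(param)
    if not (fixed.get("required") and "default" in fixed.get("schema", {})):
        return fixed  # nothing to resolve
    if keep_required:
        # Option preferred in the prompt above: clients must always send a value.
        del fixed["schema"]["default"]
    else:
        # Alternative: clients may omit the header and the default applies.
        fixed["required"] = False
    return fixed


model_param = {
    "in": "header",
    "name": "model",
    "required": True,
    "schema": {"default": "s2-pro", "enum": ["s1", "s2-pro"], "title": "Model", "type": "string"},
    "deprecated": False,
}

print(resolve_required_default(model_param)["schema"])
print(resolve_required_default(model_param, keep_required=False)["required"])
```

Either branch leaves the enum, title, and type untouched, which is the constraint both review prompts ask for.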

Kilerd merged commit 7c055a2 into main on May 8, 2026
6 checks passed
Kilerd deleted the codex/tts-with-timestamp-docs branch on May 8, 2026 07:09