fix(azure): fix double-prosody and 0-1 volume scale when options are provided #41

Merged
OwenMcGirr merged 1 commit into main from fix/issue-40-azure-prosody-value-formatting on Apr 11, 2026
@OwenMcGirr (Collaborator)

Problems (fixes #40)

1. Double-nested <prosody>

The this.properties defaults (rate="medium", pitch="medium", volume=100) are always truthy, so ensureAzureSSMLStructure always entered the first prosody block and wrapped the content in <prosody rate="medium" pitch="medium" volume="100%">. When options were also provided, a second <prosody> element was wrapped around the first, producing invalid double-nested prosody:

<!-- Before fix -->
<voice name="en-US-JennyNeural">
  <prosody rate="fast" pitch="high" volume="80%">
    <prosody rate="medium" pitch="medium" volume="100%">Hello world</prosody>
  </prosody>
</voice>
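
The control flow behind the double wrapper can be sketched roughly as follows. This is a hypothetical reconstruction, not the actual body of ensureAzureSSMLStructure: the `Prosody` interface and the `wrapBeforeFix` function are illustrative names, and only the truthy defaults and the two separate wrapping blocks come from the description above.

```typescript
// Hypothetical reconstruction of the pre-fix flow (names are assumptions).
interface Prosody {
  rate?: string;
  pitch?: string;
  volume?: number;
}

function wrapBeforeFix(text: string, properties: Prosody, options?: Prosody): string {
  let ssml = text;

  // Block 1: "medium", "medium" and 100 are all truthy, so this always runs.
  if (properties.rate || properties.pitch || properties.volume) {
    ssml = `<prosody rate="${properties.rate}" pitch="${properties.pitch}" volume="${properties.volume}%">${ssml}</prosody>`;
  }

  // Block 2: caller options add a second wrapper around the first one.
  if (options && (options.rate || options.pitch || options.volume)) {
    ssml = `<prosody rate="${options.rate}" pitch="${options.pitch}" volume="${options.volume}%">${ssml}</prosody>`;
  }

  return ssml;
}

const defaultProps: Prosody = { rate: "medium", pitch: "medium", volume: 100 };
const nested = wrapBeforeFix("Hello world", defaultProps, { rate: "fast", pitch: "high", volume: 80 });
```

With these inputs `nested` contains two `<prosody>` open tags, matching the "Before fix" output shown above.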

2. Volume scale mismatch

volume in SpeakOptions is typed 0-100, but callers commonly pass a 0-1 float (e.g. 0.8). The template literal appended % directly, producing volume="0.8%" (essentially silent) instead of volume="80%".
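
A minimal illustration of the formatting problem, assuming the attribute was built by interpolating the number directly. The function name `formatVolumeBeforeFix` is hypothetical.

```typescript
// Hypothetical helper mirroring the pre-fix formatting: the number is
// interpolated as-is, with "%" appended.
function formatVolumeBeforeFix(volume: number): string {
  return `volume="${volume}%"`;
}

const fromPercent = formatVolumeBeforeFix(80);   // volume="80%"
const fromFraction = formatVolumeBeforeFix(0.8); // volume="0.8%"
```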

Fix

Merged the two separate prosody-building blocks into one. Options override this.properties defaults. A <prosody> element is only emitted when at least one attribute differs from Azure's implicit defaults (medium/medium/100). Values in the range (0, 1] for volume are detected as fractions and normalised to the 0–100 scale.

<!-- After fix -->
<voice name="en-US-JennyNeural">
  <prosody rate="fast" pitch="high" volume="80%">Hello world</prosody>
</voice>
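
A sketch of what the merged logic could look like, under stated assumptions: `buildProsody`, `normaliseVolume`, `ProsodyOptions`, and `AZURE_DEFAULTS` are illustrative names rather than the library's real API, and the `Math.round` call is an addition here to avoid floating-point noise, not something the PR description mentions.

```typescript
// Illustrative names throughout; this is not the library's actual code.
interface ProsodyOptions {
  rate?: string;
  pitch?: string;
  volume?: number;
}

// Azure's implicit defaults, as stated in the description above.
const AZURE_DEFAULTS = { rate: "medium", pitch: "medium", volume: 100 };

function normaliseVolume(volume: number): number {
  // Values in (0, 1] are read as fractions of full volume.
  // Math.round is an assumption here, guarding against results like 56.00000000000001.
  return volume > 0 && volume <= 1 ? Math.round(volume * 100) : volume;
}

function buildProsody(text: string, properties: ProsodyOptions, options: ProsodyOptions = {}): string {
  // Options override the instance properties, which override Azure's defaults.
  const rate = options.rate ?? properties.rate ?? AZURE_DEFAULTS.rate;
  const pitch = options.pitch ?? properties.pitch ?? AZURE_DEFAULTS.pitch;
  const volume = normaliseVolume(options.volume ?? properties.volume ?? AZURE_DEFAULTS.volume);

  const allDefault =
    rate === AZURE_DEFAULTS.rate &&
    pitch === AZURE_DEFAULTS.pitch &&
    volume === AZURE_DEFAULTS.volume;

  // Emit a single wrapper, and only when something differs from the defaults.
  if (allDefault) return text;
  return `<prosody rate="${rate}" pitch="${pitch}" volume="${volume}%">${text}</prosody>`;
}

const instanceProps: ProsodyOptions = { rate: "medium", pitch: "medium", volume: 100 };
const single = buildProsody("Hello world", instanceProps, { rate: "fast", pitch: "high", volume: 0.8 });
const plain = buildProsody("Hello world", instanceProps);
```

Here `single` carries exactly one `<prosody>` element with volume="80%", and `plain` is returned unwrapped because every value matches the defaults. One consequence of the (0, 1] rule is that a volume of 1 is read as 100%, not 1%.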

Tests added

  • Single <prosody> element when options are provided
  • No <prosody> emitted when all values are at Azure defaults
  • volume=0.8 normalises to volume="80%" not "0.8%"

OwenMcGirr merged commit bcc5d56 into main on Apr 11, 2026
5 checks passed
…are provided

Two bugs in ensureAzureSSMLStructure when rate/pitch/volume options are passed:

1. Double-nested <prosody>: this.properties defaults (rate="medium", pitch="medium",
   volume=100) are always truthy, so the first block always added a default prosody,
   then the options block wrapped it with a second one. Merged both blocks into one:
   options override properties defaults, and a prosody element is only emitted when
   at least one value differs from Azure's implicit defaults.

2. Volume scale: callers commonly pass volume as a 0-1 float (e.g. 0.8), but the
   template literal appended "%" directly, producing volume="0.8%" (essentially silent)
   instead of volume="80%". Values in the range (0, 1] are now treated as fractions
   and normalised to the 0-100 scale before formatting.

Fixes #40
Development

Successfully merging this pull request may close these issues.

AzureTTSClient: prosody values not converted to Azure-accepted formats (pitch/volume)