Fix: Support bare multiplier words in English number recognition #3209

Open
BegBedion wants to merge 1 commit into microsoft:master from BegBedion:fix/bare-multiplier-words
Summary

This PR fixes issue #3208 by recognizing bare multiplier words such as 'hundred', 'thousand', and 'million' without requiring an explicit coefficient prefix.

Problem

Previously, the English number recognizer only recognized multiplier words when prefixed with a coefficient:

  • ✓ 'one hundred dollars' → $100
  • ✓ 'a hundred dollars' → $100
  • ✗ 'hundred dollars' → Not recognized (should be $100)

This affected all NumberWithUnit models (Currency, Age, Temperature, etc.) since they depend on the Number extractor.

Solution

Modified the SeparaIntRegex pattern in Patterns/English/English-Numbers.yaml to match standalone round number words by adding ({RoundNumberIntegerRegex}) as a further alternative.

The pattern now supports three cases:

  1. Base numbers optionally followed by round numbers (e.g., 'twenty hundred')
  2. Articles followed by round numbers (e.g., 'a hundred')
  3. Standalone round numbers (e.g., 'hundred') - NEW
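
The three cases can be illustrated with a deliberately simplified Python sketch. The sub-patterns `BASE`, `ROUND`, and `ARTICLE` below are hypothetical stand-ins with tiny vocabularies, not the actual definitions from English-Numbers.yaml, and the helper `first_match` exists only for this illustration. The point is to show how adding a third alternative changes what matches.

```python
import re

# Hypothetical, heavily simplified stand-ins for the YAML sub-patterns.
# The real definitions in English-Numbers.yaml are far more complete.
BASE = r"(?:one|two|three|ten|twenty)"
ROUND = r"(?:hundred|thousand|million|billion|trillion)"
ARTICLE = r"(?:an?)"

# Before: a base number (optionally followed by round numbers),
# or an article followed by at least one round number.
OLD = rf"\b(?:{BASE}(?:\s+{ROUND})*|{ARTICLE}(?:\s+{ROUND})+)\b"

# After: the same two alternatives plus a standalone round number.
NEW = rf"\b(?:{BASE}(?:\s+{ROUND})*|{ARTICLE}(?:\s+{ROUND})+|{ROUND})\b"


def first_match(pattern, text):
    """Return the first matched substring, or None if nothing matches."""
    m = re.search(pattern, text, re.IGNORECASE)
    return m.group(0) if m else None


if __name__ == "__main__":
    for phrase in ("twenty hundred", "a hundred dollars", "hundred dollars"):
        print(phrase, "| old:", first_match(OLD, phrase),
              "| new:", first_match(NEW, phrase))
```

In this sketch the new alternative is placed last, so phrases that already matched one of the first two cases keep matching the same way.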

Changes

  • Modified: Patterns/English/English-Numbers.yaml - Updated SeparaIntRegex definition

Testing

This change should be tested with:

  • Recognition of bare multiplier words: 'hundred', 'thousand', 'million', 'billion', 'trillion'
  • Combinations with currency: 'hundred dollars', 'thousand euros', 'million pounds'
  • Combinations with other units through NumberWithUnit models
  • Backward compatibility with existing patterns
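
As a rough sketch of what such test cases might assert, the snippet below pairs the example phrases with the values expected under the standard English short scale. The `bare_multiplier_value` helper and the `CASES` table are hypothetical and exist only for illustration. They are not part of the recognizer or its test suite, which uses its own spec format.

```python
import re

# Short-scale values for the bare multiplier words listed above.
ROUND_VALUES = {
    "hundred": 100,
    "thousand": 1_000,
    "million": 1_000_000,
    "billion": 1_000_000_000,
    "trillion": 1_000_000_000_000,
}

# Hypothetical expectations: (input phrase, expected numeric value).
CASES = [
    ("hundred dollars", 100),
    ("thousand euros", 1_000),
    ("million pounds", 1_000_000),
]


def bare_multiplier_value(text):
    """Return the value of a leading bare multiplier word, or None."""
    m = re.match(r"\s*(hundred|thousand|million|billion|trillion)\b",
                 text, re.IGNORECASE)
    return ROUND_VALUES[m.group(1).lower()] if m else None


if __name__ == "__main__":
    for phrase, expected in CASES:
        assert bare_multiplier_value(phrase) == expected, phrase
    print("all illustrative cases pass")
```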

Closes #3208

Allow recognition of bare multiplier words like 'hundred', 'thousand',
'million', etc. without requiring a coefficient prefix. This enables
patterns like 'hundred dollars' to be correctly recognized as $100.

Previously only patterns with explicit coefficients were supported:
- 'one hundred dollars' ✓
- 'a hundred dollars' ✓
- 'hundred dollars' ✗ (was not recognized)

Fixes microsoft#3208

Development

Successfully merging this pull request may close these issues.

Issue with recognizing bare multiplier words like "hundred dollars"

1 participant