Skip to content

feat(i18n): add normative keyword internationalization#840

Open
santif wants to merge 6 commits intoFission-AI:mainfrom
santif:feat/add-i18n-normative-keywords
Open

feat(i18n): add normative keyword internationalization#840
santif wants to merge 6 commits intoFission-AI:mainfrom
santif:feat/add-i18n-normative-keywords

Conversation

@santif
Copy link

@santif santif commented Mar 13, 2026

Summary

  • Add language field to openspec/config.yaml for localized normative keyword validation (e.g., es for Spanish: DEBE, DEBERA, DEBERÁ)
  • Implement keyword registry, Unicode-safe regex matching (\p{L} boundaries), and dynamic validation messages
  • Thread language config into the Validator across single-item and bulk validation paths
  • Add 31 tests covering keyword regex, config parsing, and i18n validation (English, Spanish, unknown language fallback, edge cases with English keywords in Spanish text)
  • Document language config in README.md and docs/multi-language.md

Test plan

  • All 47 i18n + validation tests pass (npx vitest run test/core/validation.test.ts test/core/i18n-keywords.test.ts)
  • Enriched messages tests pass (test/core/validation.enriched-messages.test.ts)
  • Validate command tests pass (test/commands/validate.test.ts)
  • Manual: set language: es in config.yaml and validate a Spanish spec with DEBE/DEBERÁ
  • Manual: verify unknown language code logs warning and falls back to English

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Multi-language normative keyword support: set project language in config to validate and guide using language-specific keywords (e.g., Spanish keywords instead of English).
  • Documentation

    • Added i18n guidance with config examples, supported languages/keywords, fallback behavior, and how to extend the keyword registry.
    • Note: same guidance appears in two README sections (duplicated).
  • Tests

    • Added unit and integration tests covering language parsing, keyword matching and localized messages.

Santiago Fernandez and others added 4 commits March 12, 2026 22:23
…ation

Introduce OpenSpec change artifacts (proposal, specs, design, tasks) for
supporting language-specific normative keywords in validation (e.g.,
DEBE/DEBERÁ for Spanish instead of MUST/SHALL).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… refactor

- Add src/core/i18n/keywords.ts with NORMATIVE_KEYWORDS registry (en/es),
  buildKeywordRegex() using Unicode \p{L} boundaries, formatKeywordMessage()
- Add language field to ProjectConfigSchema with resilient parsing
- Refactor base.schema.ts, spec.schema.ts, change.schema.ts to factory
  functions (createRequirementSchema, createSpecSchema, createChangeSchema)
  with backward-compatible default exports
- Update Validator to accept language param, precompile keyword regex,
  and use language-aware schema validation
- Add 22 new tests (18 keyword + 4 config)

Remaining: thread language into Validator call sites, integration tests,
README docs (tasks 3.4-3.6, 4.1)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ests, and docs

- Convert REQUIREMENT_NO_SHALL, GUIDE_MISSING_SPEC_SECTIONS, and
  GUIDE_SCENARIO_FORMAT to keyword-aware functions in constants.ts
- Thread language from project config into Validator in validate.ts
- Add 10 i18n validation tests covering English, Spanish, unknown
  language fallback, error messages, and English-keyword-in-Spanish
  edge cases
- Document language config in README.md and docs/multi-language.md
- Mark all 13/13 tasks complete

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Archive add-i18n-normative-keywords to openspec/changes/archive/.
Sync delta specs to main: add i18n-normative-keywords spec, update
config-loading and cli-validate specs with language-aware requirements.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@santif santif requested a review from TabishB as a code owner March 13, 2026 03:24
Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 13, 2026

📝 Walkthrough

Walkthrough

This PR adds i18n support for normative keywords: a keyword registry (en, es), config parsing for an optional language field, language-aware regex/message utilities, schema factory functions that accept language, and propagates language through the Validator and CLI validate command. Tests and docs updated accordingly.

Changes

Cohort / File(s) Summary
i18n Infrastructure
src/core/i18n/keywords.ts
New module: NORMATIVE_KEYWORDS, DEFAULT_LANGUAGE, buildKeywordRegex(), formatKeywordMessage(), resolveKeywords() and language-resolution logic with warnings.
Configuration
src/core/project-config.ts
ProjectConfigSchema now accepts optional language?: string; readProjectConfig parses/validates language and emits warnings on invalid values.
Schema Factories
src/core/schemas/base.schema.ts, src/core/schemas/spec.schema.ts, src/core/schemas/change.schema.ts, src/core/schemas/index.ts
Introduced createRequirementSchema/createSpecSchema/createDeltaSchema/createChangeSchema factories (language?: string). Default exports preserved by instantiating factories with 'en'.
Validation & Constants
src/core/validation/validator.ts, src/core/validation/constants.ts
Validator constructor gains optional language param; validator now resolves keywords, precompiles regex, and uses language-aware schemas; validation message constants converted to functions accepting keyword lists.
Commands
src/commands/validate.ts
Added resolveLanguage() helper, import of readProjectConfig, and threaded language argument to Validator instantiation in validation flows.
Docs & Specs
README.md, docs/multi-language.md, openspec/specs/*, openspec/changes/archive/2026-03-13-add-i18n-normative-keywords/*
Added README i18n guidance (duplicated in Usage Notes), multi-language doc page, design/proposal/tasks/specs describing Unicode-aware keyword matching, config examples, and spec scenarios.
Tests
test/core/i18n-keywords.test.ts, test/core/project-config.test.ts, test/core/validation.test.ts
New/expanded tests for keyword regex, message formatting, language resolution, config parsing behavior, and language-aware validation (English/Spanish/unknown-language cases).

Sequence Diagram(s)

sequenceDiagram
    participant Config as Config File (openspec/config.yaml)
    participant Reader as readProjectConfig()
    participant CLI as validate command
    participant Validator as Validator
    participant i18n as Keywords Module
    participant Schema as Schema Factory
    participant Out as Validation Output

    Config->>Reader: read config
    Reader->>CLI: ProjectConfig (language?)
    CLI->>Validator: instantiate(strict, language?)
    Validator->>i18n: resolveKeywords(language?)
    i18n-->>Validator: keywords list
    Validator->>i18n: buildKeywordRegex(keywords)
    Validator->>Schema: createSpecSchema(language)
    Schema-->>Validator: spec/change schemas (language-aware)
    Validator->>Schema: validate(requirement)
    Schema-->>Out: result with localized messages
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

Poem

🐰 Hopped through configs, keywords in tow,

"MUST" and "DEBE" now properly show.
Unicode boundaries snug and tight,
Languages danced into validation's light,
A carrot-sized cheer for regex done right.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 62.50% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely summarizes the main change: adding internationalization support for normative keywords. It directly reflects the core feature being introduced without being vague or misleading.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Tip

CodeRabbit can suggest fixes for GitHub Check annotations.

Configure the reviews.tools.github-checks setting to adjust the time to wait for GitHub Checks to complete.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (2)
test/core/i18n-keywords.test.ts (1)

76-79: Verify: Comma before "or" with two keywords is intentional?

The expected format "MUST", or "SHALL" includes a comma before "or" when there are only two keywords. Standard English typically uses "X or Y" without a comma for two items, reserving the serial/Oxford comma for three or more items (e.g., "X, Y, or Z").

If this is intentional for consistency with the 3-keyword case, consider adding a brief comment in the implementation explaining the style choice. Otherwise, the 2-keyword case might read more naturally as "MUST" or "SHALL".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/core/i18n-keywords.test.ts` around lines 76 - 79, The test expects a
comma before "or" for two keywords but standard style omits that comma; update
the implementation in formatKeywordMessage to special-case when keywords.length
=== 2 and join as '"X" or "Y"' (no comma), and then update the corresponding
test case "should format two keywords" to expect 'Requirement must contain
"MUST" or "SHALL"'; alternatively, if the comma is intentional, add a clarifying
comment in formatKeywordMessage explaining the style choice and keep the test
as-is—reference formatKeywordMessage and the "should format two keywords" test
to locate the changes.
src/core/schemas/base.schema.ts (1)

3-3: Keep the keyword error formatter single-sourced.

formatKeywordMessage() here duplicates VALIDATION_MESSAGES.REQUIREMENT_NO_SHALL() in src/core/validation/constants.ts. If one copy changes, schema errors and validator errors will drift. Reusing the existing validation-message helper would avoid that split brain.

♻️ Small cleanup
-import { buildKeywordRegex, formatKeywordMessage, resolveKeywords } from '../i18n/keywords.js';
+import { buildKeywordRegex, resolveKeywords } from '../i18n/keywords.js';
...
-  const message = formatKeywordMessage(keywords);
+  const message = VALIDATION_MESSAGES.REQUIREMENT_NO_SHALL(keywords);

Also applies to: 13-15

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/schemas/base.schema.ts` at line 3, The duplicated keyword error
formatting in base.schema (using formatKeywordMessage) should be replaced with
the single canonical helper VALIDATION_MESSAGES.REQUIREMENT_NO_SHALL to avoid
divergence; locate the call sites that currently invoke formatKeywordMessage
(and any associated buildKeywordRegex/resolveKeywords usage) and refactor them
to call VALIDATION_MESSAGES.REQUIREMENT_NO_SHALL(...) instead, passing the same
keyword/context parameters so the output remains identical, and remove the local
duplicate formatter import to ensure all schema and validator messages come from
the single source of truth.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@openspec/changes/archive/2026-03-13-add-i18n-normative-keywords/tasks.md`:
- Line 5: The regex in the task description is malformed: replace the incorrect
(?\p{L}) token with a negative lookahead (?!\p{L}) so the intended boundary
becomes (?<!\p{L})... (?!\p{L}) and ensure the `u` flag is specified; update the
`buildKeywordRegex()` description (and any examples) in the task to use this
corrected pattern and keep references to NORMATIVE_KEYWORDS, DEFAULT_LANGUAGE,
and formatKeywordMessage() intact.

In `@openspec/specs/cli-validate/spec.md`:
- Line 29: Remove the inline edit artifact "← (was: hardcoded SHALL)" from the
requirement text so the line reads only "Show compliant example using the
normative keywords from the configured language (e.g., `DEBE` for Spanish,
`SHALL` for English)"; update the string that currently contains that trailing
note (search for the exact sentence containing "Show compliant example using the
normative keywords" or the tokens `DEBE`/`SHALL`) to eliminate the parenthetical
edit note and leave the clean requirement text.

In `@openspec/specs/i18n-normative-keywords/spec.md`:
- Around line 23-27: The spec and implementation disagree: when an unknown
language code is given the spec requires logging a warning listing available
language codes but resolveKeywords (src/core/i18n/keywords.ts) silently falls
back to English and the Validator constructor that calls it doesn't warn; fix by
modifying resolveKeywords to detect an unsupported language and emit a warning
via the project's logging facility (e.g., logger.warn or the existing app
logger) that includes the requested language and the list of available language
codes, or alternatively add that same warning in the Validator constructor
immediately after resolveKeywords is called so the fallback to 'en' is logged.

In `@src/core/i18n/keywords.ts`:
- Around line 50-54: In resolveKeywords, normalize the incoming language (e.g.,
language = language.trim().toLowerCase()) before lookup and change the fallback
behavior so that when a language string is provided but not found in
NORMATIVE_KEYWORDS the function returns undefined (instead of silently returning
NORMATIVE_KEYWORDS[DEFAULT_LANGUAGE]); keep the existing behavior of returning
NORMATIVE_KEYWORDS[DEFAULT_LANGUAGE] only when language is omitted/undefined.
This uses resolveKeywords, NORMATIVE_KEYWORDS and DEFAULT_LANGUAGE to allow
callers to detect unsupported languages and warn.
- Around line 25-28: The current buildKeywordRegex in function buildKeywordRegex
uses (?<!\p{L})... (?!\p{L}) which only blocks adjacent letters and still
matches inside tokens like MUST-API or MUST2; update the lookarounds around
pattern (the pattern variable) to assert that the preceding and following
character are NOT any Unicode letter, number, connector punctuation or dash
punctuation (e.g. use (?<![\p{L}\p{N}\p{Pc}\p{Pd}]) and
(?![\p{L}\p{N}\p{Pc}\p{Pd}]) ) so keywords will only match as standalone tokens,
keeping the existing escaping logic and 'u' flag in new RegExp.

---

Nitpick comments:
In `@src/core/schemas/base.schema.ts`:
- Line 3: The duplicated keyword error formatting in base.schema (using
formatKeywordMessage) should be replaced with the single canonical helper
VALIDATION_MESSAGES.REQUIREMENT_NO_SHALL to avoid divergence; locate the call
sites that currently invoke formatKeywordMessage (and any associated
buildKeywordRegex/resolveKeywords usage) and refactor them to call
VALIDATION_MESSAGES.REQUIREMENT_NO_SHALL(...) instead, passing the same
keyword/context parameters so the output remains identical, and remove the local
duplicate formatter import to ensure all schema and validator messages come from
the single source of truth.

In `@test/core/i18n-keywords.test.ts`:
- Around line 76-79: The test expects a comma before "or" for two keywords but
standard style omits that comma; update the implementation in
formatKeywordMessage to special-case when keywords.length === 2 and join as '"X"
or "Y"' (no comma), and then update the corresponding test case "should format
two keywords" to expect 'Requirement must contain "MUST" or "SHALL"';
alternatively, if the comma is intentional, add a clarifying comment in
formatKeywordMessage explaining the style choice and keep the test
as-is—reference formatKeywordMessage and the "should format two keywords" test
to locate the changes.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: cbda8364-7403-40cd-9fc3-513c6dc294d5

📥 Commits

Reviewing files that changed from the base of the PR and between afdca0d and 6cd2538.

📒 Files selected for processing (24)
  • README.md
  • docs/multi-language.md
  • openspec/changes/archive/2026-03-13-add-i18n-normative-keywords/.openspec.yaml
  • openspec/changes/archive/2026-03-13-add-i18n-normative-keywords/design.md
  • openspec/changes/archive/2026-03-13-add-i18n-normative-keywords/proposal.md
  • openspec/changes/archive/2026-03-13-add-i18n-normative-keywords/specs/cli-validate/spec.md
  • openspec/changes/archive/2026-03-13-add-i18n-normative-keywords/specs/config-loading/spec.md
  • openspec/changes/archive/2026-03-13-add-i18n-normative-keywords/specs/i18n-normative-keywords/spec.md
  • openspec/changes/archive/2026-03-13-add-i18n-normative-keywords/tasks.md
  • openspec/specs/cli-validate/spec.md
  • openspec/specs/config-loading/spec.md
  • openspec/specs/i18n-normative-keywords/spec.md
  • src/commands/validate.ts
  • src/core/i18n/keywords.ts
  • src/core/project-config.ts
  • src/core/schemas/base.schema.ts
  • src/core/schemas/change.schema.ts
  • src/core/schemas/index.ts
  • src/core/schemas/spec.schema.ts
  • src/core/validation/constants.ts
  • src/core/validation/validator.ts
  • test/core/i18n-keywords.test.ts
  • test/core/project-config.test.ts
  • test/core/validation.test.ts

Comment on lines +25 to +28
export function buildKeywordRegex(keywords: string[]): RegExp {
const escaped = keywords.map(k => k.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));
const pattern = `(?<!\\p{L})(${escaped.join('|')})(?!\\p{L})`;
return new RegExp(pattern, 'u');
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
node <<'NODE'
const keywords = ['MUST', 'SHALL'];
const escaped = keywords.map(k => k.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));
const re = new RegExp(`(?<!\\p{L})(${escaped.join('|')})(?!\\p{L})`, 'u');

for (const text of [
  'The system MUST comply.',
  'Integrates with MUST-API.',
  'Connect to SHALL_v2.',
  'Token MUST2 is reserved.',
]) {
  console.log(`${re.test(text)}\t${text}`);
}
NODE

Repository: Fission-AI/OpenSpec

Length of output: 178


🏁 Script executed:

cat -n src/core/i18n/keywords.ts | head -35

Repository: Fission-AI/OpenSpec

Length of output: 1447


🏁 Script executed:

rg "buildKeywordRegex|resolveKeywords" -A 3 -B 1

Repository: Fission-AI/OpenSpec

Length of output: 10057


🏁 Script executed:

node <<'NODE'
// Current broken regex
const keywords = ['MUST', 'SHALL'];
const escaped = keywords.map(k => k.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));
const brokenRe = new RegExp(`(?<!\\p{L})(${escaped.join('|')})(?!\\p{L})`, 'u');

// Proposed fix
const boundary = `[\\p{L}\\p{N}\\p{M}\\p{Pc}-]`;
const fixedRe = new RegExp(`(?<!${boundary})(?:${escaped.join('|')})(?!${boundary})`, 'u');

// Test cases from review (false positives in current)
const falsePositives = [
  'Integrates with MUST-API.',
  'Connect to SHALL_v2.',
  'Token MUST2 is reserved.',
];

// Test cases that should still pass
const validMatches = [
  'The system MUST comply.',
  'The system SHALL do something',
  'MUST do something',
  'The system MUST',
  'El sistema DEBERÁ.',
  'DEBE y DEBERÁ funcionar',
];

// Test cases that should NOT match
const invalidMatches = [
  'MUSTARD is a condiment',
  'El sistema DEBERÁ emitir', // with DEBE only regex
];

console.log('=== FALSE POSITIVES (should be false with fix) ===');
for (const text of falsePositives) {
  console.log(`Current: ${brokenRe.test(text)} | Fixed: ${fixedRe.test(text)} | ${text}`);
}

console.log('\n=== VALID MATCHES (should remain true) ===');
for (const text of validMatches) {
  console.log(`Current: ${brokenRe.test(text)} | Fixed: ${fixedRe.test(text)} | ${text}`);
}

console.log('\n=== INVALID MATCHES (should remain false) ===');
for (const text of invalidMatches) {
  console.log(`Current: ${brokenRe.test(text)} | Fixed: ${fixedRe.test(text)} | ${text}`);
}
NODE

Repository: Fission-AI/OpenSpec

Length of output: 796


🏁 Script executed:

# Test with Spanish keywords to ensure accents still work
node <<'NODE'
const keywords = ['DEBE', 'DEBERA', 'DEBERÁ'];
const escaped = keywords.map(k => k.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));
const fixedRe = new RegExp(`(?<![\\p{L}\\p{N}\\p{M}\\p{Pc}-])(?:${escaped.join('|')})(?![\\p{L}\\p{N}\\p{M}\\p{Pc}-])`, 'u');

const tests = [
  'El sistema DEBE emitir un token',
  'El sistema DEBERA emitir un token',
  'El sistema DEBERÁ emitir un token',
  'El sistema DEBERÁ.',
  'DEBE y DEBERÁ funcionar',
];

console.log('Spanish keywords with fixed regex:');
for (const text of tests) {
  console.log(`${fixedRe.test(text)}\t${text}`);
}
NODE

Repository: Fission-AI/OpenSpec

Length of output: 265


Don't let identifiers like MUST-API satisfy the normative-keyword check.

(?<!\p{L})... (?!\p{L}) only excludes adjacent letters, so configured keywords still match inside tokens such as MUST-API, SHALL_v2, or MUST2. That allows non-normative requirement text to pass validation.

Recommended fix
 export function buildKeywordRegex(keywords: string[]): RegExp {
   const escaped = keywords.map(k => k.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));
-  const pattern = `(?<!\\p{L})(${escaped.join('|')})(?!\\p{L})`;
+  const boundary = `[\\p{L}\\p{N}\\p{M}\\p{Pc}-]`;
+  const pattern = `(?<!${boundary})(?:${escaped.join('|')})(?!${boundary})`;
   return new RegExp(pattern, 'u');
 }
🧰 Tools
🪛 ast-grep (0.41.1)

[warning] 27-27: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(pattern, 'u')
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html

(regexp-from-variable)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/i18n/keywords.ts` around lines 25 - 28, The current
buildKeywordRegex in function buildKeywordRegex uses (?<!\p{L})... (?!\p{L})
which only blocks adjacent letters and still matches inside tokens like MUST-API
or MUST2; update the lookarounds around pattern (the pattern variable) to assert
that the preceding and following character are NOT any Unicode letter, number,
connector punctuation or dash punctuation (e.g. use
(?<![\p{L}\p{N}\p{Pc}\p{Pd}]) and (?![\p{L}\p{N}\p{Pc}\p{Pd}]) ) so keywords
will only match as standalone tokens, keeping the existing escaping logic and
'u' flag in new RegExp.

Comment on lines +50 to +54
export function resolveKeywords(language?: string): string[] {
if (!language) return NORMATIVE_KEYWORDS[DEFAULT_LANGUAGE];
const keywords = NORMATIVE_KEYWORDS[language];
if (keywords) return keywords;
return NORMATIVE_KEYWORDS[DEFAULT_LANGUAGE];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Surface unsupported-language fallback instead of silently validating in English.

This is an exact, silent lookup. Because the CLI now forwards config.language verbatim, values like ES, xx, or whitespace-only strings fall back to English keywords with no warning, so users can think i18n is enabled when it is not. The PR objectives also call out a fallback warning, but this path never exposes one. Normalize before lookup and return enough signal for the caller to warn on fallback.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/i18n/keywords.ts` around lines 50 - 54, In resolveKeywords,
normalize the incoming language (e.g., language = language.trim().toLowerCase())
before lookup and change the fallback behavior so that when a language string is
provided but not found in NORMATIVE_KEYWORDS the function returns undefined
(instead of silently returning NORMATIVE_KEYWORDS[DEFAULT_LANGUAGE]); keep the
existing behavior of returning NORMATIVE_KEYWORDS[DEFAULT_LANGUAGE] only when
language is omitted/undefined. This uses resolveKeywords, NORMATIVE_KEYWORDS and
DEFAULT_LANGUAGE to allow callers to detect unsupported languages and warn.

claude and others added 2 commits March 14, 2026 13:13
- Fix malformed regex typo in tasks.md: (?\p{L}) → (?!\p{L})
- Remove editing artifact '← (was: hardcoded SHALL)' from cli-validate spec
- Add console.warn for unknown language codes in resolveKeywords() per spec
- Fix comma grammar in formatKeywordMessage() for two-keyword case
- Update REQUIREMENT_NO_SHALL in constants.ts to match
- Update tests for warning assertion and corrected message format

https://claude.ai/code/session_011nfbaV82D5YRVT1jQbWDC8
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
src/core/i18n/keywords.ts (1)

25-28: ⚠️ Potential issue | 🟠 Major

Broaden token boundaries so identifier fragments don’t satisfy normative-keyword validation.

Line 27 only blocks adjacent letters, so keywords still match inside tokens like MUST-API, MUST2, or MUST_v2, which can incorrectly pass validation.

Suggested fix
 export function buildKeywordRegex(keywords: string[]): RegExp {
   const escaped = keywords.map(k => k.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));
-  const pattern = `(?<!\\p{L})(${escaped.join('|')})(?!\\p{L})`;
+  const boundary = `[\\p{L}\\p{N}\\p{M}\\p{Pc}\\p{Pd}]`;
+  const pattern = `(?<!${boundary})(?:${escaped.join('|')})(?!${boundary})`;
   return new RegExp(pattern, 'u');
 }
In ECMAScript RegExp with Unicode property escapes, does `(?<!\p{L})MUST(?!\p{L})` match inside `MUST-API`, `MUST2`, or `MUST_v2`?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/core/i18n/keywords.ts` around lines 25 - 28, The regex in
buildKeywordRegex only prevents adjacent letters but still allows matches inside
tokens like MUST-API, MUST2, or MUST_v2; update the lookbehind/lookahead to
block letters, numbers and token connector chars instead of only \p{L}. Replace
the pattern variable (pattern = `(?<!\\p{L})(${escaped.join('|')})(?!\\p{L})`)
with one that uses a character class such as [\p{L}\p{N}\p{Pc}\p{Pd}_-] in both
lookarounds (e.g.
`(?<![\\p{L}\\p{N}\\p{Pc}\\p{Pd}_-])(...)(?![\\p{L}\\p{N}\\p{Pc}\\p{Pd}_-])`) so
keywords won’t match when adjacent to letters, digits, connector punctuation or
dashes.
🧹 Nitpick comments (1)
test/core/validation.test.ts (1)

676-733: Add a regression case for configured keywords embedded in identifiers.

Please add explicit cases like DEBE-API, DEBE2, and DEBE_v2 in Spanish mode to ensure token-boundary behavior stays correct over time.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/core/validation.test.ts` around lines 676 - 733, Add regression tests in
the same suite that verify token-boundary behavior for configured keywords
embedded inside identifiers by creating a new changeDir (e.g.,
"test-change-es-keyword-embedded") and writing a spec with requirement text
containing DEBE-API, DEBE2, and DEBE_v2 but no standalone "DEBE", then call
Validator(false, 'es').validateChangeDeltaSpecs(changeDir) and assert the report
is invalid (report.valid === false and report.summary.errors > 0); also add a
companion test that includes a standalone "DEBE" alongside those embedded
variants and assert the report is valid (report.valid === true and
report.summary.errors === 0) to ensure the validator (Validator and its
validateChangeDeltaSpecs method) treats only token-boundary matches as
normative.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/core/i18n/keywords.ts`:
- Around line 25-28: The regex in buildKeywordRegex only prevents adjacent
letters but still allows matches inside tokens like MUST-API, MUST2, or MUST_v2;
update the lookbehind/lookahead to block letters, numbers and token connector
chars instead of only \p{L}. Replace the pattern variable (pattern =
`(?<!\\p{L})(${escaped.join('|')})(?!\\p{L})`) with one that uses a character
class such as [\p{L}\p{N}\p{Pc}\p{Pd}_-] in both lookarounds (e.g.
`(?<![\\p{L}\\p{N}\\p{Pc}\\p{Pd}_-])(...)(?![\\p{L}\\p{N}\\p{Pc}\\p{Pd}_-])`) so
keywords won’t match when adjacent to letters, digits, connector punctuation or
dashes.

---

Nitpick comments:
In `@test/core/validation.test.ts`:
- Around line 676-733: Add regression tests in the same suite that verify
token-boundary behavior for configured keywords embedded inside identifiers by
creating a new changeDir (e.g., "test-change-es-keyword-embedded") and writing a
spec with requirement text containing DEBE-API, DEBE2, and DEBE_v2 but no
standalone "DEBE", then call Validator(false,
'es').validateChangeDeltaSpecs(changeDir) and assert the report is invalid
(report.valid === false and report.summary.errors > 0); also add a companion
test that includes a standalone "DEBE" alongside those embedded variants and
assert the report is valid (report.valid === true and report.summary.errors ===
0) to ensure the validator (Validator and its validateChangeDeltaSpecs method)
treats only token-boundary matches as normative.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 869f1825-b1e6-4bd4-82a7-e8427a821772

📥 Commits

Reviewing files that changed from the base of the PR and between 6cd2538 and 66a03ac.

📒 Files selected for processing (6)
  • openspec/changes/archive/2026-03-13-add-i18n-normative-keywords/tasks.md
  • openspec/specs/cli-validate/spec.md
  • src/core/i18n/keywords.ts
  • src/core/validation/constants.ts
  • test/core/i18n-keywords.test.ts
  • test/core/validation.test.ts
✅ Files skipped from review due to trivial changes (1)
  • test/core/i18n-keywords.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/core/validation/constants.ts

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants