Skip to content

feat(pm-adapter): preserve run-level bidi/script metadata on TextRun (SD-2781)#3203

Open
caio-pizzol wants to merge 4 commits intomainfrom
caio/sd-2781-textrun-bidi-script-metadata
Open

feat(pm-adapter): preserve run-level bidi/script metadata on TextRun (SD-2781)#3203
caio-pizzol wants to merge 4 commits intomainfrom
caio/sd-2781-textrun-bidi-script-metadata

Conversation

@caio-pizzol
Copy link
Copy Markdown
Contributor

Adds preservation-only run-level direction and script metadata to the layout text-run contract, kept on separate axes per ECMA's own categorization.

What changes

Two fields on `TextRun`:

  • `bidi: RunBidiContext` - direction signals only. Carries the run rtl flag now (§17.3.2.30); space reserved for w:dir embedding and w:bdo override in Wave 1c.
  • `script: RunScriptContext` - complex-script context. Carries the cs flag (§17.3.2.5) and per-script language tags (default / complexScript / eastAsian per §17.3.2.20).

Both populated by pm-adapter in `applyInlineRunProperties` from raw `RunProperties` when present. Empty contexts are not attached, so the layout tree doesn't bloat for runs with no signals.

Why typed split

ECMA-376 puts direction (`rtl`, `dir`, `bdo`) and script formatting (`cs`, `lang/@bidi`) in different categories.

  • Direction signals control reading order and embedding levels.
  • Script signals control which formatting stack applies (Latin variants vs CS variants vs East Asian).

Lumping them under one `bidi` field would collapse the axes and lie about the schema. The non-collapse test exercises this directly: setting `rtl: true, cs: true` produces `bidi.rtl === true` and `script.complexScript === true` with no leakage.

Why now

Wave 1b will consume `script.complexScript` to gate the formatting-stack selection; Wave 1c will consume `bidi.embedding` and `bidi.override`. Adding the data path now means each wave doesn't have to introduce both the data and the rendering at once.

Type shape change

`RunScriptContext.language` expanded from `string` to `{ default?, complexScript?, eastAsian? }`. No production consumer reads the field yet (preservation-only since #3184), so the change is safe.

What this PR does NOT do

  • DomPainter does not render anything based on either field.
  • Round-trip is unaffected (preservation-only, no behavior change).
  • Wave 1c's `w:dir` / `w:bdo` import isn't done yet - the type has slots for those signals, but the importer doesn't populate them.

Validation

  • 1,794 pm-adapter tests pass (+7 new tests covering the preservation, partial lang attrs, and axis non-collapse)
  • 12,644 super-editor tests pass
  • 1,201 layout-bridge tests pass

…(SD-2781)

Adds two preservation-only fields to the layout text-run contract, kept on
separate axes per ECMA's own categorization:

- TextRun.bidi (RunBidiContext): direction signals only - run rtl flag now,
  embedding (w:dir) and override (w:bdo) wired in Wave 1c.
- TextRun.script (RunScriptContext): complex-script flag + per-script
  language metadata (default / complexScript / eastAsian) per §17.3.2.20.

Both populated by pm-adapter from raw run properties when present. Wave 1a
does not render either; Wave 1b will gate the formatting-stack selection on
script.complexScript, Wave 1c will read bidi.embedding/override.

Why now: ECMA puts direction (rtl, dir, bdo) and script formatting (cs,
lang/@bidi) in different categories. Lumping them under one bidi field
would collapse the axes and lie about the schema. Adding both fields now
means Wave 1b/1c don't have to introduce both the data path and the
rendering at once.

The RunScriptContext.language field expanded from a single string to a
structured object with three optional tags. No production consumer reads
the field yet (preservation-only since #3184), so the shape change is
safe.

Tests:
- bidi populated only on rtl, script populated only on cs/lang
- explicit rtl=false preserved (a meaningful override)
- the three lang tags land on separate fields
- axis non-collapse: rtl never leaks into script, cs never leaks into bidi
@linear
Copy link
Copy Markdown

linear Bot commented May 7, 2026

SD-2781

…TextRun

The previous commit added `bidi?: RunBidiContext` and `script?: RunScriptContext`
to TextRun in contracts/src/index.ts but only re-exported the type names; they
were not in local type scope, so `tsc --project` failed:

  src/index.ts(324,10): error TS2304: Cannot find name 'RunBidiContext'.
  src/index.ts(330,12): error TS2304: Cannot find name 'RunScriptContext'.

Vitest's transform doesn't enforce type-only imports the way tsc does, so the
unit tests passed even though the build was broken. CI build would have caught
it; the missing piece was running `pnpm build` locally before pushing.

Adds the two names to the existing local `import type` statement alongside
ParagraphDirectionContext (same pattern, same line 19).

Also adds two integration tests in pm-adapter/src/integration.test.ts that
exercise the full PM -> FlowBlock conversion through the unmocked
applyInlineRunProperties pipeline. The previous unit tests in
common.test.ts mock computeRunAttrs, which is why they couldn't catch
shape-level regressions in the contracts package. The integration tests
prove a real PM doc with raw runProperties (rtl/cs/lang on a run-wrapper
node) produces a TextRun with populated bidi/script, and that runs without
signals don't gain empty objects.
@caio-pizzol caio-pizzol marked this pull request as ready for review May 7, 2026 21:54
@caio-pizzol caio-pizzol requested a review from a team as a code owner May 7, 2026 21:54
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b7cf49dae8

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread packages/layout-engine/pm-adapter/src/converters/inline-converters/common.ts Outdated
Comment thread packages/layout-engine/pm-adapter/src/converters/inline-converters/common.ts Outdated
Repo convention puts the ticket reference in the `describe` label only
(e.g., `describe('SD-1333: ...')`). Trims the redundant `SD-2781:` prefix
from the block comment, keeps the non-obvious "Wave 1a preserves these
signals; nothing renders them yet" note.
@codecov-commenter
Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

…ssign on token runs

Two codex findings on PR #3203:

1. Cascade-leak: applyInlineRunProperties was receiving cascade-resolved
   runProperties from runNodeChildrenToRuns and populating bidi/script from
   them. Style-inherited runs ended up with bidi/script metadata they did not
   have inline, making preservation indistinguishable from direct formatting
   and bloating the layout tree on every styled run.

   Fix: thread the raw inline runProperties through InlineConverterParams
   (alongside the existing cascade-resolved runProperties). applyInlineRunProperties
   gains a fourth parameter that bidi/script populate from. When the caller
   doesn't opt in (no inline parameter), no metadata is attached - safer
   default than reading the cascaded view.

2. Token-run drop: generic-token.ts called applyInlineRunProperties without
   reassigning the return value. Since the helper builds a new object via
   spread, all merged fields (including the SD-2781 bidi/script) were lost
   for page-number / total-page-count token runs inside an rtl run wrapper.

   Fix: change the local to `let` and reassign. Also forwards inlineRunProperties
   through. no-break-hyphen.ts (another caller) updated for consistency.

Tests added:
- 3 unit tests in common.test.ts proving bidi/script populate from
  inlineRunProperties only, never from cascade-resolved runProperties
- Existing 7 SD-2781 unit tests updated to pass inlineRunProperties
  (the new opt-in source)
- 1 integration test proving bidi/script propagate to page-number tokens
  inside an rtl run wrapper

All 1800 pm-adapter tests, 12644 super-editor tests, and 1201 layout-bridge
tests pass. Contracts + pm-adapter both build clean.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants