feat: render images inside Word textboxes (SD-2804) #3207

Open
tupizz wants to merge 1 commit into main from tadeu/sd-2804-feature-render-images-inside-textbox

Conversation

tupizz (Contributor) commented May 7, 2026

Summary

Renders inline w:drawing images inside Word textbox content. Previously, the textbox imported with the image silently stripped: it rendered as an empty box even though export round-tripped the image untouched.

Linear: SD-2804

Spec basis

ECMA-376 §20.4.2.38 (CT_TxbxContent) defines textbox content as EG_BlockLevelElts (1..unbounded), i.e. a textbox can hold the same content as the document body, with three exclusions: cross-story refs (comments/footnotes/endnotes), VML, and nested txbxContent. Notably, paragraphs inside w:txbxContent carry the same CT_P content model as body paragraphs, including runs with inline w:drawing images.

The text-only extractor used in extractTextFromTextBox.handleRun only walked w:t / w:tab / w:br / sd:autoPageNumber / sd:totalPageNumber; w:drawing was silently ignored.
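To illustrate how a drawing can disappear without any error, here is a minimal sketch of a run walker that only recognises text-like children. The `XmlNode` shape, the `walkRun` name, and the `withDrawings` flag are hypothetical and exist only for this example; they are not the importer's actual types or signature.

```typescript
// Hypothetical, simplified XML node shape; the real importer's node type may differ.
interface XmlNode {
  name: string;
  text?: string;
  elements?: XmlNode[];
}

type Part = { text: string } | { kind: "image"; drawing: XmlNode };

// Walk the children of a w:r. With `withDrawings` set to false this mirrors a
// text-only extractor: w:drawing is matched by no productive branch and is
// dropped without any error being raised.
function walkRun(run: XmlNode, withDrawings: boolean): Part[] {
  const parts: Part[] = [];
  for (const child of run.elements ?? []) {
    switch (child.name) {
      case "w:t":
        parts.push({ text: child.text ?? "" });
        break;
      case "w:tab":
        parts.push({ text: "\t" });
        break;
      case "w:br":
        parts.push({ text: "\n" });
        break;
      case "w:drawing":
        if (withDrawings) parts.push({ kind: "image", drawing: child });
        break;
      default:
        break; // any other run content is ignored
    }
  }
  return parts;
}

const run: XmlNode = {
  name: "w:r",
  elements: [
    { name: "w:t", text: "Logo: " },
    { name: "w:drawing", elements: [{ name: "wp:inline" }] },
  ],
};

const before = walkRun(run, false); // text part only, the image is lost
const after = walkRun(run, true); // text part followed by an image part
```

In this sketch the only difference between the two calls is whether the w:drawing branch emits a part, which is the shape of the change described in this PR.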

Approach

Minimal surgical change. Extend the existing text-parts model with one image part kind:

  1. TextPart contract gains optional kind: 'image' plus src / width / height / alt.
  2. Importer (extractTextFromTextBox.handleRun) branches on w:drawing, reuses the v3 handleImageNode to resolve r:embed → media path, then upgrades the path to a data URI from converter.media (unlike body ImageRuns, the text-parts model has no downstream hydration step).
  3. Painter (createFallbackTextElement) renders parts with kind: 'image' as inline <img> next to text spans.

No new PM nodes, no new pm-adapter wiring, no schema changes, no NodeHandlerContext threading.
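The three steps above can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: the optional field names on `TextPart` come from the description, but their types are guesses, and `MediaMap`, `toDataUri`, and `renderParts` are hypothetical names. The renderer builds HTML strings instead of DOM nodes so the example runs without a browser.

```typescript
// Assumed shape of the extended contract; field types are guesses.
interface TextPart {
  text?: string;
  kind?: "image";
  src?: string;
  width?: number;
  height?: number;
  alt?: string;
}

// Hypothetical media store: package path -> base64 payload.
type MediaMap = Record<string, string>;

const MIME_BY_EXT: Record<string, string> = {
  png: "image/png",
  jpg: "image/jpeg",
  jpeg: "image/jpeg",
  gif: "image/gif",
};

// Upgrade a path-style src to a data URI. The src is returned unchanged when
// the media entry or the MIME type is unknown.
function toDataUri(src: string, media: MediaMap): string {
  if (src.startsWith("data:")) return src;
  const base64 = media[src];
  const ext = src.split(".").pop()?.toLowerCase() ?? "";
  const mime = MIME_BY_EXT[ext];
  return base64 && mime ? `data:${mime};base64,${base64}` : src;
}

function escapeHtml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// String-based stand-in for the painter: image parts become <img>,
// everything else becomes a text <span>.
function renderParts(parts: TextPart[]): string {
  return parts
    .map((part) => {
      if (part.kind === "image" && part.src) {
        const size =
          (part.width !== undefined ? ` width="${part.width}"` : "") +
          (part.height !== undefined ? ` height="${part.height}"` : "");
        return `<img src="${escapeHtml(part.src)}" alt="${escapeHtml(part.alt ?? "")}"${size}>`;
      }
      return `<span>${escapeHtml(part.text ?? "")}</span>`;
    })
    .join("");
}

const media: MediaMap = { "word/media/image1.png": "iVBORw0KGgo=" };
const parts: TextPart[] = [
  { text: "Logo: " },
  { kind: "image", src: toDataUri("word/media/image1.png", media), width: 96, height: 48, alt: "logo" },
];
const html = renderParts(parts);
```

Keeping `kind` optional means existing text parts need no changes, which is consistent with the PR's claim that no schema or adapter wiring is touched.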

Before / after

Fixture: a DOCX with a textbox-in-header containing a single inline image.

| Before | After |
| --- | --- |
| Empty textbox outline; image silently dropped on import | Image renders inline inside the textbox, matching what Word shows |

Captured via agent-browser against the dev server: see /tmp/sd-2804-final3.png.

Test plan

  • Unit test: importer emits an image part in textContent.parts for an inline w:drawing inside the textbox (encode-image-node-helpers.test.js)
  • super-editor full suite: 12,645 tests passing
  • painter-dom full suite: 1,064 tests passing
  • pm-adapter full suite: 1,788 tests passing
  • layout-bridge full suite: 1,206 tests passing
  • Browser: upload the SD-2804 fixture, confirm the image renders inside the textbox

Out of scope (deferred)

The fixture's image is a wp:inline inside a textbox run, the most common case. ECMA-376 also permits richer block-level content inside a textbox: tables, lists, SDTs, hyperlinks, fields, math. The current text-parts model can't represent those; surfacing them would require flowing w:txbxContent through the same body pipeline (handleParagraphNode recursion) and likely a container PM node (a shapeTextbox schema already exists for the legacy v:pict path).

That refactor is intentionally deferred: the supplied SD-2804 fixture contains only an image, and Option B was over-engineering for the immediate user-visible bug. Content beyond inline-image-in-textbox is left to a tracking issue or a future PR.

Related

  • SD-2745 (header-anchored floating textboxes) handles textbox position; this PR handles textbox content. They compose.

…804)

ECMA-376 §20.4.2.38 (CT_TxbxContent) lets a textbox hold rich body-level
content: paragraphs whose runs can carry inline w:drawing images. The
text-only extractor used to silently skip those drawings, so the textbox
rendered empty even though export round-tripped the image untouched.

The fix surfaces the inline drawing as a textContent part with kind='image'
so the existing shape painter can render it alongside text spans:

- TextPart contract gains optional kind/src/width/height/alt fields.
- extractTextFromTextBox.handleRun branches on w:drawing, reuses the v3
  wp drawing handler (handleImageNode) to resolve rId, then upgrades the
  path-style src to a data URI from converter.media so the painter can
  drop it straight into <img>.
- DomPainter's createFallbackTextElement renders image parts as inline
  <img> elements next to existing text spans.

Linked: SD-2745 (header-anchored floating textboxes; positions the box
where this content now renders).

linear Bot commented May 7, 2026

SD-2804

github-actions Bot (Contributor) commented May 7, 2026

I wasn't granted permissions for the ecma-spec MCP tools, so I reviewed the diff against my knowledge of ECMA-376 (Part 1, §17 WordprocessingML and §20.4 DrawingML-WordprocessingDrawing).

Status: PASS

The OOXML element handling in this PR is spec-compliant:

  • w:drawing is a valid child of w:r (run inner content per CT_R), so processing it inside the run-element loop is correct.
  • The handler correctly looks for wp:inline or wp:anchor as the direct child of w:drawing; those are the two valid choices in CT_Drawing.
  • The test fixture's element nesting wps:wsp → wps:txbx → w:txbxContent → w:p → w:r → w:drawing → wp:inline → a:graphic → a:graphicData[uri] → pic:pic → pic:blipFill → a:blip[r:embed] matches the schemas (CT_WordprocessingShape, CT_TxbxContent, CT_Inline, CT_GraphicalObject, CT_Picture, CT_BlipFillProperties).
  • Required attributes are present where needed: wp:docPr has id and name; a:graphicData has uri; a:blip has r:embed; wp:extent has cx/cy. The optional dist* attributes on wp:inline are correctly omitted (they have schema defaults of 0).
  • wps:cNvSpPr@txBox="1" is the correct marker for a text-bearing shape per the wordprocessingShape (wps:) namespace.
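As an illustration of the nesting listed above, the following sketch descends from w:drawing to a:blip to read r:embed, and converts the wp:extent size from EMU to pixels. The `OoxmlNode` shape and the `readInlineImage` helper are hypothetical and are not taken from the PR. The conversion uses 914400 EMU per inch and assumes 96 CSS pixels per inch, giving 9525 EMU per pixel.

```typescript
// Hypothetical, simplified node shape for parsed OOXML.
interface OoxmlNode {
  name: string;
  attributes?: Record<string, string>;
  elements?: OoxmlNode[];
}

// 914400 EMU per inch divided by an assumed 96 px per inch.
const EMU_PER_PIXEL = 9525;

function child(node: OoxmlNode | undefined, name: string): OoxmlNode | undefined {
  return node?.elements?.find((el) => el.name === name);
}

// Follow a chain of element names, returning undefined as soon as one is missing.
function descend(node: OoxmlNode | undefined, path: string[]): OoxmlNode | undefined {
  return path.reduce<OoxmlNode | undefined>((current, name) => child(current, name), node);
}

interface InlineImageRef {
  relId: string;
  widthPx: number;
  heightPx: number;
}

function readInlineImage(drawing: OoxmlNode): InlineImageRef | undefined {
  // CT_Drawing allows either wp:inline or wp:anchor as the direct child.
  const container = child(drawing, "wp:inline") ?? child(drawing, "wp:anchor");
  const blip = descend(container, ["a:graphic", "a:graphicData", "pic:pic", "pic:blipFill", "a:blip"]);
  const extent = child(container, "wp:extent");
  const relId = blip?.attributes?.["r:embed"];
  if (!relId || !extent) return undefined;
  return {
    relId,
    widthPx: Math.round(Number(extent.attributes?.["cx"] ?? 0) / EMU_PER_PIXEL),
    heightPx: Math.round(Number(extent.attributes?.["cy"] ?? 0) / EMU_PER_PIXEL),
  };
}

const drawing: OoxmlNode = {
  name: "w:drawing",
  elements: [{
    name: "wp:inline",
    elements: [
      { name: "wp:extent", attributes: { cx: "914400", cy: "457200" } },
      { name: "a:graphic", elements: [{ name: "a:graphicData", elements: [{
        name: "pic:pic", elements: [{ name: "pic:blipFill", elements: [
          { name: "a:blip", attributes: { "r:embed": "rId7" } },
        ] }],
      }] }] },
    ],
  }],
};

const ref = readInlineImage(drawing);
```

The relationship id would still need to be resolved to a media path through the part's relationships, which this sketch leaves out.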

One minor non-blocking note: the comment cites "ECMA-376 §20.4.2.38" for CT_TxbxContent, but txbxContent lives in the WordprocessingML namespace (Part 1, §17), not in DrawingML-WordprocessingDrawing (§20.4). I couldn't verify the exact section number without spec access, so it is worth a quick double-check, but it's a comment, not a code issue.

@tupizz tupizz marked this pull request as ready for review May 7, 2026 23:18
@tupizz tupizz requested a review from a team as a code owner May 7, 2026 23:18
@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
