feat: render images inside Word textboxes (SD-2804) #3207
Open
…804)

ECMA-376 §20.4.2.38 (CT_TxbxContent) lets a textbox hold rich body-level content: paragraphs whose runs can carry inline w:drawing images. The text-only extractor used to silently skip those drawings, so the textbox rendered empty even though export round-tripped the image untouched.

The fix surfaces the inline drawing as a textContent part with kind='image' so the existing shape painter can render it alongside text spans:

- TextPart contract gains optional kind/src/width/height/alt fields.
- extractTextFromTextBox.handleRun branches on w:drawing, reuses the v3 wp drawing handler (handleImageNode) to resolve rId, then upgrades the path-style src to a data URI from converter.media so the painter can drop it straight into <img>.
- DomPainter's createFallbackTextElement renders image parts as inline <img> elements next to existing text spans.

Linked: SD-2745 (header-anchored floating textboxes; positions the box where this content now renders).
I wasn't granted permissions for the ecma-spec MCP tools, so I reviewed the diff against my knowledge of ECMA-376 (Part 1, §17 WordprocessingML and §20.4 DrawingML-WordprocessingDrawing).

Status: PASS

The OOXML element handling in this PR is spec-compliant:
One minor non-blocking note: the comment cites "ECMA-376 §20.4.2.38" for
Codecov Report: All modified and coverable lines are covered by tests.
Summary
Renders inline w:drawing images inside Word textbox content. Previously, the textbox was imported with the image silently stripped: the textbox rendered as an empty box even though export round-tripped the image untouched.

Linear: SD-2804
Spec basis
ECMA-376 §20.4.2.38 (CT_TxbxContent) defines textbox content as EG_BlockLevelElts (1..unbounded), i.e. a textbox can hold the same content as the document body, with three exclusions: cross-story refs (comments/footnotes/endnotes), VML, and nested txbxContent. Notably, paragraphs inside w:txbxContent carry the same CT_P content model as body paragraphs, including runs with inline w:drawing images.

The text-only extractor used in extractTextFromTextBox.handleRun only walked w:t / w:tab / w:br / sd:autoPageNumber / sd:totalPageNumber; w:drawing was silently ignored.

Approach
Minimum surgical change: extend the existing text-parts model with one image part kind:

- TextPart contract gains optional kind: 'image' plus src / width / height / alt.
- extractTextFromTextBox.handleRun branches on w:drawing, reuses the v3 handleImageNode to resolve r:embed → media path, then upgrades the path to a data URI from converter.media (the text-parts model has no downstream hydration step like body ImageRuns do).
- DomPainter's createFallbackTextElement renders parts with kind: 'image' as inline <img> next to text spans.

No new PM nodes, no new pm-adapter wiring, no schema changes, no NodeHandlerContext threading.
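
For reviewers who want to see the data flow in one place, here is a minimal, self-contained sketch of the approach. It is not the code in this PR. Only the field names kind / src / width / height / alt come from the description above. The text field, the MediaMap shape (media path to base64 payload, standing in for converter.media), the extension-to-MIME table, and the helper names toDataUri, upgradeImagePart, escapeHtml and renderParts are all assumptions made for illustration. The renderer returns an HTML string as a stand-in for the DOM elements the real painter creates.

```typescript
// Hypothetical part shape: only kind/src/width/height/alt are named in the PR.
interface TextPart {
  text: string;
  kind?: 'image';
  src?: string;
  width?: number;
  height?: number;
  alt?: string;
}

// Assumed stand-in for converter.media: media path -> base64 payload.
type MediaMap = Record<string, string>;

// Assumed MIME lookup; the real converter may derive this differently.
const MIME_BY_EXT: Record<string, string> = {
  png: 'image/png',
  jpg: 'image/jpeg',
  jpeg: 'image/jpeg',
  gif: 'image/gif',
};

// Build a data URI for a media path, or undefined if it cannot be resolved.
function toDataUri(path: string, media: MediaMap): string | undefined {
  const base64 = media[path];
  if (base64 === undefined) return undefined;
  const ext = (path.split('.').pop() ?? '').toLowerCase();
  const mime = MIME_BY_EXT[ext];
  return mime ? `data:${mime};base64,${base64}` : undefined;
}

// Upgrade a path-style src to a data URI; leave the part untouched otherwise.
function upgradeImagePart(part: TextPart, media: MediaMap): TextPart {
  if (part.kind !== 'image' || !part.src || part.src.startsWith('data:')) {
    return part;
  }
  const dataUri = toDataUri(part.src, media);
  return dataUri ? { ...part, src: dataUri } : part;
}

function escapeHtml(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Render image parts as inline <img> next to text spans (string stand-in).
function renderParts(parts: TextPart[]): string {
  return parts
    .map((part) => {
      if (part.kind === 'image' && part.src) {
        const width = part.width !== undefined ? ` width="${part.width}"` : '';
        const height = part.height !== undefined ? ` height="${part.height}"` : '';
        const alt = escapeHtml(part.alt ?? '');
        return `<img src="${escapeHtml(part.src)}" alt="${alt}"${width}${height}>`;
      }
      return `<span>${escapeHtml(part.text)}</span>`;
    })
    .join('');
}

// Usage: one text part followed by one inline image part.
const exampleMedia: MediaMap = { 'word/media/image1.png': 'iVBORw0KGgo=' };
const exampleParts: TextPart[] = [
  { text: 'Logo: ' },
  { text: '', kind: 'image', src: 'word/media/image1.png', width: 32, height: 16, alt: 'logo' },
];
console.log(renderParts(exampleParts.map((p) => upgradeImagePart(p, exampleMedia))));
```

Resolving the data URI at extraction time keeps the renderer free of any media lookup, which matches the stated reason for doing the upgrade in the extractor: the text-parts model has no later hydration step.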
Before / after
Fixture: a DOCX with a textbox-in-header containing a single inline image.
Captured via agent-browser against the dev server: see /tmp/sd-2804-final3.png.

Test plan
- textContent.parts for an inline w:drawing inside the textbox (encode-image-node-helpers.test.js)

Out of scope (deferred)
The fixture's image is wp:inline inside a textbox run, the most common case. ECMA-376 also permits richer block-level content inside a textbox: tables, lists, SDTs, hyperlinks, fields, math. The current text-parts model can't represent those; surfacing them would need to flow w:txbxContent through the same body pipeline (handleParagraphNode recursion) and likely a container PM node (shapeTextbox schema already exists for the legacy v:pict path).

That refactor is intentionally deferred: the supplied SD-2804 fixture has only an image, and Option B was over-engineering for the immediate user-visible bug. Tracking issue / future PR scope for content beyond inline-image-in-textbox.
Related