Fix Bug 66263 — Add support for reading SDT row in tables by hostedbygnome · Pull Request #971 · apache/poi

hostedbygnome · 2025-12-16T09:04:50Z

The issue is that an SDT (content control) can sit at the same level as a table row, where it was previously not read.
This PR adds support for that case.

CTR elements inside CTSdtRun and CTRow elements inside CTSdtRow are processed recursively.

pjfanning · 2025-12-16T11:03:57Z

doesn't even compile
no tests

pjfanning · 2025-12-18T13:09:17Z

would it be possible to provide a test?

hostedbygnome · 2026-04-06T16:39:23Z

> would it be possible to provide a test?

yes, sorry for the delay

hostedbygnome · 2026-04-20T05:35:49Z

Is anything else required in this PR?
@pjfanning

Copilot

Pull request overview

Adds support for handling Word SDT (content control) structures that appear at the same level as table rows and runs, so their inner content can be discovered during parsing and text extraction (Bug 66263).

Changes:

- Update XWPFTable to iterate table child elements and recursively process CTSdtRow to extract inner CTRow content.
- Update XWPFParagraph to additionally process CTR elements inside CTSdtBlock / CTSdtRun content.
- Add regression tests and new sample .docx files for SDT rows in tables and SDT runs in paragraphs.

Reviewed changes

Copilot reviewed 4 out of 6 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| `poi-ooxml/src/main/java/org/apache/poi/xwpf/usermodel/XWPFTable.java` | Cursor-based traversal of table children; recursive SDT-row handling |
| `poi-ooxml/src/main/java/org/apache/poi/xwpf/usermodel/XWPFParagraph.java` | Processes CTRs nested inside SDT content when building run lists |
| `poi-ooxml/src/test/java/org/apache/poi/xwpf/usermodel/TestXWPFTable.java` | New unit tests for SDT-row table behavior and sample-doc regression |
| `poi-ooxml/src/test/java/org/apache/poi/xwpf/usermodel/TestXWPFParagraph.java` | New unit tests for SDT run handling + sample-doc regression |
| `test-data/document/Bug66263-table.docx` | Added sample document covering SDT row in table |
| `test-data/document/Bug66263-paragraph.docx` | Added sample document covering SDT runs in paragraphs |

Copilot · 2026-04-22T10:33:59Z

+        try (XmlCursor cursor = table.newCursor()) {
+            cursor.selectPath("./*");
+            while (cursor.toNextSelection()) {
+                XmlObject xmlObject = cursor.getObject();
+                if (xmlObject instanceof CTRow) {
+                    processCTRow((CTRow)xmlObject);
+                }
+                else if (xmlObject instanceof CTSdtRow) {
+                    List<CTRow> rows = new ArrayList<>();
+                    collectCTRowsInnerSdtRow((CTSdtRow)xmlObject, rows);
+                    for (CTRow row : rows)
+                    {
+                        processCTRow(row);
                    }
-                    rowText.append(p.getText());
                }
            }


The constructor now adds CTRow instances extracted from CTSdtRow into tableRows. Many mutation/access methods (e.g., removeRow, insertNewTableRow, addRow, getRow(int), getNumberOfRows) assume tableRows indexes map 1:1 to ctTbl’s top-level w:tr array. With SDT-derived rows included, removeRow(pos) will remove the wrong underlying w:tr (or throw) and getRow(int)/getNumberOfRows() become inconsistent with getRows(). Consider either (a) keeping tableRows aligned to top-level w:tr only and handling SDT rows separately for text extraction, or (b) refactoring row operations to remove/insert via the actual row element’s cursor (or SDT wrapper) rather than ctTbl.removeTr(pos) and updating getRow/getNumberOfRows semantics accordingly.
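
To make the index concern concrete, here is a minimal sketch in plain Java. It does not use POI: `Child`, `flattenedRows`, `topLevelRows`, and `sample` are hypothetical stand-ins that only model a table whose children are either plain rows or SDT wrappers holding rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of a table body. Hypothetical stand-ins, not POI types. */
final class SdtRowIndexSketch {

    /** A top-level child of the table: a plain row or an SDT wrapper holding rows. */
    static final class Child {
        final boolean isSdt;
        final List<String> rows; // one entry for a plain row, any number for a wrapper

        Child(boolean isSdt, List<String> rows) {
            this.isSdt = isSdt;
            this.rows = rows;
        }
    }

    /** Models a row list that includes SDT-derived rows, in document order. */
    static List<String> flattenedRows(List<Child> children) {
        List<String> out = new ArrayList<>();
        for (Child c : children) {
            out.addAll(c.rows);
        }
        return out;
    }

    /** Models the top-level row array: only rows that are direct children of the table. */
    static List<String> topLevelRows(List<Child> children) {
        List<String> out = new ArrayList<>();
        for (Child c : children) {
            if (!c.isSdt) {
                out.addAll(c.rows);
            }
        }
        return out;
    }

    /** Table with children: row A, SDT wrapper {B, C}, row D. */
    static List<Child> sample() {
        List<Child> children = new ArrayList<>();
        children.add(new Child(false, Arrays.asList("A")));
        children.add(new Child(true, Arrays.asList("B", "C")));
        children.add(new Child(false, Arrays.asList("D")));
        return children;
    }
}
```

In this model, index 1 means "B" in the flattened list but "D" in the top-level list, so an operation that looks a row up in one list and removes by position in the other would touch the wrong row.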

Copilot · 2026-04-22T10:34:00Z

+        List<CTRow> rowsInnerSdtContent = sdtContent.getTrList();
+        if (!rowsInnerSdtContent.isEmpty()) {
+            rows.addAll(rowsInnerSdtContent);
+            return;
+        }
+
+        for (CTSdtRow innerSdt : sdtContent.getSdtList()) {
+            collectCTRowsInnerSdtRow(innerSdt, rows);


collectCTRowsInnerSdtRow returns immediately when sdtContent.getTrList() is non-empty, which skips any nested w:sdt elements that may appear alongside w:tr within the same CTSdtContentRow. This will miss rows in mixed-content SDTs and also loses document order between tr and nested sdt. Iterate through the SDT content children in order (e.g., via an XmlCursor over sdtContent) and collect both CTRow and nested CTSdtRow recursively without the early return.

Suggested change

-        List<CTRow> rowsInnerSdtContent = sdtContent.getTrList();
-        if (!rowsInnerSdtContent.isEmpty()) {
-            rows.addAll(rowsInnerSdtContent);
-            return;
-        }
-        for (CTSdtRow innerSdt : sdtContent.getSdtList()) {
-            collectCTRowsInnerSdtRow(innerSdt, rows);
+        XmlCursor cursor = sdtContent.newCursor();
+        try {
+            if (!cursor.toFirstChild()) {
+                return;
+            }
+            do {
+                XmlObject child = cursor.getObject();
+                if (child instanceof CTRow) {
+                    rows.add((CTRow) child);
+                } else if (child instanceof CTSdtRow) {
+                    collectCTRowsInnerSdtRow((CTSdtRow) child, rows);
+                }
+            } while (cursor.toNextSibling());
+        } finally {
+            cursor.dispose();
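
For readers who want to see the difference between the two traversal strategies without a POI setup, the following is a minimal sketch in plain Java. `Node`, `collectWithEarlyReturn`, and `collectInOrder` are hypothetical names over a toy tree; they model the control flow discussed above and are not the PR's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of SDT row content. Hypothetical stand-ins, not POI types. */
final class SdtRowOrderSketch {

    /** A row when children == null, otherwise an SDT wrapper with ordered children. */
    static final class Node {
        final String row;
        final List<Node> children;

        private Node(String row, List<Node> children) {
            this.row = row;
            this.children = children;
        }

        static Node row(String text) {
            return new Node(text, null);
        }

        static Node sdt(Node... children) {
            return new Node(null, Arrays.asList(children));
        }

        boolean isRow() {
            return children == null;
        }
    }

    /** Early-return variant: if any direct rows exist, nested wrappers are never visited. */
    static void collectWithEarlyReturn(Node sdt, List<String> out) {
        List<String> direct = new ArrayList<>();
        for (Node child : sdt.children) {
            if (child.isRow()) {
                direct.add(child.row);
            }
        }
        if (!direct.isEmpty()) {
            out.addAll(direct);
            return;
        }
        for (Node child : sdt.children) {
            if (!child.isRow()) {
                collectWithEarlyReturn(child, out);
            }
        }
    }

    /** In-order variant: visits every child once, recursing into wrappers where they occur. */
    static void collectInOrder(Node sdt, List<String> out) {
        for (Node child : sdt.children) {
            if (child.isRow()) {
                out.add(child.row);
            } else {
                collectInOrder(child, out);
            }
        }
    }
}
```

For a wrapper containing row A, a nested wrapper with row B, and row C, the early-return variant collects only A and C, while the in-order variant collects A, B, C in document order.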

Copilot · 2026-04-22T10:34:00Z

        // is an empty table: I add one row and one column as default
        if (initRow && table.sizeOfTrArray() == 0) {
            createEmptyTable(table);
        }


The "empty table" check only looks at sizeOfTrArray() == 0. A table that contains only w:sdt row wrappers has no top-level w:tr, but it isn’t actually empty; createEmptyTable(table) will inject an extra blank row when using the default constructor (initRow=true). Consider extending the condition to also check for the presence of SDT rows (e.g., sizeOfSdtArray()==0/getSdtList().isEmpty()), or otherwise detect whether the table already contains any row-like content before creating a default row.
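
One possible shape for the extended condition, as a sketch only: the helper below is hypothetical and takes plain counts instead of a table object, so it shows the boolean logic rather than any POI call.

```java
/** Hypothetical helper modelling the extended "empty table" check. */
final class EmptyTableCheckSketch {

    /**
     * A default row is created only when requested and the table has neither
     * top-level rows nor SDT row wrappers.
     */
    static boolean shouldCreateDefaultRow(boolean initRow,
                                          int topLevelRowCount,
                                          int sdtRowWrapperCount) {
        return initRow && topLevelRowCount == 0 && sdtRowWrapperCount == 0;
    }
}
```

This treats any SDT row wrapper as content, even one that holds no rows. Whether an empty wrapper should still count as "not empty" is a design decision for the PR.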

Copilot · 2026-04-22T10:34:00Z

+    private void processCTRs(List<CTR> ctrs) {
+        if (ctrs == null) {
+            return;
+        }
+        for (CTR ctr : ctrs) {
+            if (ctr.getRPr() != null) {


processCTRs adds XWPFRun objects for CTRs inside SDT content into the paragraph’s runs list. Many editing APIs (insertNewRun, insertNewHyperlinkRun, etc.) assume every entry in runs corresponds to a CTR whose cursor is in the paragraph (or hyperlink/field) container; for CTRs nested under CTSdtContentRun, insertNewProvidedRun will fail the isCursorInParagraph check and return null for positions that point at these SDT-derived runs. Either avoid adding nested SDT CTRs into runs (keep SDT content accessible via XWPFSDT/iruns), or update the insertion/removal logic and cursor positioning to support runs whose parent is SDT content so that the runs list remains a valid edit surface.

Suggested change

-    private void processCTRs(List<CTR> ctrs) {
-        if (ctrs == null) {
-            return;
-        }
-        for (CTR ctr : ctrs) {
-            if (ctr.getRPr() != null) {
+    private boolean isRunInParagraphEditContainer(CTR ctr) {
+        try (XmlCursor cursor = ctr.newCursor()) {
+            if (!cursor.toParent()) {
+                return false;
+            }
+            XmlObject parent = cursor.getObject();
+            return parent instanceof CTP
+                    || parent instanceof CTHyperlink
+                    || parent instanceof CTSimpleField;
+        }
+    }
+    private void processCTRs(List<CTR> ctrs) {
+        if (ctrs == null) {
+            return;
+        }
+        for (CTR ctr : ctrs) {
+            if (ctr.getRPr() != null && isRunInParagraphEditContainer(ctr)) {

Copilot · 2026-04-22T10:34:01Z

+import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTSdtBlock;
+import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTSdtContentBlock;


These imports (CTSdtBlock, CTSdtContentBlock) are unused in this test class. Unused imports do not make javac reject the file, but they add noise and can fail a build that enforces a lint or style check on them. Remove them or add coverage that actually uses SDT blocks.

Suggested change

-import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTSdtBlock;
-import org.openxmlformats.schemas.wordprocessingml.x2006.main.CTSdtContentBlock;

Fix Bug 66263 — Add support for SDT row in tables

85f4837

pjfanning reviewed Dec 16, 2025

Comment thread poi-ooxml/src/main/java/org/apache/poi/xwpf/usermodel/XWPFParagraph.java Outdated

remove repeated if

e0f3da1

add missing imports

5c75ebe

pjfanning changed the title ~~Fix Bug 66263 — Add support for SDT row in tables~~ Fix Bug 66263 — Add support for reading SDT row in tables Dec 18, 2025

mnovozhilov added 3 commits April 6, 2026 11:02

add tests

9ba7239

add tests

34f3fa6

add tests with file + fix paragraph duplication

fe8e05f

pjfanning reviewed Apr 6, 2026

Comment thread poi-ooxml/src/main/java/org/apache/poi/xwpf/usermodel/XWPFParagraph.java Outdated

fix indentation

d992752

pjfanning requested a review from Copilot April 22, 2026 10:26

Copilot started reviewing on behalf of pjfanning April 22, 2026 10:27 View session

Copilot AI reviewed Apr 22, 2026

hostedbygnome wants to merge 7 commits into apache:trunk from hostedbygnome:Bug-66263
