@MkDev11
Contributor

@MkDev11 MkDev11 commented Jan 22, 2026

Add a utility function to group elements by their parent_id metadata field. This allows users to easily traverse document hierarchy by grouping elements that share the same parent.

Includes an optional 'assign_orphans' parameter that, when True, assigns elements with no parent_id to the same group as the previous element.

Fixes #1489
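
The behavior described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the function name `group_elements_by_parent_id` is assumed from the branch name, and `Element` and `Metadata` are hypothetical stand-ins that only model `text` and `metadata.parent_id`. The sketch reads "previous element" as the element immediately before the orphan, so consecutive orphans follow each other into the same group.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Metadata:
    # Hypothetical stand-in: only the field this sketch needs.
    parent_id: Optional[str] = None


@dataclass
class Element:
    # Hypothetical stand-in for a document element.
    text: str
    metadata: Metadata = field(default_factory=Metadata)


def group_elements_by_parent_id(
    elements: list[Element], assign_orphans: bool = False
) -> dict[Optional[str], list[Element]]:
    """Group elements by metadata.parent_id (assumed name and signature).

    Elements without a parent_id go under the None key. With
    assign_orphans=True they instead join the group of the element
    immediately before them; a leading orphan has no predecessor and
    stays under None.
    """
    groups: dict[Optional[str], list[Element]] = {}
    previous_key: Optional[str] = None
    for element in elements:
        key = element.metadata.parent_id
        if key is None and assign_orphans:
            key = previous_key
        groups.setdefault(key, []).append(element)
        previous_key = key
    return groups


# Input order here is an assumption chosen for illustration.
elements = [
    Element("Title 1", Metadata(parent_id="parent_A")),
    Element("Child of A", Metadata(parent_id="parent_A")),
    Element("Orphan 1"),
    Element("Title 2", Metadata(parent_id="parent_B")),
    Element("Orphan 2"),
]
for key, members in group_elements_by_parent_id(elements, assign_orphans=True).items():
    print(f"{key}: {[e.text for e in members]}")
```

Because the groups live in a plain `dict`, keys appear in the order each group is first encountered in the document.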

@MkDev11
Contributor Author

MkDev11 commented Jan 22, 2026

@lawrence-u10d @badGarnet Could you please take a look at this PR? Thanks for your time and feedback.

@MkDev11
Contributor Author

MkDev11 commented Jan 22, 2026

=== Test 1: assign_orphans=False (default) ===
  parent_A: ['Title 1', 'Child of A']
  None: ['Orphan 1', 'Orphan 2']
  parent_B: ['Title 2']

=== Test 2: assign_orphans=True ===
  parent_A: ['Title 1', 'Child of A', 'Orphan 1']
  parent_B: ['Title 2', 'Orphan 2']

=== Test 3: First element is orphan, assign_orphans=True ===
  None: ['First orphan']
  parent_A: ['Title 1', 'Orphan 1']

=== All tests passed! ===

@badGarnet
Collaborator

Please add those as unit tests.

@MkDev11 MkDev11 force-pushed the feat/group-elements-by-parent-id-1489 branch from 70b7856 to fffcc22 Compare January 22, 2026 21:22
@MkDev11
Contributor Author

MkDev11 commented Jan 22, 2026

Added them as unit tests.

@MkDev11 MkDev11 force-pushed the feat/group-elements-by-parent-id-1489 branch 2 times, most recently from 51257b0 to b6e5080 Compare January 22, 2026 23:45
Add a utility function to group elements by their parent_id metadata field.
This allows users to easily traverse document hierarchy by grouping elements
that share the same parent.

Includes an optional 'assign_orphans' parameter that, when True, assigns
elements with no parent_id to the same group as the previous element.

Adds unit tests for the new function.

Fixes Unstructured-IO#1489
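
For illustration, the three scenarios posted earlier in this thread could be expressed as pytest-style tests roughly like this. It is a sketch, not the tests added in this PR: `make_element`, `texts_by_group`, and the compact `group_elements_by_parent_id` stand-in are hypothetical helpers included only so the block runs on its own, and the input order is inferred from the posted output.

```python
from types import SimpleNamespace


def make_element(text, parent_id=None):
    """Hypothetical element stand-in exposing .text and .metadata.parent_id."""
    return SimpleNamespace(text=text, metadata=SimpleNamespace(parent_id=parent_id))


def group_elements_by_parent_id(elements, assign_orphans=False):
    """Compact stand-in for the function under test (name and signature assumed)."""
    groups = {}
    previous_key = None
    for element in elements:
        key = element.metadata.parent_id
        if key is None and assign_orphans:
            key = previous_key  # follow the element immediately before
        groups.setdefault(key, []).append(element)
        previous_key = key
    return groups


def texts_by_group(groups):
    """Reduce each group to its element texts for easy comparison."""
    return {key: [e.text for e in members] for key, members in groups.items()}


SAMPLE = [
    make_element("Title 1", "parent_A"),
    make_element("Child of A", "parent_A"),
    make_element("Orphan 1"),
    make_element("Title 2", "parent_B"),
    make_element("Orphan 2"),
]


def test_orphans_grouped_under_none_by_default():
    assert texts_by_group(group_elements_by_parent_id(SAMPLE)) == {
        "parent_A": ["Title 1", "Child of A"],
        None: ["Orphan 1", "Orphan 2"],
        "parent_B": ["Title 2"],
    }


def test_orphans_join_previous_group():
    result = group_elements_by_parent_id(SAMPLE, assign_orphans=True)
    assert texts_by_group(result) == {
        "parent_A": ["Title 1", "Child of A", "Orphan 1"],
        "parent_B": ["Title 2", "Orphan 2"],
    }


def test_leading_orphan_stays_under_none():
    elements = [
        make_element("First orphan"),
        make_element("Title 1", "parent_A"),
        make_element("Orphan 1"),
    ]
    result = group_elements_by_parent_id(elements, assign_orphans=True)
    assert texts_by_group(result) == {
        None: ["First orphan"],
        "parent_A": ["Title 1", "Orphan 1"],
    }
```

The tests use plain `assert` statements, so they can be collected by pytest or called directly as ordinary functions.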
@MkDev11 MkDev11 force-pushed the feat/group-elements-by-parent-id-1489 branch from b6e5080 to 5814c6f Compare January 23, 2026 00:34
@MkDev11
Contributor Author

MkDev11 commented Jan 23, 2026

@badGarnet Could you please take a look at the changes and let me know the result? I always appreciate your time!
