Fix/521 image collection placeholders by matthewYang08 · Pull Request #917 · Pipelex/pipelex

matthewYang08 · 2026-05-18T14:32:31Z

🔗 Related Issues

#521

📝 Description

🔄 Type of Change

🧪 Tests

📋 Checklist

Linting / type-checking / unit tests pass locally with my changes
I've added tests that prove my fix is effective or that my feature works
My code follows the style guidelines of this project
I've made corresponding changes to the documentation
My changes generate no new warnings

Summary by cubic

Hide field names for image-only collections in prompt rendering to reduce noise, while keeping inner keys for dict image groups and labels for single-image fields. Extend nested image discovery to handle dict containers. Addresses Linear 521.

Bug Fixes
- Rendering: omit outer field names for image-only collections (list, tuple, dict, ListContent); keep dict item keys for context; keep field names when mixed with non-images.
- Discovery: treat dict[...] as a generic container when searching for nested image fields.
- Tests: added coverage for dicts, tuples, and mixed dicts; updated expected prompt output.

^{Written for commit d632696. Summary will update on new commits. Review in cubic}

greptile-apps · 2026-05-18T14:34:24Z

Greptile Summary

This PR updates image collection handling in structured prompt rendering. The main changes are:

Adds dict support to nested image field discovery.
Omits labels for image-only list, tuple, and dict rendering.
Adds tests for dict and tuple image placeholders.

Confidence Score: 3/5

These issues should be fixed before merging.

Dict key types can be mistaken for extractable prompt images.
Optional image types inside containers can be missed by nested image discovery.
Some image collection prompts can lose role or field context.
pipelex/core/stuffs/image_field_search.py
pipelex/core/stuffs/structured_content.py

Important Files Changed

Filename	Overview
pipelex/core/stuffs/image_field_search.py	Adds dict traversal to image field detection, but key types and nested unions need stricter handling.
pipelex/core/stuffs/structured_content.py	Changes prompt rendering for image-only collections, with label-loss and ListContent consistency issues.

Comments Outside Diff (3)

pipelex/core/stuffs/image_field_search.py, line 147-157 (link)

Dict keys misclassified

This now scans every dict type argument, including the key type. A field like dict[ImageContent, str] gets reported as an image field even though the rendered/extracted prompt images can only come from values. That can make template analysis classify obj.map as an image reference, and direct extraction later asks working memory for an ImageContent where the actual attribute is a dict.

Prompt To Fix With AI

This is a comment left during a code review.
Path: pipelex/core/stuffs/image_field_search.py
Line: 147-157

Comment:
**Dict keys misclassified**

This now scans every `dict` type argument, including the key type. A field like `dict[ImageContent, str]` gets reported as an image field even though the rendered/extracted prompt images can only come from values. That can make template analysis classify `obj.map` as an image reference, and direct extraction later asks working memory for an `ImageContent` where the actual attribute is a dict.

How can I resolve this? If you propose a fix, please make it concise.

pipelex/core/stuffs/image_field_search.py, line 147-153 (link)

Nested optionals are skipped

Top-level optional fields are unwrapped before this helper runs, but optional image types inside containers are not. For list[ImageContent | None] or dict[str, ImageContent | None], the container arg is a union, so this branch neither recurses nor checks ImageContent. Those valid image collections are missed, which can make | with_images validation reject a structure that does contain nested images.

Prompt To Fix With AI

This is a comment left during a code review.
Path: pipelex/core/stuffs/image_field_search.py
Line: 147-153

Comment:
**Nested optionals are skipped**

Top-level optional fields are unwrapped before this helper runs, but optional image types inside containers are not. For `list[ImageContent | None]` or `dict[str, ImageContent | None]`, the container arg is a union, so this branch neither recurses nor checks `ImageContent`. Those valid image collections are missed, which can make `| with_images` validation reject a structure that does contain nested images.

How can I resolve this? If you propose a fix, please make it concise.

pipelex/core/stuffs/structured_content.py, line 172-173 (link)

ListContent bypasses detection

ListContent implements ImageRenderable, so this branch returns before _is_image_collection_value() can inspect its items. A StructuredContent field containing ListContent[ImageContent] still renders with the field label, while an equivalent plain list[ImageContent] now renders without it. This makes equivalent image collections produce different prompt text.

Prompt To Fix With AI

This is a comment left during a code review.
Path: pipelex/core/stuffs/structured_content.py
Line: 172-173

Comment:
**ListContent bypasses detection**

`ListContent` implements `ImageRenderable`, so this branch returns before `_is_image_collection_value()` can inspect its items. A `StructuredContent` field containing `ListContent[ImageContent]` still renders with the field label, while an equivalent plain `list[ImageContent]` now renders without it. This makes equivalent image collections produce different prompt text.

How can I resolve this? If you propose a fix, please make it concise.

Prompt To Fix All With AI

Fix the following 4 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 4
pipelex/core/stuffs/image_field_search.py:147-157
**Dict keys misclassified**

This now scans every `dict` type argument, including the key type. A field like `dict[ImageContent, str]` gets reported as an image field even though the rendered/extracted prompt images can only come from values. That can make template analysis classify `obj.map` as an image reference, and direct extraction later asks working memory for an `ImageContent` where the actual attribute is a dict.

### Issue 2 of 4
pipelex/core/stuffs/image_field_search.py:147-153
**Nested optionals are skipped**

Top-level optional fields are unwrapped before this helper runs, but optional image types inside containers are not. For `list[ImageContent | None]` or `dict[str, ImageContent | None]`, the container arg is a union, so this branch neither recurses nor checks `ImageContent`. Those valid image collections are missed, which can make `| with_images` validation reject a structure that does contain nested images.

### Issue 3 of 4
pipelex/core/stuffs/structured_content.py:148-151
**Image labels are dropped**

For image-only collections this drops the outer field name entirely, and the dict branch below drops keys as well. A structure like `photos_by_role={"before": img1, "after": img2}` now renders only bare `[Image N]` tokens, so the prompt no longer tells the model which image is which. That loses semantic context for common image comparison and gallery prompts.

### Issue 4 of 4
pipelex/core/stuffs/structured_content.py:172-173
**ListContent bypasses detection**

`ListContent` implements `ImageRenderable`, so this branch returns before `_is_image_collection_value()` can inspect its items. A `StructuredContent` field containing `ListContent[ImageContent]` still renders with the field label, while an equivalent plain `list[ImageContent]` now renders without it. This makes equivalent image collections produce different prompt text.

_{Reviews (1): Last reviewed commit: "Merge branch 'dev' into fix/521-image-co..." | Re-trigger Greptile}

matthewYang08 · 2026-05-18T14:34:30Z

Hi maintainers, this PR is ready for CI.
Could you please approve workflows so checks can run? Thanks.

greptile-apps · 2026-05-18T14:34:35Z

+                if self._is_image_collection_value(field_value):
+                    parts.append(rendered)
+                else:
+                    parts.append(f"{field_name}: {rendered}")


Image labels are dropped

For image-only collections this drops the outer field name entirely, and the dict branch below drops keys as well. A structure like photos_by_role={"before": img1, "after": img2} now renders only bare [Image N] tokens, so the prompt no longer tells the model which image is which. That loses semantic context for common image comparison and gallery prompts.

Prompt To Fix With AI

This is a comment left during a code review. Path: pipelex/core/stuffs/structured_content.py Line: 148-151 Comment: **Image labels are dropped** For image-only collections this drops the outer field name entirely, and the dict branch below drops keys as well. A structure like `photos_by_role={"before": img1, "after": img2}` now renders only bare `[Image N]` tokens, so the prompt no longer tells the model which image is which. That loses semantic context for common image comparison and gallery prompts. How can I resolve this? If you propose a fix, please make it concise.

cubic-dev-ai

No issues found across 6 files

_{Re-trigger cubic}

thomashebrard · 2026-05-25T13:40:18Z

-                # Check if this container or its nested contents have images
+            # Check if it's a generic container type (list/tuple/dict).
+            # Example: list[ImageContent], tuple[ImageContent, ...], dict[str, ImageContent].
+            if hasattr(field_specific_type, "__origin__") and field_specific_type.__origin__ in {list, tuple, dict}:  # type: ignore[union-attr]


Now that dict is included here, check_generic_container_for_images iterates dict.__args__ indiscriminately, both the key type and value type. So a field typed dict[ImageContent, str] would be reported as
an image field via the key type, and downstream template analysis would treat it as an image reference.

In practice this is rare (Pipelex dicts are essentially always dict[str, ...]), but since we're adding the dict branch, let's make it robust. Could you tighten check_generic_container_for_images so that when
container_type.__origin__ is dict, it only recurses into the value type, not the key?

and add a test

thomashebrard · 2026-05-25T13:41:32Z

+        # Check if arg_type is itself a generic container - recurse!
+        if hasattr(arg_type, "__origin__") and arg_type.__origin__ in {list, tuple, dict}:  # type: ignore[union-attr]
            if check_generic_container_for_images(arg_type):


Top-level optionals are unwrapped before this helper runs, but optionals inside containers aren't. For list[ImageContent | None] or dict[str, ImageContent | None], the inner arg_type is a Union /
UnionType. Its __origin__ is typing.Union, which isn't in {list, tuple, dict}, and isinstance(arg_type, type) is False — so the loop falls through and the collection is missed. | with_images
validation would then reject a perfectly valid structure.

This is technically pre-existing (the same gap exists for list[ImageContent | None] before your change), so if you'd rather scope this PR tightly, we can land it separately. But it'd be a clean addition here:
when an arg_type is a Union/UnionType, iterate its non-None members and run the same checks on each.

thomashebrard · 2026-05-25T13:41:51Z

Hey @matthewYang08 , thanks for your contribution and the fix, its great! but there is something missing:

Issue #521 specifically names ListContent[ImageContent] as the broken scenario, and your new _is_image_collection_value helper has a dedicated
ListContent branch — but no test exercises it. Could you add a test in tests/unit/pipelex/core/stuffs/test_structured_content_render_with_images.py covering a StructuredContent field of type

thomashebrard · 2026-05-25T13:42:19Z

You can also merge dev inside your branch, it is behind

matthewYang08 and others added 3 commits May 18, 2026 22:13

llm: omit image collection names in prompt placeholders

b05c032

llm: detect dict image collections in nested image discovery

c8a3732

Merge branch 'dev' into fix/521-image-collection-placeholders

21bcb67

greptile-apps Bot reviewed May 18, 2026

View reviewed changes

cubic-dev-ai Bot reviewed May 18, 2026

View reviewed changes

llm: preserve semantic keys for dict image collections

d632696

thomashebrard reviewed May 25, 2026

View reviewed changes

thomashebrard mentioned this pull request May 25, 2026

llm: omit image collection names in prompt placeholders #916

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix/521 image collection placeholders#917

Fix/521 image collection placeholders#917
matthewYang08 wants to merge 4 commits into
Pipelex:devfrom
matthewYang08:fix/521-image-collection-placeholders

matthewYang08 commented May 18, 2026 •

edited by thomashebrard

Loading

Uh oh!

greptile-apps Bot commented May 18, 2026 •

edited

Loading

Comments Outside Diff (3)

Uh oh!

matthewYang08 commented May 18, 2026

Uh oh!

greptile-apps Bot May 18, 2026

Uh oh!

cubic-dev-ai Bot left a comment

Uh oh!

thomashebrard May 25, 2026

Uh oh!

thomashebrard May 25, 2026

Uh oh!

thomashebrard commented May 25, 2026 •

edited

Loading

Uh oh!

thomashebrard commented May 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

matthewYang08 commented May 18, 2026 • edited by thomashebrard Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔗 Related Issues

📝 Description

🔄 Type of Change

🧪 Tests

📋 Checklist

Summary by cubic

Uh oh!

greptile-apps Bot commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 3/5

Important Files Changed

Comments Outside Diff (3)

Uh oh!

matthewYang08 commented May 18, 2026

Uh oh!

greptile-apps Bot May 18, 2026

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

thomashebrard May 25, 2026

Choose a reason for hiding this comment

Uh oh!

thomashebrard May 25, 2026

Choose a reason for hiding this comment

Uh oh!

thomashebrard commented May 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

thomashebrard commented May 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

matthewYang08 commented May 18, 2026 •

edited by thomashebrard

Loading

greptile-apps Bot commented May 18, 2026 •

edited

Loading

thomashebrard commented May 25, 2026 •

edited

Loading