Skip to content

llm: omit image collection names in prompt placeholders#916

Closed
matthewYang08 wants to merge 2 commits into
Pipelex:mainfrom
matthewYang08:fix-521-image-collection-placeholders
Closed

llm: omit image collection names in prompt placeholders#916
matthewYang08 wants to merge 2 commits into
Pipelex:mainfrom
matthewYang08:fix-521-image-collection-placeholders

Conversation

@matthewYang08
Copy link
Copy Markdown

@matthewYang08 matthewYang08 commented May 18, 2026

Summary

This PR addresses #521 by omitting collection names when rendering image collections into prompt placeholders.

What changed

  • Updated StructuredContent.render_with_images() behavior:
    • Omit field/key names only for image collections (lists/tuples/dicts of images).
    • Keep semantic keys for mixed-content dicts.
    • Keep non-collection image field labels (e.g. page_view) unchanged.
  • Added regression tests for:
    • Plain image list rendering without collection name
    • Dict image collection rendering without keys
    • Tuple image collection behavior
    • Mixed dict behavior preserving semantic keys
    • Prompt extraction expectation for nested with_images

Validation

  • pytest tests/unit/pipelex/core/stuffs/test_structured_content_render_with_images.py
  • pytest tests/integration/pipelex/pipes/llm_prompt_inputs/test_prompt_image_extraction.py --disable-inference
  • ruff check on changed files
  • pyright on changed files

Refs #521


Summary by cubic

Updates prompt rendering to omit names for image-only collections (lists, tuples, dicts) and keep labels for single-image fields, reducing noise and aligning with image extraction (addresses #521). Also extends nested image discovery to detect dict-based image collections.

  • Bug Fixes
    • Updated StructuredContent.render_with_images() to hide names for image-only collections; preserve names for mixed dicts and non-collection fields.
    • Improved image-only detection across nested sequences, dicts, and ListContent via _is_image_collection_value() and _is_image_only_value().
    • Extended nested image discovery to include dict generics in search_for_nested_image_fields()/check_generic_container_for_images(); added tests for dict containers, tuples, nested lists, mixed dicts, and label preservation in prompt extraction.

Written for commit c8a3732. Summary will update on new commits. Review in cubic

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 18, 2026

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@matthewYang08
Copy link
Copy Markdown
Author

I have read the CLA Document and I hereby sign the CLA

@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented May 18, 2026

Greptile Summary

This PR changes how image collections are rendered in prompt placeholders. The main changes are:

  • Omits field names for image-only lists, tuples, and dicts.
  • Keeps labels for single image fields such as page_view.
  • Keeps semantic keys for mixed dicts with image and non-image values.
  • Adds regression coverage for collection rendering and nested prompt extraction.

Confidence Score: 4/5

This is close, but the dict image collection path should be fixed before merging.

  • Dict image collections can render in StructuredContent but can still be rejected by prompt validation.

  • Existing image-field discovery does not appear to recognize dict containers.

  • List, tuple, mixed dict, and single-image behavior look covered by the changed tests.

  • pipelex/core/stuffs/image_field_search.py should be updated alongside the new dict rendering behavior.

Important Files Changed

Filename Overview
pipelex/core/stuffs/structured_content.py Adds image-only collection detection and suppresses labels for those rendered collections.
tests/unit/pipelex/core/stuffs/test_structured_content_render_with_images.py Adds unit coverage for image-only lists, dicts, tuples, nested lists, and mixed dicts.
tests/integration/pipelex/pipes/llm_prompt_inputs/test_prompt_image_extraction.py Adds coverage that single nested image fields still keep their label.
Prompt To Fix All With AI
Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
pipelex/core/stuffs/structured_content.py:203-205
**Dict fields stay undiscoverable** The renderer now treats `dict[str, ImageContent]` as an image collection, but the image-field discovery path used by prompt validation still only recognizes list and tuple containers. A structured concept whose only images are in a dict field can render correctly here, but `{{ gallery | with_images }}` is rejected earlier as having no nested images, so this new dict support is unreachable in normal prompt flows.

Reviews (1): Last reviewed commit: "llm: omit image collection names in prom..." | Re-trigger Greptile

@matthewYang08
Copy link
Copy Markdown
Author

感谢你的投稿,我们非常感激。像许多开源项目一样,我们要求您签署贡献者许可协议,才能接受您的贡献。你只需发布一则拉取请求评论,格式与以下相同即可签署CLA。

我已阅读CLA文件,特此签署CLA

您可以通过在此拉取请求中评论recheck来重新触发该机器人。由CLA助理轻量机器人发布。

recheck

Comment on lines +203 to +205
if isinstance(value, dict):
dict_value = cast("dict[str, Any]", value)
return bool(dict_value) and all(self._is_image_only_value(item) for item in dict_value.values())
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Dict fields stay undiscoverable The renderer now treats dict[str, ImageContent] as an image collection, but the image-field discovery path used by prompt validation still only recognizes list and tuple containers. A structured concept whose only images are in a dict field can render correctly here, but {{ gallery | with_images }} is rejected earlier as having no nested images, so this new dict support is unreachable in normal prompt flows.

Prompt To Fix With AI
This is a comment left during a code review.
Path: pipelex/core/stuffs/structured_content.py
Line: 203-205

Comment:
**Dict fields stay undiscoverable** The renderer now treats `dict[str, ImageContent]` as an image collection, but the image-field discovery path used by prompt validation still only recognizes list and tuple containers. A structured concept whose only images are in a dict field can render correctly here, but `{{ gallery | with_images }}` is rejected earlier as having no nested images, so this new dict support is unreachable in normal prompt flows.

How can I resolve this? If you propose a fix, please make it concise.

Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 3 files

Re-trigger cubic

@github-actions github-actions Bot locked and limited conversation to collaborators May 25, 2026
@thomashebrard
Copy link
Copy Markdown
Member

Closed because it was pointing on main: see #917

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants