fix: partially support image-text-to-text models on HuggingFaceLocalChatGenerator by srini047 · Pull Request #10928 · deepset-ai/haystack

srini047 · 2026-03-26T06:57:09Z

Related Issues

fixes Add image-text-to-text task support to HuggingFaceLocalChatGenerator #10926

Proposed Changes:

Currently, Haystack doesn't support image-text-to-text generation. This PR adds support for it.

How did you test it?

Added test cases that use mocks to validate the change.

Notes to the reviewer

If you want me to revert the changes for `test_init_text2text_generation_raises_error`, let me know.

Checklist

I have read the contributors guidelines and the code of conduct.
I have updated the related issue with new insights and changes.
I have added unit tests and updated the docstrings.
I've used one of the conventional commit types for my PR title: fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test: and added ! in case the PR includes breaking changes.
I have documented my code.
I have added a release note file, following the contributors guidelines.
I have run pre-commit hooks and fixed any issue.

vercel · 2026-03-26T06:57:15Z

@srini047 is attempting to deploy a commit to the deepset Team on Vercel.

A member of the Team first needs to authorize it.

anakin87

Thank you for the contribution!

See #10926 (comment)

Your fix looks good.

I left a few comments.

anakin87 · 2026-03-26T16:42:34Z

haystack/components/generators/chat/hugging_face_local.py

            - `text-generation`: Supported by decoder models, like GPT.
            - `text2text-generation`: Deprecated as of Transformers v5; use `text-generation` instead.
              Previously supported by encoder–decoder models such as T5.
+            - `image-text-to-text`: Supported by vision-language models, like Qwen2-VL.


Suggested change

- `image-text-to-text`: Supported by vision-language models, like Qwen2-VL.

- `image-text-to-text`: Supported by vision-language models.

This would need fewer updates in the future.
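To illustrate the task list in the docstring above, here is a minimal sketch of how a task string could be checked. It is not Haystack's actual implementation: `check_task`, `SUPPORTED_TASKS`, and `DEPRECATED_TASKS` are hypothetical names, and the only facts taken from the thread are the three task names and the Transformers v5 deprecation of `text2text-generation`.

```python
# Hypothetical sketch of task validation; names are illustrative only
# and do not come from the Haystack codebase.
SUPPORTED_TASKS = {"text-generation", "image-text-to-text"}
DEPRECATED_TASKS = {"text2text-generation"}  # deprecated as of Transformers v5


def check_task(task: str, transformers_version: str) -> str:
    """Return the task if it is usable, otherwise raise ValueError."""
    # Only the major version matters for the deprecation check.
    major = int(transformers_version.split(".")[0])
    if task in DEPRECATED_TASKS:
        if major >= 5:
            raise ValueError(
                f"Task '{task}' is deprecated as of Transformers v5; "
                "use 'text-generation' instead."
            )
        return task
    if task not in SUPPORTED_TASKS:
        raise ValueError(f"Task '{task}' is not supported.")
    return task
```

Under these assumptions, `check_task("image-text-to-text", "5.0.0")` returns the task unchanged, while `check_task("text2text-generation", "5.0.0")` raises `ValueError`.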

anakin87 · 2026-03-26T16:43:26Z

test/components/generators/chat/test_hugging_face_local.py

+    @patch("haystack.components.generators.chat.hugging_face_local.transformers")
+    def test_init_text2text_generation_raises_error(self, version_mock):
+        version_mock.__version__ = "5.0.0"


This should not be needed, since we run tests with Transformers >=5.0.0.

anakin87 · 2026-03-26T16:46:45Z

test/components/generators/chat/test_hugging_face_local.py

+    @patch("haystack.components.generators.chat.hugging_face_local.pipeline")
+    def test_run_image_text_to_text(self, pipeline_init_mock, model_info_mock, chat_messages):


1. The `test_run_image_text_to_text` test can be removed.
2. I'd add a very simple test like this: if the initialization is successful, we are OK (it's currently failing on main).

    llm = HuggingFaceLocalChatGenerator(model="Qwen/Qwen3.5-0.8B")
    assert llm

srini047 requested a review from a team as a code owner March 26, 2026 06:57

srini047 requested review from anakin87 and removed request for a team March 26, 2026 06:57

github-actions bot added the topic:tests and type:documentation labels Mar 26, 2026

fix: add support for image-text-to-text generation for hf

4d345be

srini047 force-pushed the hf-image-text-to-text branch from 8030e14 to 4d345be Compare March 26, 2026 07:00

srini047 changed the title ~~fix: add support for image-text-to-text generation for hf~~ feat: add support for image-text-to-text generation for hf Mar 26, 2026

anakin87 changed the title ~~feat: add support for image-text-to-text generation for hf~~ fix: partially support image-text-to-text models on HuggingFaceLocalChatGenerator Mar 26, 2026

anakin87 requested changes Mar 26, 2026

srini047 wants to merge 1 commit into deepset-ai:main from srini047:hf-image-text-to-text

2 participants
