feat: add Z.Ai model provider #2107
base: main
Conversation
Summary of Changes

Hello @ous50, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request introduces the Z.Ai model provider, significantly expanding the available large language models within the system. It integrates a comprehensive suite of Z.Ai's LLMs, including vision-enabled models with multimodal input capabilities, and lays the groundwork for text embedding models. The changes also include detailed parameter configurations for these models, ensuring flexible and robust interaction.
Code Review
This pull request introduces a new model provider, Z.Ai. The implementation is comprehensive, covering multiple models and their configurations. However, there are several areas that require attention to improve performance, reliability, and maintainability. Key issues include a potentially misconfigured memory limit, inefficient API usage for embeddings, incomplete error handling, and some inconsistencies in the configuration files. Addressing these points will significantly enhance the quality and robustness of the new provider.
```yaml
models:
  - provider/z_ai.yaml
resource:
  memory: 4194304
```
The memory resource limit is set to 4194304 bytes (4MB). This is a very low amount of memory for a Python application that needs to load SDKs and handle API requests. It's highly likely to cause Out Of Memory (OOM) errors, leading to plugin crashes. Please increase this limit to a more reasonable value, for example, 67108864 (64MB) or higher, depending on the plugin's actual memory footprint.
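As a minimal sketch of that change, assuming the manifest keeps its current layout (the 64MB figure is a starting point, not a measured footprint):

```yaml
# manifest.yaml (sketch): raise the limit from 4MB; profile the plugin
# before settling on a final number.
resource:
  memory: 67108864  # 64 * 1024 * 1024 bytes
```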
```python
        return {
            InvokeConnectionError: [],
            InvokeServerUnavailableError: [],
            InvokeRateLimitError: [],
            InvokeAuthorizationError: [],
            InvokeBadRequestError: [],
        }
```
The _invoke_error_mapping dictionary is empty. This means that specific errors from the zai-sdk (like authentication errors, rate limits, or server errors) will not be mapped to Dify's standardized InvokeError types. Instead, they will be caught as generic exceptions, leading to poor error handling and unhelpful error messages for the user. Please populate this mapping with the appropriate exception types from the zai-sdk.
For example (the actual exception names from zai-sdk might differ):
```python
from zai.error import AuthenticationError, RateLimitError, APIConnectionError, APIStatusError
...
return {
    InvokeConnectionError: [APIConnectionError],
    InvokeServerUnavailableError: [APIStatusError],
    InvokeRateLimitError: [RateLimitError],
    InvokeAuthorizationError: [AuthenticationError],
    InvokeBadRequestError: [],
}
```

```yaml
        default: true
    - name: max_tokens
      use_template: max_tokens
      default: 16384
```
The default value for max_tokens is set to 16384, which is the maximum allowed value for this model. This is a risky default as it can lead to unexpectedly high token consumption and costs for users who don't explicitly set a lower value. It's recommended to set a more conservative default, such as 4096.
Suggested change:

```yaml
      default: 4096
```
models/z_ai/models/llm/llm.py (outdated)
| """ | ||
| extra_model_kwargs = {} | ||
| # request to glm-4v-plus with stop words will always respond "finish_reason":"network_error" | ||
| if stop and model != "glm-4v-plus": |
The code checks for model != "glm-4v-plus" to apply stop words, with a comment indicating this model has issues. However, glm-4v-plus is not a model provided by z_ai according to the configuration files. This appears to be a copy-paste error from another provider. If any of the z_ai vision models (like glm-4.5v) have similar issues with the stop parameter, the check should be updated to use the correct model name. Otherwise, this logic is incorrect and could lead to unexpected behavior.
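If it turns out that some z_ai vision model really does mishandle stop words, a hypothetical fix could key the check off an explicit deny-list instead of the leftover model name (the set contents below are illustrative, not confirmed behavior):

```python
def build_extra_kwargs(model: str, stop: list[str] | None) -> dict:
    """Pass stop words only to models known to handle them correctly."""
    # Hypothetical deny-list: whether glm-4.5v (or any z_ai model) actually
    # rejects stop words must be verified against the Z.AI API.
    stop_unsupported = {"glm-4.5v"}
    extra_model_kwargs: dict = {}
    if stop and model not in stop_unsupported:
        extra_model_kwargs["stop"] = stop
    return extra_model_kwargs
```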
```python
        embeddings = []
        embedding_used_tokens = 0
        for text in texts:
            response = client.embeddings.create(model=model, input=text)
            data = response.data[0]
            embeddings.append(data.embedding)
            embedding_used_tokens += response.usage.total_tokens
        return ([list(map(float, e)) for e in embeddings], embedding_used_tokens)
```
The embed_documents method iterates through the list of texts and makes a separate API call for each one. This is highly inefficient and will lead to poor performance, especially for a large number of documents. Most embedding APIs, including ZhipuAI's, support batching, where you can pass a list of texts in a single API call. Please refactor this to make a single batch request to improve performance.
Suggested change:

```diff
-        embeddings = []
-        embedding_used_tokens = 0
-        for text in texts:
-            response = client.embeddings.create(model=model, input=text)
-            data = response.data[0]
-            embeddings.append(data.embedding)
-            embedding_used_tokens += response.usage.total_tokens
-        return ([list(map(float, e)) for e in embeddings], embedding_used_tokens)
+        if not texts:
+            return [], 0
+        # Assuming the API supports batching. Please verify with zai-sdk documentation.
+        response = client.embeddings.create(model=model, input=texts)
+        embeddings = [data.embedding for data in response.data]
+        embedding_used_tokens = response.usage.total_tokens
+        return [list(map(float, e)) for e in embeddings], embedding_used_tokens
```
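If the endpoint caps the number of inputs per request (many embedding APIs do; the limit of 64 below is an assumption, not a documented zai-sdk value), the batch call can be chunked rather than reverting to one request per text:

```python
def embed_in_batches(client, model: str, texts: list[str], batch_size: int = 64):
    """Embed texts in fixed-size batches; batch_size is an assumed API limit."""
    embeddings: list[list[float]] = []
    used_tokens = 0
    for i in range(0, len(texts), batch_size):
        chunk = texts[i : i + batch_size]
        response = client.embeddings.create(model=model, input=chunk)
        # Assumes response.data preserves input order, as in the
        # OpenAI-compatible response shape used elsewhere in this file.
        embeddings.extend(list(map(float, d.embedding)) for d in response.data)
        used_tokens += response.usage.total_tokens
    return embeddings, used_tokens
```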
```markdown
# Configure
After installation, you need to get API keys from [Z.AI](https://z.ai/manage-apikey/apikey-list) and setup in Settings -> Model Provider.

![configure](./_assets/configure.png)
```
```yaml
model:
  enabled: true
  llm: true
  text_embedding: true
```
The permission for text_embedding is set to true, but the text embedding model type and its source files are commented out in provider/z_ai.yaml. This creates an inconsistency. If text embedding is not yet supported, this permission should be set to false to avoid confusion and potential issues with the platform's permission handling.
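A sketch of the consistent manifest block while embeddings stay disabled (re-enable it together with the commented-out sources in provider/z_ai.yaml):

```yaml
# manifest.yaml (sketch): keep permissions in sync with provider/z_ai.yaml.
model:
  enabled: true
  llm: true
  text_embedding: false  # embedding sources are commented out, so don't advertise it
```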
```yaml
input: '0.00'
output: '0.00'
cached: '0.00'
```
```diff
@@ -0,0 +1,3 @@
+dify_plugin<0.6.0,>=0.5.0
+zai-sdk>=0.0.3.1
+pydantic==2.8.2
```
The pydantic library is pinned to a specific version (==2.8.2). This can lead to dependency conflicts with dify_plugin or other plugins in the ecosystem, which may require a different version of pydantic. It is generally better to specify a compatible range (e.g., pydantic>=2.8.2,<3.0.0) or remove the pin entirely if it's not strictly necessary, allowing the dependency resolver to find a compatible version.
Suggested change:

```
pydantic>=2.8.2,<3.0.0
```
Pull Request Overview
This PR adds support for Z.AI as a new model provider, enabling integration with Z.AI's ChatGLM language models. The implementation includes configuration for multiple GLM model variants (glm-4.6, glm-4.5 series) with support for features like tool calling, vision capabilities, and web search.
Key Changes
- Added Z.AI model provider plugin with LLM support for 8 GLM model variants
- Implemented text embedding models (though currently commented out in configuration)
- Configured model parameters including temperature, top_p, max_tokens, web search, and thinking mode (a sketch of the rule format follows this list)
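For orientation, a rough sketch of what one such parameter rule looks like in a Dify model YAML (field names follow the plugin schema; the exact defaults and bounds in this PR may differ):

```yaml
# Illustrative parameter_rules entry; the actual glm-4.6 YAML in this PR
# may use different defaults and bounds.
parameter_rules:
  - name: temperature
    use_template: temperature
    default: 0.95
    min: 0.0
    max: 1.0
```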
Reviewed Changes
Copilot reviewed 23 out of 29 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| requirements.txt | Defines dependencies including dify_plugin, zai-sdk, and pydantic |
| provider/z_ai.yaml | Provider configuration with credential schema and model definitions |
| provider/z_ai.py | Provider validation logic using glm-4.5-flash for credential testing |
| models/_common.py | Common utility class for credential transformation and error mapping |
| models/llm/llm.py | LLM implementation with message handling, streaming, and tool support |
| models/llm/*.yaml | Configuration files for 8 GLM model variants with pricing and parameters |
| models/text_embedding/* | Text embedding implementation (currently disabled) |
| manifest.yaml | Plugin manifest with version 0.0.1 and Python 3.12 configuration |
| main.py | Plugin entry point |
| _assets/* | Icon files for the provider |
| README.md | Documentation for setup and API key configuration |
```python
            message_text = f"{human_prompt} {content}"
        elif isinstance(message, AssistantPromptMessage):
            message_text = f"{ai_prompt} {content}"
        elif isinstance(message, SystemPromptMessage | ToolPromptMessage):
```
Copilot AI · Nov 20, 2025
[nitpick] The isinstance check on line 505 uses a union pattern SystemPromptMessage | ToolPromptMessage which is only supported in Python 3.10+. However, the manifest.yaml specifies Python 3.12, so this is acceptable. For broader compatibility, consider using isinstance(message, (SystemPromptMessage, ToolPromptMessage)) tuple syntax instead.
Suggested change:

```diff
-        elif isinstance(message, SystemPromptMessage | ToolPromptMessage):
+        elif isinstance(message, (SystemPromptMessage, ToolPromptMessage)):
```
It seems the latest commit is not related to this PR.

May I confirm which commits you were referring to? I rebased on upstream while making new changes. In case it causes any further confusion, I have dropped that commit.

LGTM now; all commits and modified files are under the z_ai folder.
/gemini review |
Code Review
This pull request introduces a new model provider for Z.AI, including support for several large language models and vision models, with initial code for text embedding. The implementation is well-organized, but I have identified several areas for improvement. Key issues include a broken image link in the documentation, an inconsistency in the manifest file regarding text embedding permissions, and an unimplemented error mapping that could lead to unhandled exceptions. Additionally, there are some minor typos, an overly broad exception catch, and a significant performance issue in the text embedding implementation where texts are not batched for API calls. Addressing these points will improve the robustness and quality of the new provider.
```python
    def _invoke_error_mapping(self) -> dict[type[InvokeError], list[type[Exception]]]:
        """
        Map model invoke error to unified error
        The key is the error type thrown to the caller
        The value is the error type thrown by the model,
        which needs to be converted into a unified error type for the caller.

        :return: Invoke error mapping
        """
        return {
            InvokeConnectionError: [],
            InvokeServerUnavailableError: [],
            InvokeRateLimitError: [],
            InvokeAuthorizationError: [],
            InvokeBadRequestError: [],
        }
```
The _invoke_error_mapping dictionary is empty. This is intended to map provider-specific exceptions to the framework's unified InvokeError types. Without this mapping, exceptions from the zai-sdk will not be handled gracefully by the framework, potentially causing crashes. Please populate this mapping with the relevant exceptions from the zai-sdk. For example, you should map zai.error.APIStatusError to the appropriate InvokeError subclasses.
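A rough sketch of a populated mapping, reusing the hypothetical zai.error names from the earlier review comment; both the module path and the class names are unverified assumptions to check against the installed zai-sdk:

```python
# Unified error types from the Dify plugin SDK.
from dify_plugin.errors.model import (
    InvokeAuthorizationError,
    InvokeBadRequestError,
    InvokeConnectionError,
    InvokeError,
    InvokeRateLimitError,
    InvokeServerUnavailableError,
)

# Hypothetical zai-sdk exceptions; verify the real names before shipping.
from zai.error import (
    APIConnectionError,
    APIStatusError,
    AuthenticationError,
    RateLimitError,
)


class _CommonZhipuaiAI:
    def _invoke_error_mapping(self) -> dict[type[InvokeError], list[type[Exception]]]:
        """Map zai-sdk exceptions onto Dify's unified InvokeError types."""
        return {
            InvokeConnectionError: [APIConnectionError],
            InvokeServerUnavailableError: [APIStatusError],
            InvokeRateLimitError: [RateLimitError],
            InvokeAuthorizationError: [AuthenticationError],
            InvokeBadRequestError: [],
        }
```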
```python
        embeddings = []
        embedding_used_tokens = 0
        for text in texts:
            response = client.embeddings.create(model=model, input=text)
            data = response.data[0]
            embeddings.append(data.embedding)
            embedding_used_tokens += response.usage.total_tokens
        return ([list(map(float, e)) for e in embeddings], embedding_used_tokens)
```
The embed_documents method makes a separate API call for each text in a loop. This is inefficient and can lead to poor performance and rate limiting issues. The ZhipuAI embeddings API supports batching multiple texts in a single request. Please refactor this to send all texts in a single API call.
Suggested change:

```diff
-        embeddings = []
-        embedding_used_tokens = 0
-        for text in texts:
-            response = client.embeddings.create(model=model, input=text)
-            data = response.data[0]
-            embeddings.append(data.embedding)
-            embedding_used_tokens += response.usage.total_tokens
-        return ([list(map(float, e)) for e in embeddings], embedding_used_tokens)
+        if not texts:
+            return [], 0
+        response = client.embeddings.create(model=model, input=texts)
+        embeddings = [data.embedding for data in response.data]
+        embedding_used_tokens = response.usage.total_tokens
+        return [list(map(float, e)) for e in embeddings], embedding_used_tokens
```
```markdown
# Configure
After installation, you need to get API keys from [Z.AI](https://z.ai/manage-apikey/apikey-list) and setup in Settings -> Model Provider.

![configure](./_assets/configure.png)
```
```yaml
model:
  enabled: true
  llm: true
  text_embedding: true
```
```python
from .._common import _CommonZhipuaiAI

viso_models = [
```
```python
        )
        try:
            schema = json.loads(json_schema)
        except Exception:
```
Pull request overview
Copilot reviewed 23 out of 29 changed files in this pull request and generated 11 comments.
```python
        except CredentialsValidateFailedError as ex:
            raise ex
```
Copilot AI · Nov 24, 2025
The error handling re-raises exceptions without adding context. Lines 21-25 catch exceptions just to re-raise them, which adds no value. Either add meaningful error context/logging or remove this redundant try-except block.
Suggested change:

```diff
-        except CredentialsValidateFailedError as ex:
-            raise ex
```
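If the author prefers keeping the block, the other option Copilot mentions is to attach context rather than bare-re-raise. A sketch, assuming the zai-sdk client exposes the OpenAI-style chat call used elsewhere in this PR:

```python
from dify_plugin.errors.model import CredentialsValidateFailedError


def validate_credentials_sketch(client, model: str = "glm-4.5-flash") -> None:
    """Hypothetical credential check that wraps failures with context."""
    try:
        client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": "ping"}],
        )
    except CredentialsValidateFailedError:
        raise  # already the unified type; re-wrapping adds nothing
    except Exception as ex:
        # Surface the provider error so users can see why validation failed.
        raise CredentialsValidateFailedError(
            f"Z.AI credential check failed: {ex}"
        ) from ex
```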
```diff
@@ -0,0 +1,3 @@
+dify_plugin<0.6.0,>=0.5.0
+zai-sdk>=0.0.3.1
+pydantic==2.8.2
```
Copilot AI · Nov 24, 2025
Pinning pydantic==2.8.2 to a specific version can cause dependency conflicts with other packages. Consider using a version range like pydantic>=2.8.2,<3.0.0 to allow for compatible updates while maintaining stability.
Suggested change:

```diff
-pydantic==2.8.2
+pydantic>=2.8.2,<3.0.0
```
```python
    def _invoke_error_mapping(self) -> dict[type[InvokeError], list[type[Exception]]]:
        """
        Map model invoke error to unified error
        The key is the error type thrown to the caller
        The value is the error type thrown by the model,
        which needs to be converted into a unified error type for the caller.

        :return: Invoke error mapping
        """
        return {
            InvokeConnectionError: [],
            InvokeServerUnavailableError: [],
            InvokeRateLimitError: [],
            InvokeAuthorizationError: [],
            InvokeBadRequestError: [],
        }
```
Copilot AI · Nov 24, 2025
The _invoke_error_mapping property returns empty lists for all error types. This means no SDK-specific exceptions are being mapped to Dify's unified error types, which could result in poor error handling and unclear error messages to users when the API fails.
Related Issues or Context
- This PR contains Changes to Model-Plugin
- This PR contains Changes to Non-LLM Models Plugin
- This PR contains Changes to LLM Models Plugin

Version Control (Any Changes to the Plugin Will Require Bumping the Version)
- Version bumped (Version Field, Not in Meta Section)

Dify Plugin SDK Version
- `dify_plugin>=0.3.0,<0.6.0` is in requirements.txt (SDK docs)

Environment Verification (If Any Code Changes)
- Local Deployment Environment
- SaaS Environment