feat: add Z.Ai model provider #2107
base: main
Conversation
Summary of Changes

Hello @ous50, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request introduces the Z.Ai model provider, significantly expanding the available large language models within the system. It integrates a comprehensive suite of Z.Ai's LLMs, including vision-enabled models with multimodal input capabilities, and lays the groundwork for text embedding models. The changes also include detailed parameter configurations for these models, ensuring flexible and robust interaction.
Code Review
This pull request introduces a new model provider, Z.Ai. The implementation is comprehensive, covering multiple models and their configurations. However, there are several areas that require attention to improve performance, reliability, and maintainability. Key issues include a potentially misconfigured memory limit, inefficient API usage for embeddings, incomplete error handling, and some inconsistencies in the configuration files. Addressing these points will significantly enhance the quality and robustness of the new provider.
```yaml
models:
  - provider/z_ai.yaml
resource:
  memory: 4194304
```
The memory resource limit is set to 4194304 bytes (4MB). This is a very low amount of memory for a Python application that needs to load SDKs and handle API requests. It's highly likely to cause Out Of Memory (OOM) errors, leading to plugin crashes. Please increase this limit to a more reasonable value, for example, 67108864 (64MB) or higher, depending on the plugin's actual memory footprint.
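As a minimal sketch of that change, assuming the manifest keeps its current layout (the 64MB figure is a starting point, not a measured footprint):

```yaml
# manifest.yaml (sketch): raise the limit from 4MB; profile the plugin
# before settling on a final number.
resource:
  memory: 67108864  # 64 * 1024 * 1024 bytes
```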
```python
        return {
            InvokeConnectionError: [],
            InvokeServerUnavailableError: [],
            InvokeRateLimitError: [],
            InvokeAuthorizationError: [],
            InvokeBadRequestError: [],
        }
```
The _invoke_error_mapping dictionary is empty. This means that specific errors from the zai-sdk (like authentication errors, rate limits, or server errors) will not be mapped to Dify's standardized InvokeError types. Instead, they will be caught as generic exceptions, leading to poor error handling and unhelpful error messages for the user. Please populate this mapping with the appropriate exception types from the zai-sdk.
For example (the actual exception names from zai-sdk might differ):
```python
from zai.error import AuthenticationError, RateLimitError, APIConnectionError, APIStatusError
...
return {
    InvokeConnectionError: [APIConnectionError],
    InvokeServerUnavailableError: [APIStatusError],
    InvokeRateLimitError: [RateLimitError],
    InvokeAuthorizationError: [AuthenticationError],
    InvokeBadRequestError: [],
}
```

```yaml
        default: true
    - name: max_tokens
      use_template: max_tokens
      default: 16384
```
The default value for max_tokens is set to 16384, which is the maximum allowed value for this model. This is a risky default as it can lead to unexpectedly high token consumption and costs for users who don't explicitly set a lower value. It's recommended to set a more conservative default, such as 4096.
Suggested change:

```yaml
      default: 4096
```
models/z_ai/models/llm/llm.py (outdated)
| """ | ||
| extra_model_kwargs = {} | ||
| # request to glm-4v-plus with stop words will always respond "finish_reason":"network_error" | ||
| if stop and model != "glm-4v-plus": |
The code checks for model != "glm-4v-plus" to apply stop words, with a comment indicating this model has issues. However, glm-4v-plus is not a model provided by z_ai according to the configuration files. This appears to be a copy-paste error from another provider. If any of the z_ai vision models (like glm-4.5v) have similar issues with the stop parameter, the check should be updated to use the correct model name. Otherwise, this logic is incorrect and could lead to unexpected behavior.
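If it turns out that some z_ai vision model really does mishandle stop words, a hypothetical fix could key the check off an explicit deny-list instead of the leftover model name (the set contents below are illustrative, not confirmed behavior):

```python
def build_extra_kwargs(model: str, stop: list[str] | None) -> dict:
    """Pass stop words only to models known to handle them correctly."""
    # Hypothetical deny-list: whether glm-4.5v (or any z_ai model) actually
    # rejects stop words must be verified against the Z.AI API.
    stop_unsupported = {"glm-4.5v"}
    extra_model_kwargs: dict = {}
    if stop and model not in stop_unsupported:
        extra_model_kwargs["stop"] = stop
    return extra_model_kwargs
```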
```python
        embeddings = []
        embedding_used_tokens = 0
        for text in texts:
            response = client.embeddings.create(model=model, input=text)
            data = response.data[0]
            embeddings.append(data.embedding)
            embedding_used_tokens += response.usage.total_tokens
        return ([list(map(float, e)) for e in embeddings], embedding_used_tokens)
```
The embed_documents method iterates through the list of texts and makes a separate API call for each one. This is highly inefficient and will lead to poor performance, especially for a large number of documents. Most embedding APIs, including ZhipuAI's, support batching, where you can pass a list of texts in a single API call. Please refactor this to make a single batch request to improve performance.
Suggested change:

```diff
-        embeddings = []
-        embedding_used_tokens = 0
-        for text in texts:
-            response = client.embeddings.create(model=model, input=text)
-            data = response.data[0]
-            embeddings.append(data.embedding)
-            embedding_used_tokens += response.usage.total_tokens
-        return ([list(map(float, e)) for e in embeddings], embedding_used_tokens)
+        if not texts:
+            return [], 0
+        # Assuming the API supports batching. Please verify with zai-sdk documentation.
+        response = client.embeddings.create(model=model, input=texts)
+        embeddings = [data.embedding for data in response.data]
+        embedding_used_tokens = response.usage.total_tokens
+        return [list(map(float, e)) for e in embeddings], embedding_used_tokens
```
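If the endpoint caps the number of inputs per request (many embedding APIs do; the limit of 64 below is an assumption, not a documented zai-sdk value), the batch call can be chunked rather than reverting to one request per text:

```python
def embed_in_batches(client, model: str, texts: list[str], batch_size: int = 64):
    """Embed texts in fixed-size batches; batch_size is an assumed API limit."""
    embeddings: list[list[float]] = []
    used_tokens = 0
    for i in range(0, len(texts), batch_size):
        chunk = texts[i : i + batch_size]
        response = client.embeddings.create(model=model, input=chunk)
        # Assumes response.data preserves input order, as in the
        # OpenAI-compatible response shape used elsewhere in this file.
        embeddings.extend(list(map(float, d.embedding)) for d in response.data)
        used_tokens += response.usage.total_tokens
    return embeddings, used_tokens
```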
```markdown
# Configure
After installation, you need to get API keys from [Z.AI](https://z.ai/manage-apikey/apikey-list) and setup in Settings -> Model Provider.

![configure](./_assets/configure.png)
```
```yaml
model:
  enabled: true
  llm: true
  text_embedding: true
```
The permission for text_embedding is set to true, but the text embedding model type and its source files are commented out in provider/z_ai.yaml. This creates an inconsistency. If text embedding is not yet supported, this permission should be set to false to avoid confusion and potential issues with the platform's permission handling.
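A sketch of the consistent manifest block while embeddings stay disabled (re-enable it together with the commented-out sources in provider/z_ai.yaml):

```yaml
# manifest.yaml (sketch): keep permissions in sync with provider/z_ai.yaml.
model:
  enabled: true
  llm: true
  text_embedding: false  # embedding sources are commented out, so don't advertise it
```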
```yaml
input: '0.00'
output: '0.00'
cached: '0.00'
```
```diff
@@ -0,0 +1,3 @@
+dify_plugin<0.6.0,>=0.5.0
+zai-sdk>=0.0.3.1
+pydantic==2.8.2
```
The pydantic library is pinned to a specific version (==2.8.2). This can lead to dependency conflicts with dify_plugin or other plugins in the ecosystem, which may require a different version of pydantic. It is generally better to specify a compatible range (e.g., pydantic>=2.8.2,<3.0.0) or remove the pin entirely if it's not strictly necessary, allowing the dependency resolver to find a compatible version.
Suggested change:

```
pydantic>=2.8.2,<3.0.0
```
Pull Request Overview
This PR adds support for Z.AI as a new model provider, enabling integration with Z.AI's ChatGLM language models. The implementation includes configuration for multiple GLM model variants (glm-4.6, glm-4.5 series) with support for features like tool calling, vision capabilities, and web search.
Key Changes
- Added Z.AI model provider plugin with LLM support for 8 GLM model variants
- Implemented text embedding models (though currently commented out in configuration)
- Configured model parameters including temperature, top_p, max_tokens, web search, and thinking mode (a sketch of the rule format follows this list)
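For orientation, a rough sketch of what one such parameter rule looks like in a Dify model YAML (field names follow the plugin schema; the exact defaults and bounds in this PR may differ):

```yaml
# Illustrative parameter_rules entry; the actual glm-4.6 YAML in this PR
# may use different defaults and bounds.
parameter_rules:
  - name: temperature
    use_template: temperature
    default: 0.95
    min: 0.0
    max: 1.0
```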
Reviewed Changes
Copilot reviewed 23 out of 29 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| requirements.txt | Defines dependencies including dify_plugin, zai-sdk, and pydantic |
| provider/z_ai.yaml | Provider configuration with credential schema and model definitions |
| provider/z_ai.py | Provider validation logic using glm-4.5-flash for credential testing |
| models/_common.py | Common utility class for credential transformation and error mapping |
| models/llm/llm.py | LLM implementation with message handling, streaming, and tool support |
| models/llm/*.yaml | Configuration files for 8 GLM model variants with pricing and parameters |
| models/text_embedding/* | Text embedding implementation (currently disabled) |
| manifest.yaml | Plugin manifest with version 0.0.1 and Python 3.12 configuration |
| main.py | Plugin entry point |
| _assets/* | Icon files for the provider |
| README.md | Documentation for setup and API key configuration |
```python
            message_text = f"{human_prompt} {content}"
        elif isinstance(message, AssistantPromptMessage):
            message_text = f"{ai_prompt} {content}"
        elif isinstance(message, SystemPromptMessage | ToolPromptMessage):
```
Copilot AI · Nov 20, 2025
[nitpick] The isinstance check on line 505 uses a union pattern SystemPromptMessage | ToolPromptMessage which is only supported in Python 3.10+. However, the manifest.yaml specifies Python 3.12, so this is acceptable. For broader compatibility, consider using isinstance(message, (SystemPromptMessage, ToolPromptMessage)) tuple syntax instead.
Suggested change:

```diff
-        elif isinstance(message, SystemPromptMessage | ToolPromptMessage):
+        elif isinstance(message, (SystemPromptMessage, ToolPromptMessage)):
```
It seems the latest commit is not related to this PR.

May I confirm which commits you were referring to? I rebased on upstream while making new changes. In case it causes any further confusion, I have dropped that commit.

LGTM now; all commits and modified files are under the z_ai folder.
/gemini review |
Code Review
This pull request introduces a new model provider for Z.AI, including support for several large language models and vision models, with initial code for text embedding. The implementation is well-organized, but I have identified several areas for improvement. Key issues include a broken image link in the documentation, an inconsistency in the manifest file regarding text embedding permissions, and an unimplemented error mapping that could lead to unhandled exceptions. Additionally, there are some minor typos, an overly broad exception catch, and a significant performance issue in the text embedding implementation where texts are not batched for API calls. Addressing these points will improve the robustness and quality of the new provider.
```python
    def _invoke_error_mapping(self) -> dict[type[InvokeError], list[type[Exception]]]:
        """
        Map model invoke error to unified error
        The key is the error type thrown to the caller
        The value is the error type thrown by the model,
        which needs to be converted into a unified error type for the caller.

        :return: Invoke error mapping
        """
        return {
            InvokeConnectionError: [],
            InvokeServerUnavailableError: [],
            InvokeRateLimitError: [],
            InvokeAuthorizationError: [],
            InvokeBadRequestError: [],
        }
```
The _invoke_error_mapping dictionary is empty. This is intended to map provider-specific exceptions to the framework's unified InvokeError types. Without this mapping, exceptions from the zai-sdk will not be handled gracefully by the framework, potentially causing crashes. Please populate this mapping with the relevant exceptions from the zai-sdk. For example, you should map zai.error.APIStatusError to the appropriate InvokeError subclasses.
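A rough sketch of a populated mapping, reusing the hypothetical zai.error names from the earlier review comment; both the module path and the class names are unverified assumptions to check against the installed zai-sdk:

```python
# Unified error types from the Dify plugin SDK.
from dify_plugin.errors.model import (
    InvokeAuthorizationError,
    InvokeBadRequestError,
    InvokeConnectionError,
    InvokeError,
    InvokeRateLimitError,
    InvokeServerUnavailableError,
)

# Hypothetical zai-sdk exceptions; verify the real names before shipping.
from zai.error import (
    APIConnectionError,
    APIStatusError,
    AuthenticationError,
    RateLimitError,
)


class _CommonZhipuaiAI:
    def _invoke_error_mapping(self) -> dict[type[InvokeError], list[type[Exception]]]:
        """Map zai-sdk exceptions onto Dify's unified InvokeError types."""
        return {
            InvokeConnectionError: [APIConnectionError],
            InvokeServerUnavailableError: [APIStatusError],
            InvokeRateLimitError: [RateLimitError],
            InvokeAuthorizationError: [AuthenticationError],
            InvokeBadRequestError: [],
        }
```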
```python
        embeddings = []
        embedding_used_tokens = 0
        for text in texts:
            response = client.embeddings.create(model=model, input=text)
            data = response.data[0]
            embeddings.append(data.embedding)
            embedding_used_tokens += response.usage.total_tokens
        return ([list(map(float, e)) for e in embeddings], embedding_used_tokens)
```
The embed_documents method makes a separate API call for each text in a loop. This is inefficient and can lead to poor performance and rate limiting issues. The ZhipuAI embeddings API supports batching multiple texts in a single request. Please refactor this to send all texts in a single API call.
Suggested change:

```diff
-        embeddings = []
-        embedding_used_tokens = 0
-        for text in texts:
-            response = client.embeddings.create(model=model, input=text)
-            data = response.data[0]
-            embeddings.append(data.embedding)
-            embedding_used_tokens += response.usage.total_tokens
-        return ([list(map(float, e)) for e in embeddings], embedding_used_tokens)
+        if not texts:
+            return [], 0
+        response = client.embeddings.create(model=model, input=texts)
+        embeddings = [data.embedding for data in response.data]
+        embedding_used_tokens = response.usage.total_tokens
+        return [list(map(float, e)) for e in embeddings], embedding_used_tokens
```
```markdown
# Configure
After installation, you need to get API keys from [Z.AI](https://z.ai/manage-apikey/apikey-list) and setup in Settings -> Model Provider.

![configure](./_assets/configure.png)
```
```yaml
model:
  enabled: true
  llm: true
  text_embedding: true
```
```python
from .._common import _CommonZhipuaiAI

viso_models = [
```
```python
        )
        try:
            schema = json.loads(json_schema)
        except Exception:
```
Pull request overview
Copilot reviewed 23 out of 29 changed files in this pull request and generated 11 comments.
```python
        except CredentialsValidateFailedError as ex:
            raise ex
```
Copilot AI · Nov 24, 2025
The error handling re-raises exceptions without adding context. Lines 21-25 catch exceptions just to re-raise them, which adds no value. Either add meaningful error context/logging or remove this redundant try-except block.
Suggested change:

```diff
-        except CredentialsValidateFailedError as ex:
-            raise ex
```
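If the author prefers keeping the block, the other option Copilot mentions is to attach context rather than bare-re-raise. A sketch, assuming the zai-sdk client exposes the OpenAI-style chat call used elsewhere in this PR:

```python
from dify_plugin.errors.model import CredentialsValidateFailedError


def validate_credentials_sketch(client, model: str = "glm-4.5-flash") -> None:
    """Hypothetical credential check that wraps failures with context."""
    try:
        client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": "ping"}],
        )
    except CredentialsValidateFailedError:
        raise  # already the unified type; re-wrapping adds nothing
    except Exception as ex:
        # Surface the provider error so users can see why validation failed.
        raise CredentialsValidateFailedError(
            f"Z.AI credential check failed: {ex}"
        ) from ex
```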
```diff
@@ -0,0 +1,3 @@
+dify_plugin<0.6.0,>=0.5.0
+zai-sdk>=0.0.3.1
+pydantic==2.8.2
```
Copilot AI · Nov 24, 2025
Pinning pydantic==2.8.2 to a specific version can cause dependency conflicts with other packages. Consider using a version range like pydantic>=2.8.2,<3.0.0 to allow for compatible updates while maintaining stability.
Suggested change:

```diff
-pydantic==2.8.2
+pydantic>=2.8.2,<3.0.0
```
```python
    def _invoke_error_mapping(self) -> dict[type[InvokeError], list[type[Exception]]]:
        """
        Map model invoke error to unified error
        The key is the error type thrown to the caller
        The value is the error type thrown by the model,
        which needs to be converted into a unified error type for the caller.

        :return: Invoke error mapping
        """
        return {
            InvokeConnectionError: [],
            InvokeServerUnavailableError: [],
            InvokeRateLimitError: [],
            InvokeAuthorizationError: [],
            InvokeBadRequestError: [],
        }
```
Copilot AI · Nov 24, 2025
The _invoke_error_mapping property returns empty lists for all error types. This means no SDK-specific exceptions are being mapped to Dify's unified error types, which could result in poor error handling and unclear error messages to users when the API fails.
Related Issues or Context
- This PR contains Changes to Model-Plugin
- This PR contains Changes to Non-LLM Models Plugin
- This PR contains Changes to LLM Models Plugin

Version Control (Any Changes to the Plugin Will Require Bumping the Version)
- Version bumped (Version Field, Not in Meta Section)

Dify Plugin SDK Version
- `dify_plugin>=0.3.0,<0.6.0` is in requirements.txt (SDK docs)

Environment Verification (If Any Code Changes)
- Local Deployment Environment
- SaaS Environment