
Conversation


@ous50 ous50 commented Nov 20, 2025

Related Issues or Context

This PR contains Changes to Model-Plugin

  • Documentation
  • Other

This PR contains Changes to Non-LLM Models Plugin

  • I have Run Comprehensive Tests Relevant to My Changes

This PR contains Changes to LLM Models Plugin

  • My Changes Affect Message Flow Handling (System Messages and User→Assistant Turn-Taking)
  • My Changes Affect Tool Interaction Flow (Multi-Round Usage and Output Handling, for both Agent App and Agent Node)
  • My Changes Affect Multimodal Input Handling (Images, PDFs, Audio, Video, etc.)
  • My Changes Affect Multimodal Output Generation (Images, Audio, Video, etc.)
  • My Changes Affect Structured Output Format (JSON, XML, etc.)
  • My Changes Affect Token Consumption Metrics
  • My Changes Affect Other LLM Functionalities (Reasoning Process, Grounding, Prompt Caching, etc.)
  • Other Changes (Add New Models, Fix Model Parameters etc.)
(Screenshots attached: CleanShot 2025-11-20 at 17 37 04@2x, 17 27 11@2x, and 17 24 21@2x)

Version Control (Any Changes to the Plugin Will Require Bumping the Version)

  • I have Bumped Up the Version in Manifest.yaml (Top-Level Version Field, Not in Meta Section)

Dify Plugin SDK Version

  • I have Ensured dify_plugin>=0.3.0,<0.6.0 is in requirements.txt (SDK docs)

Environment Verification (If Any Code Changes)

Local Deployment Environment

  • Dify Version is: , I have Tested My Changes on Local Deployment Dify with a Clean Environment That Matches the Production Configuration.

SaaS Environment

  • I have Tested My Changes on cloud.dify.ai with a Clean Environment That Matches the Production Configuration

@dosubot dosubot bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Nov 20, 2025
@gemini-code-assist
Contributor

Summary of Changes

Hello @ous50, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces the Z.Ai model provider, significantly expanding the available large language models within the system. It integrates a comprehensive suite of Z.Ai's LLMs, including vision-enabled models with multimodal input capabilities, and lays the groundwork for text embedding models. The changes also include detailed parameter configurations for these models, ensuring flexible and robust interaction.

Highlights

  • New Model Provider Integration: The Z.Ai model provider has been integrated, expanding the range of available Large Language Models (LLMs) and laying the groundwork for text embedding models.
  • Addition of Z.Ai LLM Models: Multiple Z.Ai LLM models are now supported, including glm-4.6, glm-4.5, glm-4.5-x, glm-4.5-air, glm-4.5-airx, glm-4.5-flash, glm-4.5v, and glm-4-32b-0414-128k, each with detailed parameter configurations.
  • Multimodal Input Support: The integration includes specific handling for multimodal inputs (images and videos) for Z.Ai vision models like glm-4.5v, enhancing their capabilities.
  • Advanced Model Parameter Handling: New parameters such as web_search, thinking, and response_format are now supported for Z.Ai LLMs, offering more control over model behavior and output.
  • Text Embedding Models (Currently Disabled): Z.Ai text embedding models (embedding-2, embedding-3, text_embedding) have been added to the codebase, although they are currently commented out and not actively enabled in the plugin's manifest.
  • Plugin Versioning and Dependencies: The plugin's version has been updated in manifest.yaml, and new Python dependencies (zai-sdk, pydantic) have been added to requirements.txt.

@dosubot dosubot bot added the enhancement New feature or request label Nov 20, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a new model provider, Z.Ai. The implementation is comprehensive, covering multiple models and their configurations. However, there are several areas that require attention to improve performance, reliability, and maintainability. Key issues include a potentially misconfigured memory limit, inefficient API usage for embeddings, incomplete error handling, and some inconsistencies in the configuration files. Addressing these points will significantly enhance the quality and robustness of the new provider.

models:
- provider/z_ai.yaml
resource:
memory: 4194304
Contributor

high

The memory resource limit is set to 4194304 bytes (4MB). This is a very low amount of memory for a Python application that needs to load SDKs and handle API requests. It's highly likely to cause Out Of Memory (OOM) errors, leading to plugin crashes. Please increase this limit to a more reasonable value, for example, 67108864 (64MB) or higher, depending on the plugin's actual memory footprint.
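For example, a minimal manifest.yaml sketch (the 64MB figure is a starting point, not a measured requirement):

    resource:
      memory: 67108864  # 64 MB; tune to the plugin's measured footprint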

Comment on lines +34 to +40
return {
    InvokeConnectionError: [],
    InvokeServerUnavailableError: [],
    InvokeRateLimitError: [],
    InvokeAuthorizationError: [],
    InvokeBadRequestError: [],
}
Contributor

high

The _invoke_error_mapping dictionary is empty. This means that specific errors from the zai-sdk (like authentication errors, rate limits, or server errors) will not be mapped to Dify's standardized InvokeError types. Instead, they will be caught as generic exceptions, leading to poor error handling and unhelpful error messages for the user. Please populate this mapping with the appropriate exception types from the zai-sdk.

For example (the actual exception names from zai-sdk might differ):

from zai.error import AuthenticationError, RateLimitError, APIConnectionError, APIStatusError

...

    return {
        InvokeConnectionError: [APIConnectionError],
        InvokeServerUnavailableError: [APIStatusError],
        InvokeRateLimitError: [RateLimitError],
        InvokeAuthorizationError: [AuthenticationError],
        InvokeBadRequestError: [],
    }

default: true
- name: max_tokens
  use_template: max_tokens
  default: 16384
Contributor

high

The default value for max_tokens is set to 16384, which is the maximum allowed value for this model. This is a risky default as it can lead to unexpectedly high token consumption and costs for users who don't explicitly set a lower value. It's recommended to set a more conservative default, such as 4096.

    default: 4096
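For reference, a sketch of the full parameter entry with the safer default (the min/max bounds here are assumptions inferred from the 16384 ceiling mentioned above; verify them against the model's documented limits):

    - name: max_tokens
      use_template: max_tokens
      default: 4096
      min: 1
      max: 16384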

"""
extra_model_kwargs = {}
# request to glm-4v-plus with stop words will always respond "finish_reason":"network_error"
if stop and model != "glm-4v-plus":
Contributor

high

The code checks for model != "glm-4v-plus" to apply stop words, with a comment indicating this model has issues. However, glm-4v-plus is not a model provided by z_ai according to the configuration files. This appears to be a copy-paste error from another provider. If any of the z_ai vision models (like glm-4.5v) have similar issues with the stop parameter, the check should be updated to use the correct model name. Otherwise, this logic is incorrect and could lead to unexpected behavior.
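If any z_ai vision model does share the limitation, a hedged sketch of the corrected guard (the glm-4.5v name is an assumption to verify against Z.Ai's documentation):

    extra_model_kwargs = {}
    # hypothetical: glm-4.5v is assumed to share the stop-word issue; verify before shipping
    if stop and model != "glm-4.5v":
        extra_model_kwargs["stop"] = stop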

Comment on lines +83 to +90
embeddings = []
embedding_used_tokens = 0
for text in texts:
    response = client.embeddings.create(model=model, input=text)
    data = response.data[0]
    embeddings.append(data.embedding)
    embedding_used_tokens += response.usage.total_tokens
return ([list(map(float, e)) for e in embeddings], embedding_used_tokens)
Contributor

high

The embed_documents method iterates through the list of texts and makes a separate API call for each one. This is highly inefficient and will lead to poor performance, especially for a large number of documents. Most embedding APIs, including ZhipuAI's, support batching, where you can pass a list of texts in a single API call. Please refactor this to make a single batch request to improve performance.

Suggested change
    embeddings = []
    embedding_used_tokens = 0
    for text in texts:
        response = client.embeddings.create(model=model, input=text)
        data = response.data[0]
        embeddings.append(data.embedding)
        embedding_used_tokens += response.usage.total_tokens
    return ([list(map(float, e)) for e in embeddings], embedding_used_tokens)

    if not texts:
        return [], 0
    # Assuming the API supports batching. Please verify with zai-sdk documentation.
    response = client.embeddings.create(model=model, input=texts)
    embeddings = [data.embedding for data in response.data]
    embedding_used_tokens = response.usage.total_tokens
    return [list(map(float, e)) for e in embeddings], embedding_used_tokens
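If the API enforces a per-request batch limit, a chunked variant may be safer. This is a sketch assuming the same client.embeddings.create signature and a hypothetical cap of 64 texts per call:

    if not texts:
        return [], 0
    BATCH_SIZE = 64  # assumed per-request limit; check the zai-sdk documentation
    embeddings: list[list[float]] = []
    embedding_used_tokens = 0
    for i in range(0, len(texts), BATCH_SIZE):
        batch = texts[i : i + BATCH_SIZE]
        response = client.embeddings.create(model=model, input=batch)
        embeddings.extend([float(v) for v in data.embedding] for data in response.data)
        embedding_used_tokens += response.usage.total_tokens
    return embeddings, embedding_used_tokens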

# Configure
After installation, you need to get API keys from [Z.AI](https://z.ai/manage-apikey/apikey-list) and setup in Settings -> Model Provider.

![](_assets/z.ai.png) No newline at end of file
Contributor

medium

The image _assets/z.ai.png is referenced here, but the file itself has not been added to the repository. This will result in a broken image link in the documentation. Please add the image file to the _assets directory to ensure the documentation renders correctly.

model:
  enabled: true
  llm: true
  text_embedding: true
Contributor

medium

The permission for text_embedding is set to true, but the text embedding model type and its source files are commented out in provider/z_ai.yaml. This creates an inconsistency. If text embedding is not yet supported, this permission should be set to false to avoid confusion and potential issues with the platform's permission handling.
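A minimal sketch of the consistent permission block, assuming embedding stays disabled for now:

    model:
      enabled: true
      llm: true
      text_embedding: false  # re-enable once the embedding sources in provider/z_ai.yaml are uncommented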

Comment on lines +73 to +75
input: '0.00'
output: '0.00'
cached: '0.00'
Contributor

medium

The pricing for this model is set to 0.00 for input, output, and cached tokens. If this model is indeed free to use, this is correct. However, if this is a placeholder, it should be updated with the actual pricing information to ensure correct billing and cost estimation for users.

@@ -0,0 +1,3 @@
dify_plugin<0.6.0,>=0.5.0
zai-sdk>=0.0.3.1
pydantic==2.8.2
Contributor

medium

The pydantic library is pinned to a specific version (==2.8.2). This can lead to dependency conflicts with dify_plugin or other plugins in the ecosystem, which may require a different version of pydantic. It is generally better to specify a compatible range (e.g., pydantic>=2.8.2,<3.0.0) or remove the pin entirely if it's not strictly necessary, allowing the dependency resolver to find a compatible version.

pydantic>=2.8.2,<3.0.0

Contributor

Copilot AI left a comment

Pull Request Overview

This PR adds support for Z.AI as a new model provider, enabling integration with Z.AI's ChatGLM language models. The implementation includes configuration for multiple GLM model variants (glm-4.6, glm-4.5 series) with support for features like tool calling, vision capabilities, and web search.

Key Changes

  • Added Z.AI model provider plugin with LLM support for 8 GLM model variants
  • Implemented text embedding models (though currently commented out in configuration)
  • Configured model parameters including temperature, top_p, max_tokens, web search, and thinking mode (sketched below)
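For context, a hedged sketch of how such parameter rules might look in a model YAML (parameter names are taken from the list above; the exact schema should follow Dify's parameter-rule format):

    parameter_rules:
      - name: web_search
        type: boolean
        default: false
      - name: thinking
        type: boolean
        default: true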

Reviewed Changes

Copilot reviewed 23 out of 29 changed files in this pull request and generated 4 comments.

Show a summary per file

requirements.txt: Defines dependencies including dify_plugin, zai-sdk, and pydantic
provider/z_ai.yaml: Provider configuration with credential schema and model definitions
provider/z_ai.py: Provider validation logic using glm-4.5-flash for credential testing
models/_common.py: Common utility class for credential transformation and error mapping
models/llm/llm.py: LLM implementation with message handling, streaming, and tool support
models/llm/*.yaml: Configuration files for 8 GLM model variants with pricing and parameters
models/text_embedding/*: Text embedding implementation (currently disabled)
manifest.yaml: Plugin manifest with version 0.0.1 and Python 3.12 configuration
main.py: Plugin entry point
_assets/*: Icon files for the provider
README.md: Documentation for setup and API key configuration


    message_text = f"{human_prompt} {content}"
elif isinstance(message, AssistantPromptMessage):
    message_text = f"{ai_prompt} {content}"
elif isinstance(message, SystemPromptMessage | ToolPromptMessage):
Copilot AI Nov 20, 2025

[nitpick] The isinstance check on line 505 uses a union pattern SystemPromptMessage | ToolPromptMessage which is only supported in Python 3.10+. However, the manifest.yaml specifies Python 3.12, so this is acceptable. For broader compatibility, consider using isinstance(message, (SystemPromptMessage, ToolPromptMessage)) tuple syntax instead.

Suggested change
elif isinstance(message, SystemPromptMessage | ToolPromptMessage):
elif isinstance(message, (SystemPromptMessage, ToolPromptMessage)):

@crazywoola
Member

It seems the latest commit is not related to this PR

@ous50
Author

ous50 commented Nov 22, 2025

It seems the latest commit is not related to this PR

May I confirm which commits you were referring to? I rebased onto upstream while making new changes.

To avoid any further confusion, I have dropped that commit.

@alterxyz
Contributor

LGTM now; all commits and modified files are now under the z_ai folder.

@crazywoola
Member

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a new model provider for Z.AI, including support for several large language models and vision models, with initial code for text embedding. The implementation is well-organized, but I have identified several areas for improvement. Key issues include a broken image link in the documentation, an inconsistency in the manifest file regarding text embedding permissions, and an unimplemented error mapping that could lead to unhandled exceptions. Additionally, there are some minor typos, an overly broad exception catch, and a significant performance issue in the text embedding implementation where texts are not batched for API calls. Addressing these points will improve the robustness and quality of the new provider.

Comment on lines +25 to +40
def _invoke_error_mapping(self) -> dict[type[InvokeError], list[type[Exception]]]:
    """
    Map model invoke error to unified error
    The key is the error type thrown to the caller
    The value is the error type thrown by the model,
    which needs to be converted into a unified error type for the caller.

    :return: Invoke error mapping
    """
    return {
        InvokeConnectionError: [],
        InvokeServerUnavailableError: [],
        InvokeRateLimitError: [],
        InvokeAuthorizationError: [],
        InvokeBadRequestError: [],
    }
Contributor

high

The _invoke_error_mapping dictionary is empty. This is intended to map provider-specific exceptions to the framework's unified InvokeError types. Without this mapping, exceptions from the zai-sdk will not be handled gracefully by the framework, potentially causing crashes. Please populate this mapping with the relevant exceptions from the zai-sdk. For example, you should map zai.error.APIStatusError to the appropriate InvokeError subclasses.

Comment on lines +83 to +90
embeddings = []
embedding_used_tokens = 0
for text in texts:
    response = client.embeddings.create(model=model, input=text)
    data = response.data[0]
    embeddings.append(data.embedding)
    embedding_used_tokens += response.usage.total_tokens
return ([list(map(float, e)) for e in embeddings], embedding_used_tokens)
Contributor

high

The embed_documents method makes a separate API call for each text in a loop. This is inefficient and can lead to poor performance and rate limiting issues. The ZhipuAI embeddings API supports batching multiple texts in a single request. Please refactor this to send all texts in a single API call.

Suggested change
    embeddings = []
    embedding_used_tokens = 0
    for text in texts:
        response = client.embeddings.create(model=model, input=text)
        data = response.data[0]
        embeddings.append(data.embedding)
        embedding_used_tokens += response.usage.total_tokens
    return ([list(map(float, e)) for e in embeddings], embedding_used_tokens)

    if not texts:
        return [], 0
    response = client.embeddings.create(model=model, input=texts)
    embeddings = [data.embedding for data in response.data]
    embedding_used_tokens = response.usage.total_tokens
    return [list(map(float, e)) for e in embeddings], embedding_used_tokens

# Configure
After installation, you need to get API keys from [Z.AI](https://z.ai/manage-apikey/apikey-list) and setup in Settings -> Model Provider.

![](_assets/z.ai.png) No newline at end of file
Contributor

medium

The image _assets/z.ai.png referenced here is not included in the pull request, which will result in a broken image in the rendered README. Please add the image file to the _assets directory.

model:
  enabled: true
  llm: true
  text_embedding: true
Contributor

medium

The permission for text_embedding is enabled (true), but the text embedding models and provider sources are commented out in provider/z_ai.yaml. This is inconsistent. If text embedding is not yet supported, this permission should be set to false.

      text_embedding: false


from .._common import _CommonZhipuaiAI

viso_models = [
Contributor

medium

The variable viso_models appears to have a typo. It should probably be vision_models to better reflect its purpose of listing vision-capable models.

Suggested change
viso_models = [
vision_models = [

)
try:
    schema = json.loads(json_schema)
except Exception:
Contributor

medium

Catching a generic Exception is too broad and can hide other unexpected issues. It's better to catch the more specific json.JSONDecodeError that can be raised by json.loads.

Suggested change
except Exception:
except json.JSONDecodeError:
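To make the failure explicit rather than silent, a sketch that assumes InvokeBadRequestError is the appropriate unified error type in this codebase:

    try:
        schema = json.loads(json_schema)
    except json.JSONDecodeError as e:
        raise InvokeBadRequestError(f"json_schema is not valid JSON: {e}") from e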

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 23 out of 29 changed files in this pull request and generated 11 comments.



Comment on lines +21 to +22
except CredentialsValidateFailedError as ex:
    raise ex
Copilot AI Nov 24, 2025

The error handling re-raises exceptions without adding context. Lines 21-25 catch exceptions just to re-raise them, which adds no value. Either add meaningful error context/logging or remove this redundant try-except block.

Suggested change
    except CredentialsValidateFailedError as ex:
        raise ex
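If the intent is to keep the block, the other option mentioned above is to add context instead of a bare re-raise; a sketch of that variant:

    except CredentialsValidateFailedError:
        raise
    except Exception as ex:
        # assumed pattern: surface unexpected SDK errors as a credential-validation failure
        raise CredentialsValidateFailedError(str(ex)) from ex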

@@ -0,0 +1,3 @@
dify_plugin<0.6.0,>=0.5.0
zai-sdk>=0.0.3.1
pydantic==2.8.2
Copilot AI Nov 24, 2025

Pinning pydantic==2.8.2 to a specific version can cause dependency conflicts with other packages. Consider using a version range like pydantic>=2.8.2,<3.0.0 to allow for compatible updates while maintaining stability.

Suggested change
pydantic==2.8.2
pydantic>=2.8.2,<3.0.0

Comment on lines +25 to +40
def _invoke_error_mapping(self) -> dict[type[InvokeError], list[type[Exception]]]:
    """
    Map model invoke error to unified error
    The key is the error type thrown to the caller
    The value is the error type thrown by the model,
    which needs to be converted into a unified error type for the caller.

    :return: Invoke error mapping
    """
    return {
        InvokeConnectionError: [],
        InvokeServerUnavailableError: [],
        InvokeRateLimitError: [],
        InvokeAuthorizationError: [],
        InvokeBadRequestError: [],
    }
Copilot AI Nov 24, 2025

The _invoke_error_mapping property returns empty lists for all error types. This means no SDK-specific exceptions are being mapped to Dify's unified error types, which could result in poor error handling and unclear error messages to users when the API fails.
