Conversation
…plify parameter handling. Update GatedDeltaNetLayer to utilize the new function signature for improved clarity and performance.
Failed to load the 27B model, run with --tp 2.
Pull request overview
This PR targets higher TurboMind inference throughput for Qwen3.5 models by fixing linear-attention/MoE configuration details, introducing new persistent/batched CUDA kernels for GatedDeltaNet, and refactoring state/cache management to support the updated execution/scheduling flow.
Changes:
- Add GatedDeltaNet batched/persistent kernels (conv1d+SiLU, recurrent rule v2/v3, chunked prefill) and update call sites to use Tensor/Buffer refs.
- Move GatedDeltaNet persistent state from per-request storage to sequence-managed pooled state slots; add cache/state invalidation guards.
- Update Qwen3.5 export/model metadata and KV-cache layer indexing to account for mixed layer types.
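One of the new kernels fuses a depthwise causal conv1d with SiLU activation, and the PR adds a CPU reference checker for it. A minimal NumPy sketch of such a reference (shapes, layout, and names are illustrative assumptions, not the PR's actual API) could look like:

```python
import numpy as np

def conv1d_silu_ref(x, w):
    """CPU reference for a depthwise causal conv1d followed by SiLU.

    x: (seq_len, conv_dim) input activations
    w: (conv_dim, d_conv) per-channel filter taps, oldest tap first
    Illustrative only; the actual kernel layout may differ.
    """
    seq_len, conv_dim = x.shape
    d_conv = w.shape[1]
    # Left-pad with zeros so each output position sees only past inputs (causal).
    xp = np.vstack([np.zeros((d_conv - 1, conv_dim)), x])
    y = np.empty_like(x)
    for t in range(seq_len):
        window = xp[t : t + d_conv]          # (d_conv, conv_dim)
        y[t] = np.sum(window * w.T, axis=0)  # depthwise: one filter per channel
    # SiLU(y) = y * sigmoid(y) = y / (1 + exp(-y))
    return y / (1.0 + np.exp(-y))
```

A GPU kernel's output can then be compared element-wise against this reference, which is the role `bench_conv1d_silu.cc` plays for the CUDA implementation.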
Reviewed changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/turbomind/turbomind.cc | Sets linear attention state dtype and blocks prefix-caching when linear attention is present. |
| src/turbomind/models/llama/unified_attention_layer.h | Adds cache_layer_ids_ for remapping layer IDs used by KV cache logic. |
| src/turbomind/models/llama/unified_attention_layer.cc | Builds layer-id remap and uses it in attention decode/prefill params. |
| src/turbomind/models/llama/moe_ffn_layer.cc | Adjusts routing-path selection logic for MoE gating. |
| src/turbomind/models/llama/llama_params.h | Adds linear_state_dtype and helper HasLinearAttention. |
| src/turbomind/models/llama/gated_delta_net_kernels.h | Refactors kernel APIs to Tensor/Buffer reference-based interfaces and adds new batched launchers. |
| src/turbomind/models/llama/gated_delta_net_kernels.cu | Implements new v2/v3 recurrent kernels, chunked prefill kernel, persistent conv1d+SiLU, and refactors helper kernels. |
| src/turbomind/models/llama/bench_gated_delta_net.cc | Adds benchmark/correctness comparison utility for Gated Delta Rule kernels. |
| src/turbomind/models/llama/bench_conv1d_silu.cc | Adds benchmark plus CPU reference correctness checker for conv1d+SiLU kernel. |
| src/turbomind/models/llama/SequenceManager.h | Adds sequence-owned linear attention state fields and pooled-slot bookkeeping. |
| src/turbomind/models/llama/SequenceManager.cc | Implements pooled slot allocation, cache/state invalidation, and adjusts cache-layer accounting for linear layers. |
| src/turbomind/models/llama/GatedDeltaNetWeight.h | Updates conv1d weight layout comment. |
| src/turbomind/models/llama/GatedDeltaNetWeight.cc | Builds fused projection weight and transposes conv1d weights to kernel-preferred layout. |
| src/turbomind/models/llama/GatedDeltaNetLayer.h | Extends per-phase data to include offsets/state ptr arrays and adds dual-stream execution resources. |
| src/turbomind/models/llama/GatedDeltaNetLayer.cc | Switches to pooled sequence states and launches new batched/persistent kernels with mixed decode/prefill scheduling. |
| src/turbomind/models/CMakeLists.txt | Adds CUDA compile flags and registers new benchmark executables under BUILD_TEST. |
| src/turbomind/kernels/gemm/test/testbed_v3.h | Updates LlamaLinear construction usage in tests. |
| src/turbomind/kernels/gemm/test/test_utils.cu | Extends FastCompare dispatch/instantiations to support float. |
| src/turbomind/kernels/attention/test_attention.cu | Adds is_share_kv() to satisfy block layout interface expectations. |
| src/turbomind/kernels/attention/CMakeLists.txt | Fixes test target linkage to depend on models. |
| src/turbomind/engine/request.h | Removes per-request linear attention state fields from RequestCache. |
| src/turbomind/engine/engine.cc | Wires SequenceManager ctor changes, adds stateless guard for linear attention, and integrates pooled state slot acquisition/invalidation. |
| lmdeploy/turbomind/deploy/source_model/qwen.py | Adjusts exported Qwen3.5 MoE routing metadata and exports linear attention parameters. |
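Because linear-attention (GatedDeltaNet) layers keep recurrent state instead of paged KV blocks, the full-attention layers need compact KV-cache indices. A hedged sketch of what a `cache_layer_ids` remap could look like (the actual construction in `unified_attention_layer.cc` may differ; names and the `-1` sentinel are assumptions):

```python
def build_cache_layer_ids(layer_types):
    """Map each model layer index to a compact KV-cache layer slot.

    Full-attention layers get consecutive cache indices; linear-attention
    layers get -1 because they hold recurrent state, not KV blocks.
    Illustrative only.
    """
    cache_ids, next_slot = [], 0
    for kind in layer_types:
        if kind == "full_attention":
            cache_ids.append(next_slot)
            next_slot += 1
        else:
            cache_ids.append(-1)
    return cache_ids
```

With mixed layer types, the KV cache is then sized by the number of full-attention layers only, rather than the total layer count.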
```diff
@@ -367,7 +367,7 @@ def model_info(self):
     info['inter_size'] = shared_expert_size
     info['moe_shared_gate'] = True
     # Qwen3.5 uses sigmoid MoE routing (not softmax)
```
The inline comment says Qwen3.5 uses sigmoid MoE routing, but the code sets info['scoring_func'] = 'softmax'. Please either update the comment to match the implementation, or switch the value back to 'sigmoid' if that is the intended router behavior (and ensure the TurboMind MoE gate path supports it).
Suggested change:
```diff
-# Qwen3.5 uses sigmoid MoE routing (not softmax)
+# Qwen3.5 uses softmax MoE routing
```
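For context on why the comment/value mismatch matters: softmax normalizes scores across all experts, while sigmoid scores each expert independently (with the selected top-k typically renormalized afterward), so the two produce different routing weights. A toy sketch, not TurboMind's gate implementation:

```python
import numpy as np

def route(logits, top_k, scoring="softmax"):
    """Toy top-k MoE router; illustrative, not the TurboMind MoE gate path."""
    if scoring == "softmax":
        e = np.exp(logits - logits.max())
        scores = e / e.sum()                     # normalized across all experts
    elif scoring == "sigmoid":
        scores = 1.0 / (1.0 + np.exp(-logits))   # per-expert, independent
    else:
        raise ValueError(scoring)
    experts = np.argsort(scores)[::-1][:top_k]   # pick top-k experts
    weights = scores[experts]
    weights = weights / weights.sum()            # renormalize over selected experts
    return experts, weights
```

Either scoring function can be valid; the point of the review comment is that the exported metadata, the code comment, and the engine's gate path must all agree.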
```diff
@@ -448,7 +448,7 @@ def model_info(self):
     info['inter_size'] = cfg.get('shared_expert_intermediate_size', 0)
     info['moe_shared_gate'] = True
     # Qwen3.5 uses sigmoid MoE routing (not softmax)
```
The inline comment says Qwen3.5 uses sigmoid MoE routing, but the code sets info['scoring_func'] = 'softmax'. Please either update the comment to match the implementation, or switch the value back to 'sigmoid' if that is the intended router behavior (and ensure the TurboMind MoE gate path supports it).
Suggested change:
```diff
-# Qwen3.5 uses sigmoid MoE routing (not softmax)
+# Qwen3.5 MoE routing uses softmax scoring
```
```cpp
// Gated DeltaNet linear attention persistent states (e.g. Qwen3.5-MoE).
// Allocated on first request, preserved across requests for the same session,
// and freed automatically when the sequence is erased from the SequenceManager.
// conv_states: (num_linear_layers, conv_dim, d_conv) — per-channel rolling conv history
```
The comment describing conv_states shape doesn't match the actual allocation in SequenceManager (pooled_conv_states_ is sized as [max_batch_size, num_linear_layers, d_conv, conv_dim], so per-sequence it is [num_linear_layers, d_conv, conv_dim]). Please update the comment to reflect the correct dimension order to avoid misuse by future callers.
Suggested change:
```diff
-// conv_states: (num_linear_layers, conv_dim, d_conv) — per-channel rolling conv history
+// conv_states: (num_linear_layers, d_conv, conv_dim) — per-channel rolling conv history
```
Fixed in c83d2d7.
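The pooled layout under discussion can be sketched as follows — a minimal mock of slot bookkeeping where the pool shape follows the review comment and everything else (class and method names, zero-fill invalidation) is an assumption:

```python
import numpy as np

class PooledLinearStates:
    """Sketch of pooled GatedDeltaNet conv-state slots, sized once for max_batch_size.

    The per-sequence view is pool[slot] with shape
    (num_linear_layers, d_conv, conv_dim), matching the corrected comment.
    Illustrative only; the real SequenceManager bookkeeping differs.
    """
    def __init__(self, max_batch_size, num_linear_layers, d_conv, conv_dim):
        self.conv_states = np.zeros(
            (max_batch_size, num_linear_layers, d_conv, conv_dim), np.float32)
        self.free_slots = list(range(max_batch_size))

    def acquire(self):
        slot = self.free_slots.pop()
        self.conv_states[slot] = 0.0  # invalidate stale state from a prior sequence
        return slot

    def release(self, slot):
        self.free_slots.append(slot)
```

Pooling the slots this way means states survive across requests of the same session, but a slot handed to a new sequence must be invalidated first, which is what the PR's cache/state invalidation guards address.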
Cannot load Qwen3.5-27B-AWQ on a single V100-SXM2-32G card; it always runs into GPU memory overflow, even with
Try reducing --max-batch-size; currently, linear states for the full max batch size are allocated at once. --log-level INFO will print the memory usage of the linear states and the KV cache.
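Since those states are preallocated for the whole batch, their footprint scales linearly with --max-batch-size. A back-of-the-envelope estimate (all shape parameters below are hypothetical placeholders, not Qwen3.5's actual values):

```python
def linear_state_bytes(max_batch, n_linear_layers, conv_dim, d_conv,
                       n_heads, head_k_dim, head_v_dim, dtype_bytes=4):
    """Rough size of preallocated linear-attention state; illustrative only."""
    conv = max_batch * n_linear_layers * conv_dim * d_conv                     # conv history
    recurrent = max_batch * n_linear_layers * n_heads * head_k_dim * head_v_dim  # delta-rule state
    return (conv + recurrent) * dtype_bytes

# Hypothetical shapes: halving max_batch halves the preallocated footprint.
full = linear_state_bytes(256, 24, 4096, 4, 16, 128, 128)
half = linear_state_bytes(128, 24, 4096, 4, 16, 128, 128)
```

This is why lowering --max-batch-size directly frees GPU memory on a 32 GB card.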
Fixed; 35B-A3B, 122B-A10B-AWQ, and 27B now run successfully on V100. Performance is also much improved compared with the current master.
You may try the following command:
Confirmed that it can run, but there's no speed improvement compared to the current main branch. Could this be an issue specific to the Windows platform? Log
Try the MoE model; in my testing, the dense model showed little improvement.
Indeed, the MoE model achieves a significant improvement.
This PR improves TurboMind inference performance for Qwen3.5 models with recurrent/linear attention layers (GatedDeltaNet).

Bug Fixes

Kernel Optimizations
- Refactor `invokeRMSNormGated` to use Tensor references

Scheduling & State Management
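As background on what the v2/v3 recurrent kernels compute at decode time, one common formulation of the gated delta rule's per-token update is sketched below in NumPy. This is a hedged illustration of the general technique; the PR's kernels may use a different but mathematically equivalent form, and all names and gate conventions here are assumptions:

```python
import numpy as np

def gated_delta_step(S, q, k, v, alpha, beta):
    """One decode step of a gated delta rule (illustrative formulation).

    S:     (d_v, d_k) recurrent state mapping keys to values
    alpha: decay gate in (0, 1]
    beta:  write strength in (0, 1]
    Update: S <- alpha * (S - beta * (S @ k) k^T) + beta * v k^T
    Output: o = S @ q
    """
    S = alpha * (S - beta * np.outer(S @ k, k)) + beta * np.outer(v, k)
    return S, S @ q
```

Because each token only touches the fixed-size state `S` rather than a growing KV cache, decode cost per token is constant in sequence length — which is what makes a persistent, batched kernel for this step attractive.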