【TI-Consisent】Added Metric logits_stats to the ZMQ branch by liuruyan · Pull Request #6979 · PaddlePaddle/FastDeploy

liuruyan · 2026-03-23T13:31:19Z

Motivation

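Background: To enrich the metrics used to check training-inference consistency and to support long-term CI/CE monitoring, this PR adds logits_stats (min/max/mean/std), computed on the logits after sampling, to help verify determinism and stability.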

Modifications

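Data structures and interfaces: Because logprob and logits_stat are both important output metrics and are both computed from the logits, the current (temporary) implementation stores logits_stat in the LogprobsTensors data structure. The interfaces along the logprob propagation path are upgraded so that logits_stats is passed through alongside logprobs.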

class LogprobsTensors(NamedTuple):
    """ """

    # [num_reqs, max_num_logprobs + 1]
    logprob_token_ids: paddle.Tensor
    # [num_reqs, max_num_logprobs + 1]
    logprobs: paddle.Tensor
    # [num_reqs]
    selected_token_ranks: paddle.Tensor
    # Logits statistics for each sequence (optional)
    logits_min: Optional[paddle.Tensor] = None  # [num_reqs]
    logits_max: Optional[paddle.Tensor] = None  # [num_reqs]
    logits_mean: Optional[paddle.Tensor] = None  # [num_reqs]
    logits_std: Optional[paddle.Tensor] = None  # [num_reqs]
    ...
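
The description does not show how the four statistics are computed. As a rough illustration only, the sketch below derives one min/max/mean/std value per request from plain Python lists using the standard library. The helper name `logits_stats_per_request`, the choice of population standard deviation, and the use of lists instead of `paddle.Tensor` are all assumptions, not the PR's implementation.

```python
import statistics
from typing import Dict, List


def logits_stats_per_request(logits: List[List[float]]) -> Dict[str, List[float]]:
    """Return one min/max/mean/std value per request (one row per request)."""
    return {
        "logits_min": [min(row) for row in logits],
        "logits_max": [max(row) for row in logits],
        "logits_mean": [statistics.mean(row) for row in logits],
        # Assumption: population standard deviation; the PR does not say
        # whether the population or sample form is used.
        "logits_std": [statistics.pstdev(row) for row in logits],
    }


# Two requests, three vocabulary entries each (toy values).
stats = logits_stats_per_request([[1.0, 2.0, 3.0], [0.0, 0.0, 4.0]])
```

Each list in the result has length `num_reqs`, matching the `# [num_reqs]` shape comments on the optional fields above.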

FLAG: Adds a model_config option at the same level as enable_logprob: self.compute_logits_stats = False. It can be set with --compute-logits-stats when starting the server.

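Note: Because the returned fields changed, some unit tests no longer passed, so the unit test files were modified to strip the new field (logits_stats) from the returned values.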

Usage or Command

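This feature currently supports only ZMQ; both streaming and non-streaming requests return correctly in testing.
When starting the FD service, enable both --compute-logits-stats and --enable-logprob.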

export FD_USE_GET_SAVE_OUTPUT_V1=1 
python -m fastdeploy.entrypoints.openai.api_server \
       --enable-logprob \
       --compute-logits-stats \
       ...  # more setting
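
The pairing of the two flags can be sketched with `argparse`. This is a minimal stand-in, not FastDeploy's argument parser: the `parse_flags` helper and the guard that rejects `--compute-logits-stats` without `--enable-logprob` are hypothetical, since the PR only states that both flags should be enabled together.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument("--enable-logprob", action="store_true")
    parser.add_argument("--compute-logits-stats", action="store_true")
    return parser


def parse_flags(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    # Hypothetical guard: the PR says both flags are needed, but does not say
    # whether the server itself enforces this.
    if args.compute_logits_stats and not args.enable_logprob:
        parser.error("--compute-logits-stats requires --enable-logprob")
    return args


args = parse_flags(["--enable-logprob", "--compute-logits-stats"])
```

Failing fast at startup is one possible design; the alternative is to silently skip the statistics when logprobs are disabled.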

When sending a request, specify logprobs=True and top_logprobs=0.

response = client.chat.completions.create(
    model="null",
    messages=[
        {"role": "system", "content": "I'm a helpful AI assistant."},
        {"role": "user", "content": "把李白的静夜思改写为现代诗"},
    ],
    stream=True,  # False
    max_tokens=100,
    logprobs=True,
    top_logprobs=0
)
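
The description does not show the response payload. According to the review summary, the statistics are exposed on `LogProbEntry.logits_stats`, so a client could collect them roughly as sketched below. This assumes a non-streaming response already converted to a plain dict; the helper name `collect_logits_stats`, the key names inside `logits_stats` (`min`, `max`, `mean`, `std`), and the example payload are illustrative assumptions.

```python
from typing import Any, Dict, List


def collect_logits_stats(response: Dict[str, Any]) -> List[Dict[str, float]]:
    """Collect the logits_stats dict attached to each token entry, if present."""
    collected = []
    for choice in response.get("choices", []):
        logprobs = choice.get("logprobs") or {}
        for entry in logprobs.get("content") or []:
            stats = entry.get("logits_stats")
            if stats is not None:
                collected.append(stats)
    return collected


# Made-up payload for illustration only; the values are not real model output.
example = {
    "choices": [
        {
            "logprobs": {
                "content": [
                    {
                        "token": "a",
                        "logprob": -0.1,
                        "logits_stats": {"min": -5.0, "max": 9.0, "mean": 0.2, "std": 1.5},
                    },
                    {"token": "b", "logprob": -0.3},
                ]
            }
        }
    ]
}
found = collect_logits_stats(example)
```

Entries without `logits_stats` are skipped, so the same helper works whether or not the server was started with `--compute-logits-stats`.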

Accuracy Tests

This PR does not involve accuracy changes.

Checklist

Add at least a tag in the PR title.
- Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
- You can add new tags based on the PR content, but the semantics must be clear.
Format your code, run pre-commit before commit.
Add unit tests. Please write the reason in this PR if no unit tests.
Provide accuracy results.
If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

paddle-bot · 2026-03-23T13:31:29Z

Thanks for your contribution!

codecov-commenter · 2026-03-24T12:26:21Z

Codecov Report

❌ Patch coverage is 67.27273% with 18 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@6f5aa88). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
fastdeploy/entrypoints/openai/serving_chat.py	51.72%	11 Missing and 3 partials ⚠️
fastdeploy/output/token_processor.py	42.85%	4 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             develop    #6979   +/-   ##
==========================================
  Coverage           ?   73.84%           
==========================================
  Files              ?      399           
  Lines              ?    56093           
  Branches           ?     8853           
==========================================
  Hits               ?    41421           
  Misses             ?    11743           
  Partials           ?     2929

Flag	Coverage Δ
GPU	`73.84% <67.27%> (?)`

Copilot

Pull request overview

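This PR adds logits_stats (min/max/mean/std) metrics to the logprobs output path on the ZMQ branch, for monitoring training-inference consistency and stability. A new switch, compute_logits_stats/--compute-logits-stats, controls whether they are output.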

Changes:

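Adds the compute_logits_stats config option and CLI argument, and passes it through the engine→worker launch arguments.
Extends LogprobsTensors/LogprobsLists to carry logits statistics, and outputs them in OpenAI chat logprobs responses via LogProbEntry.logits_stats.
Updates the related unit/E2E/CE tests to accommodate the new field (some cases strip logits_stats to keep assertions stable).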

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 6 comments.

Show a summary per file

File	Description
tests/worker/test_gpu_prompt_logprobs.py	Adapts to the changed return structure of `gather_logprobs` (tuple changed to NamedTuple).
tests/output/test_token_processor.py	Adds the `compute_logits_stats` field to the test config.
tests/output/test_process_batch_output.py	Adapts the expected field length after `top_logprobs` was extended.
tests/e2e/4cards_cases/test_ernie_21b_tp1_dp4_mtp.py	E2E test recursively strips `logits_stats` to keep comparisons stable.
tests/e2e/4cards_cases/test_ernie_21b_tp1_dp4.py	Same as above; adds the stripping helper and applies it before assertions.
tests/ce/server/test_logprobs.py	CE cases now strip `logits_stats` to stay compatible with the new return field.
fastdeploy/worker/xpu_model_runner.py	Adapts field access to the new return structure of `gather_logprobs`.
fastdeploy/worker/metax_model_runner.py	Same as above.
fastdeploy/worker/gpu_model_runner.py	Same as above.
fastdeploy/worker/worker_process.py	Adds the `--compute_logits_stats` argument on the worker side.
fastdeploy/worker/output.py	Extends the Logprobs* structures to carry logits statistics.
fastdeploy/output/token_processor.py	Extracts and fills `outputs.logits_stats` in the ZMQ output processing path.
fastdeploy/entrypoints/openai/serving_completion.py	Adapts prompt_logprobs unpacking to the new fields.
fastdeploy/entrypoints/openai/serving_chat.py	Chat logprobs construction now supports logits_stats and passes it through to the protocol layer.
fastdeploy/entrypoints/openai/protocol.py	Adds a `logits_stats` field to the OpenAI protocol structure `LogProbEntry`.
fastdeploy/entrypoints/llm.py	Adapts prompt_logprobs unpacking to the new fields.
fastdeploy/engine/request.py	Adds `logits_stats` to CompletionOutput and includes it in serialization/printing.
fastdeploy/engine/engine.py	Passes the `compute_logits_stats` switch through when launching workers.
fastdeploy/engine/args_utils.py	Adds the `--compute-logits-stats` argument on the engine side.
fastdeploy/config.py	Adds a `compute_logits_stats` field to ModelConfig.

Copilot · 2026-03-25T06:12:56Z

fastdeploy/entrypoints/openai/serving_chat.py

    def _build_logprobs_response(
        self,
        request_logprobs: bool,
        response_logprobs: Optional[LogprobsLists],
        request_top_logprobs: int,
        request_decode_flag: bool,
+        logits_stats: Optional[dict[str, float]] = None,
    ) -> Optional[LogProbs]:


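The type annotation on _build_logprobs_response uses dict[str, float], but this file does not enable from __future__ import annotations. On Python 3.7/3.8 (setup.py declares support for >=3.7) this raises TypeError: 'type' object is not subscriptable at import time. Consider changing it to Optional[Dict[str, float]] (importing Dict from typing), or adding from __future__ import annotations at the top of the file and then using built-in generics consistently.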
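
The two remedies can be sketched side by side. The function names below are hypothetical stand-ins, not the PR's code. Built-in generics such as `dict[str, float]` are only subscriptable at runtime from Python 3.9 onward, which is why an eagerly evaluated annotation fails on 3.7/3.8.

```python
from typing import Dict, Optional


# Option 1: typing generics are evaluated eagerly but are valid on Python 3.7+.
def with_typing(logits_stats: Optional[Dict[str, float]] = None) -> Optional[Dict[str, float]]:
    return logits_stats


# Option 2: a quoted annotation is stored as a string and is never evaluated at
# import time. `from __future__ import annotations` at the top of a file gives
# every annotation in that file the same treatment.
def with_string(logits_stats: "Optional[dict[str, float]]" = None) -> "Optional[dict[str, float]]":
    return logits_stats


stored = with_string.__annotations__["logits_stats"]
```

Option 2 keeps the modern spelling but relies on nothing evaluating the annotation at runtime on old interpreters; Option 1 is the more conservative choice when annotations are introspected.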

Copilot · 2026-03-25T06:12:56Z

fastdeploy/worker/output.py

    logprobs: list[list[float]]
    # [num_reqs]
    sampled_token_ranks: list[int]
+    # Logits statistics for each sequence (optional)
+    logits_min: Optional[list[float]] = None  # [num_reqs]
+    logits_max: Optional[list[float]] = None  # [num_reqs]
+    logits_mean: Optional[list[float]] = None  # [num_reqs]
+    logits_std: Optional[list[float]] = None  # [num_reqs]


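This file does not enable from __future__ import annotations, but the newly added built-in generic annotations such as Optional[list[float]] / list[list[int]] raise an exception at import time on Python 3.7/3.8, while setup.py still declares python_requires=">=3.7". Suggestions: 1) add from __future__ import annotations at the top of the file; or 2) change the new annotations to typing forms such as Optional[List[float]] and add the import, to stay compatible with the declared Python versions.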

Copilot · 2026-03-25T06:12:56Z

fastdeploy/worker/output.py

        return LogprobsLists(
            [row[start:end] for row in self.logprob_token_ids],
            [row[start:end] for row in self.logprobs],
            self.sampled_token_ranks,  # unchanged
+            # [row[start:end] for row in self.logits_min],
+            # [row[start:end] for row in self.logits_max],
+            # [row[start:end] for row in self.logits_mean],
+            # [row[start:end] for row in self.logits_std],
+            self.logits_min,  # unchanged
+            self.logits_max,  # unchanged
+            self.logits_mean,  # unchanged
+            self.logits_std,  # unchanged


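slice_columns still contains the commented-out logits_* slicing code (lines 62-65), while the current implementation passes these fields through "unchanged". This makes it easy to misread which fields need to be sliced by column. Consider deleting the commented-out code and stating the dimension semantics of logits_* in the docstring/comments (whether they are aligned per position/token or per request).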
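
The distinction can be made concrete with a simplified stand-in for the structure. The class name `LogprobsListsSketch` is hypothetical and only one of the four statistics fields is kept; the sketch assumes the `# [num_reqs]` shape comments are accurate, meaning the statistics hold one value per request and have no column axis to slice.

```python
from typing import List, NamedTuple, Optional


class LogprobsListsSketch(NamedTuple):
    # [num_reqs, num_columns]
    logprob_token_ids: List[List[int]]
    # [num_reqs, num_columns]
    logprobs: List[List[float]]
    # [num_reqs]
    sampled_token_ranks: List[int]
    # [num_reqs]: one value per request, so column slicing does not apply.
    logits_min: Optional[List[float]] = None

    def slice_columns(self, start: int, end: int) -> "LogprobsListsSketch":
        """Slice the per-column fields; pass per-request fields through as-is."""
        return LogprobsListsSketch(
            [row[start:end] for row in self.logprob_token_ids],
            [row[start:end] for row in self.logprobs],
            self.sampled_token_ranks,
            self.logits_min,
        )


full = LogprobsListsSketch(
    [[1, 2, 3], [4, 5, 6]],
    [[-0.1, -0.2, -0.3], [-0.4, -0.5, -0.6]],
    [0, 1],
    [-7.0, -8.0],
)
sliced = full.slice_columns(0, 2)
```

If the statistics were instead aligned per token position, they would need the same `row[start:end]` treatment as the first two fields, which is why the comment asks for the semantics to be documented.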

Copilot · 2026-03-25T06:12:56Z

fastdeploy/output/token_processor.py

+                                assert (
+                                    logprobs_list.logits_min is not None
+                                ), "logits_min is None when compute_logits_stats is enabled"
+                                assert (
+                                    logprobs_list.logits_max is not None
+                                ), "logits_max is None when compute_logits_stats is enabled"
+                                assert (
+                                    logprobs_list.logits_mean is not None
+                                ), "logits_mean is None when compute_logits_stats is enabled"
+                                assert (
+                                    logprobs_list.logits_std is not None
+                                ), "logits_std is None when compute_logits_stats is enabled"


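Here assert ... is not None is used to guarantee that the logits_* fields exist:
1) under python -O, assert statements are skipped, which can lead to later failures such as float(None);
2) when an assert fires, it is swallowed by the outer except and only logged as a warning, so logits_stats ends up silently missing, which contradicts what --compute-logits-stats promises.
Consider switching to explicit condition checks: if a field is missing, log a clearer error and decide whether to raise directly or degrade by disabling the logits_stats output.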

Suggested change

-                                assert (
-                                    logprobs_list.logits_min is not None
-                                ), "logits_min is None when compute_logits_stats is enabled"
-                                assert (
-                                    logprobs_list.logits_max is not None
-                                ), "logits_max is None when compute_logits_stats is enabled"
-                                assert (
-                                    logprobs_list.logits_mean is not None
-                                ), "logits_mean is None when compute_logits_stats is enabled"
-                                assert (
-                                    logprobs_list.logits_std is not None
-                                ), "logits_std is None when compute_logits_stats is enabled"
+                                missing_fields = []
+                                if logprobs_list.logits_min is None:
+                                    missing_fields.append("logits_min")
+                                if logprobs_list.logits_max is None:
+                                    missing_fields.append("logits_max")
+                                if logprobs_list.logits_mean is None:
+                                    missing_fields.append("logits_mean")
+                                if logprobs_list.logits_std is None:
+                                    missing_fields.append("logits_std")
+                                if missing_fields:
+                                    # When compute_logits_stats is enabled, all logits_* fields must be present
+                                    raise ValueError(
+                                        "Missing logits stats fields when compute_logits_stats is enabled: "
+                                        + ", ".join(missing_fields)
+                                    )

Copilot · 2026-03-25T06:12:57Z

tests/e2e/4cards_cases/test_ernie_21b_tp1_dp4.py

+def _strip_logits_stats(obj):
+    """Recursively remove 'logits_stats' keys from logprobs response."""
+    if isinstance(obj, dict):
+        obj.pop("logits_stats", None)
+        for v in obj.values():
+            _strip_logits_stats(v)
+    elif isinstance(obj, list):
+        for item in obj:
+            _strip_logits_stats(item)
+


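_strip_logits_stats is duplicated with an identical implementation across several test files (this file and the other e2e/ce cases each add their own copy). Consider moving it into a shared test utility module (for example tests/e2e/utils or tests/ce/server/core) and reusing it, to reduce the number of places that must be updated when fields change later.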
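
A shared helper along these lines could live in one module and be imported by each test. The sketch below follows the implementation shown above, with two assumed changes: the function is given a public name and returns its argument for convenience. The response fragment and baseline are made up for illustration.

```python
def strip_logits_stats(obj):
    """Recursively remove 'logits_stats' keys in place and return obj."""
    if isinstance(obj, dict):
        obj.pop("logits_stats", None)
        for value in obj.values():
            strip_logits_stats(value)
    elif isinstance(obj, list):
        for item in obj:
            strip_logits_stats(item)
    return obj


# Made-up response fragment and stored baseline, for illustration only.
response = {"logprobs": {"content": [{"token": "a", "logits_stats": {"min": -1.0}}]}}
baseline = {"logprobs": {"content": [{"token": "a"}]}}
cleaned = strip_logits_stats(response)
```

A test would then compare `cleaned` against its stored baseline, so that baselines recorded before the new field existed keep passing.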

Copilot · 2026-03-25T06:12:57Z

fastdeploy/engine/args_utils.py

            default=EngineArgs.enable_logprob,
            help="Enable output of token-level log probabilities.",
        )
+        model_group.add_argument(
+            "--compute-logits-stats",
+            action="store_true",
+            default=EngineArgs.compute_logits_stats,
+            help="Enable per-token logits statistics (min/max/mean/std) output.",
+        )


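The PR title is currently "【TI-Consisent】...", which does not follow the [CLASS]Title format required by the repository (the template lists tags such as [Feature] / [BugFix]). Consider renaming it to something like [Feature] Add logits_stats metric for ZMQ logprobs, and fixing the spelling of Consisent so that later searches and automated workflows can recognize it.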

ckl117

Unit tests need to be added so that the coverage check passes.

ckl117 · 2026-03-25T07:52:34Z

fastdeploy/engine/engine.py

            "use_internode_ll_two_stage": self.cfg.parallel_config.use_internode_ll_two_stage,
            "disable_sequence_parallel_moe": self.cfg.parallel_config.disable_sequence_parallel_moe,
            "enable_logprob": self.cfg.model_config.enable_logprob,
+            "compute_logits_stats": self.cfg.model_config.compute_logits_stats,


This parameter also needs to be added in common_engine.py.

ckl117 · 2026-03-25T07:54:46Z

fastdeploy/worker/output.py

+    logits_min: Optional[list[float]] = None  # [num_reqs]
+    logits_max: Optional[list[float]] = None  # [num_reqs]
+    logits_mean: Optional[list[float]] = None  # [num_reqs]
+    logits_std: Optional[list[float]] = None  # [num_reqs]


Doesn't the PR include the logic that computes these values?

Added Metric logits_stats to the ZMQ branch to ensure training-infere…

86c539b

…nce consistency

liuruyan temporarily deployed to Metax_ci March 23, 2026 13:31 — with GitHub Actions Inactive

paddle-bot bot added the contributor External developers label Mar 23, 2026

fix ci

44d4367

liuruyan had a problem deploying to Metax_ci March 24, 2026 05:26 — with GitHub Actions Error

Merge branch 'develop' of https://github.com/PaddlePaddle/FastDeploy …

e045231

…into logit_stat_dev

liuruyan temporarily deployed to Metax_ci March 24, 2026 05:27 — with GitHub Actions Inactive

fix ci

a8259f6

liuruyan temporarily deployed to Metax_ci March 24, 2026 10:04 — with GitHub Actions Inactive

liuruyan added 2 commits March 24, 2026 21:13

Merge branch 'develop' of https://github.com/PaddlePaddle/FastDeploy …

9abc1e1

…into logit_stat_dev

fix ci ut

4488f97

liuruyan temporarily deployed to Metax_ci March 24, 2026 13:13 — with GitHub Actions Inactive

fix ci

67e0aa1

liuruyan temporarily deployed to Metax_ci March 25, 2026 04:56 — with GitHub Actions Inactive

juncaipeng requested a review from Copilot March 25, 2026 06:07

Copilot started reviewing on behalf of juncaipeng March 25, 2026 06:07

Copilot AI reviewed Mar 25, 2026

ckl117 reviewed Mar 25, 2026

liuruyan wants to merge 7 commits into PaddlePaddle:develop from liuruyan:logit_stat_dev

4 participants
