[BugFix]fix handle 4 return values from noaux_tc_redundant op #6384

Open
mattheliu wants to merge 3 commits into PaddlePaddle:develop from mattheliu:fix/noaux_tc_redundant_return_values

mattheliu commented Feb 6, 2026

Motivation

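Fix test failures caused by the inconsistent number of return values from the noaux_tc_redundant operator.

Root cause

  • When PR [Feature] Support noaux for eplb #5143 added noaux_tc_redundant, PD_BUILD_STATIC_OP declared 4 outputs, but the CUDA function returned only 3 values
  • PR [Fix] Fix noaux ep test #5161, intended to "fix the test", changed the Python code from expecting 4 return values to expecting 3, which masked this bug
  • A change in the Paddle framework version now makes the behavior inconsistent, so the CI tests fail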

Modifications

  1. custom_ops/gpu_ops/noaux_tc_redundant.cu: change the function to return 4 values instead of 3, matching the OP definition

    // Before
    return {scores, topk_values, topk_indices};
    
    // After
    return {scores, topk_values, topk_indices, tokens_per_expert_stats_list};

  2. fastdeploy/model_executor/layers/moe/moe.py: restore the original design from PR [Feature] Support noaux for eplb #5143 and unpack 4 return values

    scores, topk_values, topk_idx, _ = noaux_tc_redundant(...)
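
The Python-side change can be illustrated with a pure-Python stand-in. This is a minimal sketch, not the real op: `fake_noaux_tc_redundant`, its list-based arguments, and its top-2 selection are hypothetical. Only the four-value return shape and the fact that the fourth value is the same buffer as the input stats list mirror what this PR describes.

```python
def fake_noaux_tc_redundant(scores, tokens_per_expert_stats_list):
    """Hypothetical stand-in for the custom op; returns 4 values like the fixed function."""
    # Pick the two highest scores (assumed k=2, purely for illustration).
    topk_indices = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:2]
    topk_values = [scores[i] for i in topk_indices]
    # Update the stats buffer in place, as the real op is described as doing.
    for idx in topk_indices:
        tokens_per_expert_stats_list[idx] += 1
    return scores, topk_values, topk_indices, tokens_per_expert_stats_list


stats = [0, 0, 0, 0]
result = fake_noaux_tc_redundant([0.1, 0.7, 0.2, 0.9], stats)

# Unpack all four outputs; the last one is discarded with `_` because it
# aliases the `stats` list that the caller already holds.
scores, topk_values, topk_idx, _ = result
```

Because the fourth output is the caller's own buffer, discarding it loses nothing: the updated counts remain available through `stats`.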

Usage or Command

# Run the tests
pytest tests/operators/test_noaux_tc_redundant.py -v

Accuracy Tests

Verified with the existing test tests/operators/test_noaux_tc_redundant.py:

  • The test cases cover the glm45-air (128 experts) and deepseek (256 experts) configurations
  • They check that topk_values and topk_ids match the native implementation
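
The kind of check described above can be sketched as follows. This is an illustration under stated assumptions, not the contents of the actual test file: `native_topk` is a plain top-k over a flat score list standing in for the native reference (the real routing logic is not shown in this PR), and `assert_topk_matches` and the tolerance are hypothetical.

```python
def native_topk(scores, k):
    """Reference top-k: returns (values, ids) ordered by descending score."""
    ids = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return [scores[i] for i in ids], ids


def assert_topk_matches(values, ids, ref_values, ref_ids, tol=1e-6):
    """Require identical ids and values equal within an absolute tolerance."""
    assert ids == ref_ids, f"ids differ: {ids} vs {ref_ids}"
    for got, want in zip(values, ref_values):
        assert abs(got - want) <= tol, f"values differ: {got} vs {want}"


scores = [0.3, 0.9, 0.1, 0.5]
ref_values, ref_ids = native_topk(scores, k=2)

# Pretend these came from the custom op (hypothetical numbers with tiny float noise).
op_values, op_ids = [0.9000001, 0.5], [1, 3]
assert_topk_matches(op_values, op_ids, ref_values, ref_ids)
```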

Checklist

  • Add at least a tag in the PR title.
  • Format your code, run pre-commit before commit.
  • Add unit tests. Please write the reason in this PR if no unit tests.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

The noaux_tc_redundant CUDA op is defined with 4 outputs in PD_BUILD_STATIC_OP:
- output_tensor (scores)
- topk_values
- topk_indices
- tokens_per_expert_stats_list_out (inplace updated)

The Python code was only unpacking 3 values, causing:
  ValueError: too many values to unpack (expected 3)

This fix correctly unpacks all 4 return values, ignoring the inplace
updated tensor which is the same as the input tokens_per_expert_stats_list.

Co-Authored-By: Claude (Claude Opus 4.5) <noreply@anthropic.com>
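
The unpacking error quoted in this commit message is ordinary Python behavior and can be reproduced without Paddle. In the sketch below, `op_with_four_outputs` is a hypothetical placeholder that returns four strings instead of tensors; the two unpacking helpers mirror the 3-target and 4-target forms discussed in this PR.

```python
def op_with_four_outputs():
    """Hypothetical placeholder for an op that yields 4 outputs."""
    return "scores", "topk_values", "topk_indices", "tokens_per_expert_stats_list"


def unpack_three():
    # Three targets for four values: Python raises ValueError here.
    scores, topk_values, topk_idx = op_with_four_outputs()
    return scores, topk_values, topk_idx


def unpack_four():
    # Four targets; the in-place updated stats output is ignored via `_`.
    scores, topk_values, topk_idx, _ = op_with_four_outputs()
    return scores, topk_values, topk_idx


try:
    unpack_three()
    error_message = None
except ValueError as exc:
    error_message = str(exc)

fixed_result = unpack_four()
```

After running this, `error_message` holds the "too many values to unpack" text (the exact suffix can vary between Python versions), while `unpack_four()` succeeds.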

paddle-bot bot commented Feb 6, 2026

Thanks for your contribution!

@mattheliu mattheliu changed the title fix: handle 4 return values from noaux_tc_redundant op [BugFix][OP] Fix noaux_tc_redundant return values unpacking error Feb 6, 2026
@mattheliu mattheliu changed the title [BugFix][OP] Fix noaux_tc_redundant return values unpacking error [BugFix]fix handle 4 return values from noaux_tc_redundant op Feb 6, 2026

codecov-commenter commented Feb 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (develop@d6b3c72).

Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #6384   +/-   ##
==========================================
  Coverage           ?   67.57%           
==========================================
  Files              ?      391           
  Lines              ?    52240           
  Branches           ?     8145           
==========================================
  Hits               ?    35303           
  Misses             ?    14350           
  Partials           ?     2587           
Flag Coverage Δ
GPU 67.57% <100.00%> (?)

The PD_BUILD_STATIC_OP defines 4 outputs but the function only returned 3,
causing inconsistent behavior across different Paddle framework versions.

This fix explicitly returns 4 values:
- scores (inplace modified)
- topk_values
- topk_indices
- tokens_per_expert_stats_list (inplace modified via atomicAdd)

Co-Authored-By: Claude (Claude Opus 4.5) <noreply@anthropic.com>
@mattheliu mattheliu force-pushed the fix/noaux_tc_redundant_return_values branch from c8c20c7 to 6535bd1 Compare February 6, 2026 17:31