[BugFix] Fix model loading error for 300B FP8 EP parallel test case by Echo-Nie · Pull Request #6382 · PaddlePaddle/FastDeploy

Echo-Nie · 2026-02-06T09:13:53Z

Motivation

Fix model loading error in 300B FP8 EP parallel test case: the error occurred when transposing weights (dimension mismatch due to missing weight adaptation before loading).

Modifications

Added weight_adapter function to handle weight renaming.
Replaced original weights_iterator with adapted_iterator (processed by weight_adapter) in both cache and direct loading paths.
Fixed parameter count mismatch in get_padding_offset call.

Usage or Command

model_path=ERNIE-4.5-300B-A47B-FP8-Paddle
config_yaml=yaml/eb45-8k-fp8-tp1-dp8_ep.yaml
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
ep_engine_ports="5378,5379,5380,5381,5382,5383,5384,5385"
metrics_ports="5110,5111,5112,5113,5114,5115,5116,5117"
ports="8188,8189,8190,8191,8192,8193,8194,8195"
cache_ports="9320,9321,9322,9323,9324,9325,9326,9327"
server_num=8
export FD_ENABLE_MULTI_API_SERVER=1
export KVCACHE_RDMA_NICS="mlx5_2,mlx5_3,mlx5_4,mlx5_5,mlx5_6,mlx5_7,mlx5_8,mlx5_9"
python -m fastdeploy.entrypoints.openai.multi_api_server --ports ${ports} --num-servers ${server_num} --metrics-ports ${metrics_ports} --args --model ${model_path} --cache-queue-port ${cache_ports} --engine-worker-queue-port ${ep_engine_ports} --config ${config_yaml} >server.log 2>&1 &

Accuracy Tests

Service Satrt

[INFO] Application startup complete.
[INFO] 127.0.0.1:25712 - "GET /v1/models HTTP/1.1" 200
[INFO] 127.0.0.1:25712 - "POST /v1/chat/completions HTTP/1.1" 200

Test

connecting: http://localhost:8188/v1 ...
model: /workspace/FirstBuildFD/bug2_3/bd
📝 Question: '1+1=?'
🤖 Response: The question "1 + 1 = ?" is a basic arithmetic problem. 
In standard arithmetic, when we add 1 and 1 together, the result is 2. 
So, 1 + 1 = 2.

Checklist

Add at least a tag in the PR title.
- Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
- You can add new tags based on the PR content, but the semantics must be clear.
Format your code, run pre-commit before commit.
Add unit tests. Please write the reason in this PR if no unit tests.
Provide accuracy results.
If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

codecov-commenter · 2026-02-06T13:46:22Z

Codecov Report

❌ Patch coverage is 75.00000% with 3 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (develop@2ffcb3d). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
...y/model_executor/model_loader/default_loader_v1.py	75.00%	2 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             develop    #6382   +/-   ##
==========================================
  Coverage           ?   68.36%           
==========================================
  Files              ?      391           
  Lines              ?    52250           
  Branches           ?     8148           
==========================================
  Hits               ?    35723           
  Misses             ?    13918           
  Partials           ?     2609

Flag	Coverage Δ
GPU	`68.36% <75.00%> (?)`

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Echo-Nie added 3 commits February 6, 2026 08:23

fix fp8 bug

6b8bb14

fix

1ed3f0e

fix comment, cn to en

4f4590d

Echo-Nie had a problem deploying to Metax_ci February 6, 2026 09:13 — with GitHub Actions Failure

Echo-Nie had a problem deploying to Metax_ci February 6, 2026 11:46 — with GitHub Actions Error

fix ci

631eeb6

Echo-Nie force-pushed the fix_bug branch from 030c41b to 631eeb6 Compare February 6, 2026 12:11

Echo-Nie had a problem deploying to Metax_ci February 6, 2026 12:12 — with GitHub Actions Failure

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BugFix] Fix model loading error for 300B FP8 EP parallel test case#6382

[BugFix] Fix model loading error for 300B FP8 EP parallel test case#6382
Echo-Nie wants to merge 4 commits intoPaddlePaddle:developfrom
Echo-Nie:fix_bug

Echo-Nie commented Feb 6, 2026

Uh oh!

codecov-commenter commented Feb 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Echo-Nie commented Feb 6, 2026

Motivation

Modifications

Usage or Command

Accuracy Tests

Checklist

Uh oh!

codecov-commenter commented Feb 6, 2026

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants