Refactor loss ctx #1442

jayhenry · 2026-01-23T14:54:00Z

Summary

This PR refactors the loss subsystem to standardize how loss contexts are constructed, how sequence parallelism (SP) is handled, and how batches are built.
Trainer and Worker now interact only with LossConfig and LossContext objects.
The PR introduces unified builders across losses and removes the legacy input-item types.

Highlights

Introduces BaseLossConfig.build and BaseLossContext.build_batches. Removes the *ContextInputItem classes; BaseLossKwargs now provides built-in to() and sp_split() instead.
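The config/context split described above can be pictured with a minimal sketch. Everything here is hypothetical: ToyLossConfig, ToyLossContext, ToyLossKwargs, their fields, and the idea that build_batches shares a global valid-token count are assumptions made for illustration, not the signatures or behavior of the real xtuner classes.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ToyLossKwargs:
    """Hypothetical stand-in for a loss-kwargs object: it only carries labels."""
    labels: List[int]


@dataclass
class ToyLossContext:
    """Hypothetical stand-in for a loss context (config values plus kwargs)."""
    ignore_index: int
    kwargs: ToyLossKwargs
    num_valid_tokens: int = 0  # filled in by build_batches

    @classmethod
    def build_batches(
        cls, contexts: Sequence["ToyLossContext"]
    ) -> List["ToyLossContext"]:
        # Assumed batch-level step: give every micro-batch the same global
        # valid-token count so each one can normalize its loss consistently.
        total = sum(
            sum(1 for t in c.kwargs.labels if t != c.ignore_index)
            for c in contexts
        )
        for c in contexts:
            c.num_valid_tokens = total
        return list(contexts)


@dataclass
class ToyLossConfig:
    """Hypothetical stand-in for a loss config that builds its own context."""
    ignore_index: int = -100

    def build(self, labels: List[int]) -> ToyLossContext:
        return ToyLossContext(self.ignore_index, ToyLossKwargs(labels))


# The caller only touches the config and the context, never an input-item type.
cfg = ToyLossConfig()
contexts = [cfg.build([1, 2, -100]), cfg.build([-100, 5])]
batches = ToyLossContext.build_batches(contexts)
```

In this sketch the caller never assembles an intermediate input-item object: the config builds a context per micro-batch, and a single class-level call finalizes the whole batch.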
CE loss:
- Adds CELossConfig.build and CELossContext.build_batches.
RL loss:
- Adds BaseRLLossConfig, BaseRLLossKwargs, and BaseRLLossContext.
- Refactors GRPO/Oreal to the new API.
- BaseRLLossContext internalizes rollout importance-sampling handling.
Worker/Trainer flow:
- Workers and the trainer now interact only with LossConfig and LossContext.
- Batching is done via LossContext.build_batches.
- compute_actor_logprobs and compute_ref_logprobs return List[torch.Tensor] for clearer data flow.
- SP handling is centralized within builders for consistency.
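As a rough illustration of what a centralized SP split has to do, the sketch below shards one label sequence across SP ranks, padding so the length divides evenly. The function name toy_sp_split, its parameters, and the contiguous-shard layout are assumptions for illustration; the PR does not show how the real sp_split is implemented.

```python
from typing import List


def toy_sp_split(
    seq: List[int], sp_size: int, sp_rank: int, pad_value: int = -100
) -> List[int]:
    """Return the contiguous shard of seq owned by sp_rank (illustrative only)."""
    if not 0 <= sp_rank < sp_size:
        raise ValueError("sp_rank must be in [0, sp_size)")
    # Pad with an ignored label so every rank receives the same shard length.
    pad = (-len(seq)) % sp_size
    padded = seq + [pad_value] * pad
    shard_len = len(padded) // sp_size
    return padded[sp_rank * shard_len:(sp_rank + 1) * shard_len]


labels = [11, 12, 13, 14, 15]
shards = [toy_sp_split(labels, sp_size=2, sp_rank=r) for r in range(2)]
```

Keeping this logic in one place means labels and any other per-token field (for example rollout log-probs) are padded and sliced the same way, which is the consistency the last bullet refers to.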

Compatibility

Public training CLI and typical user workflows remain unchanged; only internal wiring and APIs between workers/trainers and loss configs are updated.

Testing

Updated unit and integration tests across engines/models/losses to the new API.
Verified identical reward curves and correct SP behavior for RL.
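One property an SP correctness check might assert is that a token-mean loss computed over shards matches the unsharded value when each shard is normalized by the global valid-token count. The sketch below is a toy check under that assumption, using made-up per-token losses; it is not taken from the PR's test suite.

```python
import math
from typing import List

IGNORE = -100  # label value that marks a token as excluded from the loss


def masked_sum(token_losses: List[float], labels: List[int]) -> float:
    """Sum per-token losses over positions whose label is not IGNORE."""
    return sum(l for l, y in zip(token_losses, labels) if y != IGNORE)


def count_valid(labels: List[int]) -> int:
    return sum(1 for y in labels if y != IGNORE)


token_losses = [0.5, 1.0, 2.0, 4.0]  # made-up values
labels = [7, IGNORE, 3, 9]

# Reference: token-mean loss on the full sequence.
full = masked_sum(token_losses, labels) / count_valid(labels)

# Two contiguous shards, as two SP ranks would see them.
half = len(labels) // 2
shard_losses = [token_losses[:half], token_losses[half:]]
shard_labels = [labels[:half], labels[half:]]

# Correct: every shard divides by the global count, then the results are summed.
global_count = count_valid(labels)
sharded = sum(
    masked_sum(l, y) / global_count for l, y in zip(shard_losses, shard_labels)
)

# Naive: averaging per-shard means weights shards unevenly.
naive = sum(
    masked_sum(l, y) / count_valid(y) for l, y in zip(shard_losses, shard_labels)
) / len(shard_labels)
```

Here the globally normalized value agrees with the unsharded loss, while the naive average of per-shard means does not, because the shards hold different numbers of valid tokens.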

xtuner/v1/rl/base/worker.py

HIT-cwh · 2026-01-26T08:57:02Z

xtuner/xtuner/v1/rl/base/worker.py

Line 770 in 8f8bf16

def _train_one_step_sft(self, data_batch):

_train_one_step_sft needs the corresponding change here as well.

HIT-cwh · 2026-01-26T08:59:14Z

Are classes like CELossContextInputItem no longer needed at all?

xtuner/v1/loss/base_loss_ctx.py

jayhenry · 2026-01-27T14:26:34Z

Are classes like CELossContextInputItem no longer needed at all?

Yes

jayhenry requested review from HIT-cwh, YanhuiDua and hhaAndroid January 26, 2026 03:01

HIT-cwh reviewed Jan 26, 2026

xtuner/v1/rl/base/worker.py (outdated)

HIT-cwh reviewed Jan 26, 2026

xtuner/v1/rl/base/worker.py (outdated)

HIT-cwh reviewed Jan 26, 2026

xtuner/v1/loss/base_loss_ctx.py (outdated)

jayhenry force-pushed the refactor_loss_ctx branch 2 times, most recently from 0fcd257 to 3f97f80 Compare January 27, 2026 14:19

jayhenry added 19 commits January 28, 2026 14:08

[Refactor] simplify CELossContext build logic in Trainer

2d2444e

refactor loss ctx build logic in train worker

2a6c92f

fix ut

391a3fd

fix sp logic in LossContext for labels, rollout_log_probs etc.

b3db047

fix sp bug in TrainWorker

83605b6

refine loss_ctx in rl fit

1fb7f71

refine2 loss_ctx in rl fit

6637153

refine ce_loss_ctx in trainer

d609050

refactor ce loss in rl's train_sft

259e41e

fix CELossConfig.build bug

fad7f67

remove loss ctx input

9773b0a

refactor test_ce_loss and rm loss ctx input

5743c93

refactor test_grpo_loss and rm ctx_input

8d340c0

refactor test_oreal_loss and rm ctx input

ec5700b

refactor test_dense_train_engine

381dfb4

refactor a bunch of test cases

b142db4

fix loss_cfg.build to device

e6c4064

refactor remaining test cases

54cc043

remove loss ctx input

670299f

jayhenry added 2 commits January 28, 2026 14:09

remove duplicated sp_split in rl.utils

8038e36

fix sp test cases for test_dense_engine_train

b30755b

jayhenry force-pushed the refactor_loss_ctx branch from 50ca706 to b30755b Compare January 28, 2026 14:10
