generated from fastai/nbdev_template
Pull requests: huggingface/trl
Pull requests list
Add truncation to SFT DataCollatorForLanguageModeling
#5315
opened Mar 19, 2026 by albertvillanova
fix: fix a bug in vLLM weight synchronization when vllm_enable_sleep_mode=True
#5313
opened Mar 19, 2026 by muupan
1 of 5 tasks
Show conversations instead of decoded text in the completions table
#5309
opened Mar 19, 2026 by qgallouedec
Add support for logging extra columns in reward functions and update related tests
#5308
opened Mar 19, 2026 by qgallouedec
fix: skip ref adapter when peft config uses target_parameters
#5292
opened Mar 16, 2026 by gambletan
3 tasks
Add reference to DeepSeekMath in accuracy_reward docstring
#5287
opened Mar 13, 2026 by qgallouedec
5 tasks
Introduce backend rollout-completions interface and decouple OpenEnv helper from vLLM internals
#5256
opened Mar 10, 2026 by rycerzes
batch params together in weight sync and async update the weights
#5249
opened Mar 9, 2026 by winglian
5 tasks
Introduce minimal generation backend interface for GRPO and RLOO trainers
#5244
opened Mar 8, 2026 by rycerzes
feat: log raw importance ratios and fraction of truncation/masking in vLLM importance sampling correction
#5243
opened Mar 8, 2026 by muupan
1 of 5 tasks
Update openenv examples to use environment_factory
#5235
opened Mar 6, 2026 by sergiopaniego
3 of 8 tasks
vLLM Server Sync via LoRA Adapter Reload (avoid merge + full weight sync) for GRPO
#5188
opened Feb 26, 2026 by lfranceschetti
feat(experimental): Divergence Proximal Policy Optimization
#5117
opened Feb 17, 2026 by LeonEricsson
5 tasks
Add prefix-preserving training chat template for GPT-OSS
#5109
opened Feb 17, 2026 by qgallouedec • Draft
Add support for DGPO (ICLR 2026) to GRPO
#5102
opened Feb 15, 2026 by YanqiDai
5 tasks done