Skip to content

[Refactor] Unify MAPPOLoss + IPPOLoss into MultiPPOLoss with deprecat…#3788

Closed
theap06 wants to merge 1 commit into
pytorch:mainfrom
theap06:feat/multi-ppo-loss
Closed

[Refactor] Unify MAPPOLoss + IPPOLoss into MultiPPOLoss with deprecat…#3788
theap06 wants to merge 1 commit into
pytorch:mainfrom
theap06:feat/multi-ppo-loss

Conversation

@theap06
Copy link
Copy Markdown
Contributor

@theap06 theap06 commented May 23, 2026

Addresses the Proposed field in #3748

@matteobettini lmk what you think

…ion warnings

MAPPOLoss and IPPOLoss are structurally identical — the only difference is
which critic the user passes in. This PR consolidates them into a single
MultiPPOLoss class with a critic_type field ("centralized" | "independent")
that makes the intended algorithm self-documenting without changing any
gradient computation.

MAPPOLoss and IPPOLoss are retained as deprecated aliases (DeprecationWarning
on instantiation) and will be removed in v0.11 per the two-minor-release
deprecation policy.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@pytorch-bot
Copy link
Copy Markdown

pytorch-bot Bot commented May 23, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3788

Note: Links to docs will display an error until the docs builds have been completed.

❌ 6 New Failures, 2 Unrelated Failures

As of commit baeb46f with merge base 68f1ba5 (image):

NEW FAILURES - The following jobs have failed:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label May 23, 2026
@github-actions github-actions Bot added Objectives Refactoring Refactoring of an existing feature labels May 23, 2026
Copy link
Copy Markdown
Collaborator

@vmoens vmoens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not against having the two loss modules, even if they're (almost) identical. It's clearer from an API perspective. We should not deprecate them


def __init__(self, *args: Any, **kwargs: Any) -> None:
warnings.warn(
"IPPOLoss is deprecated and will be removed in v0.11. "
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol the next release is 0.13 ;)

@theap06
Copy link
Copy Markdown
Contributor Author

theap06 commented May 24, 2026

I'm not against having the two loss modules, even if they're (almost) identical. It's clearer from an API perspective. We should not deprecate them

I'll close the PR then!

@theap06 theap06 closed this May 24, 2026
@theap06 theap06 deleted the feat/multi-ppo-loss branch May 24, 2026 00:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Objectives Refactoring Refactoring of an existing feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants