[Feature] Adds HERReplayBuffer and HindsightStrategy to torchrl.data. #3734
theap06 wants to merge 4 commits into
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3734
Note: Links to docs will display an error until the docs builds have been completed.
| Prefix | Label Applied | Example |
|---|---|---|
| [BugFix] | BugFix | [BugFix] Fix memory leak in collector |
| [Feature] | Feature | [Feature] Add new optimizer |
| [Doc] or [Docs] | Documentation | [Doc] Update installation guide |
| [Refactor] | Refactoring | [Refactor] Clean up module imports |
| [CI] | CI | [CI] Fix workflow permissions |
| [Test] or [Tests] | Tests | [Tests] Add unit tests for buffer |
| [Environment] or [Environments] | Environments | [Environments] Add Gymnasium support |
| [Data] | Data | [Data] Fix replay buffer sampling |
| [Performance] or [Perf] | Performance | [Performance] Optimize tensor ops |
| [BC-Breaking] | bc breaking | [BC-Breaking] Remove deprecated API |
| [Deprecation] | Deprecation | [Deprecation] Mark old function |
| [Quality] | Quality | [Quality] Fix typos and add codespell |
Note: Common variations like singular/plural are supported (e.g., [Doc] or [Docs]).
@vmoens Results utilizing HER. Wrote up a quick script for DDPG.
vmoens
left a comment
Amazing, I love this.
I left a couple of high-level comments; can you have a look?
vmoens
left a comment
Good progress, left a few more comments here

Description
Describe your changes in detail.
Motivation and Context
Fixes #3713
I have raised an issue to propose this change (required for new features and bug fixes)
## Summary
- `HERReplayBuffer`: a `TensorDictReplayBuffer` subclass that applies goal relabeling at sample time, turning failed goal-conditioned trajectories into useful training signal (Andrychowicz et al., NeurIPS 2017)
- `HindsightStrategy` enum with all four canonical strategies: `FUTURE` (recommended), `FINAL`, `EPISODE`, `RANDOM`
- `torchrl.data` and `torchrl.data.replay_buffers`
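For readers unfamiliar with hindsight relabeling, the four strategies can be illustrated with a small, library-free sketch. This is not the code in this PR and does not use the torchrl API: `Strategy`, `sample_hindsight_goal`, `sparse_reward`, and `relabel` are hypothetical names, episodes are plain lists of achieved goals, and the reward is assumed to be the sparse 0 / -1 reward commonly used with HER. The sketch also assumes `FUTURE` may pick the goal achieved by the current transition itself.

```python
import random
from enum import Enum


class Strategy(Enum):
    """Hypothetical stand-in for the four strategies; not torchrl's enum."""
    FUTURE = "future"
    FINAL = "final"
    EPISODE = "episode"
    RANDOM = "random"


def sample_hindsight_goal(episodes, ep_idx, t, strategy, rng):
    """Pick a substitute goal for transition t of episode ep_idx.

    episodes[e][i] is the goal achieved by the i-th transition of episode e.
    """
    episode = episodes[ep_idx]
    if strategy is Strategy.FUTURE:
        # A goal achieved at or after step t in the same episode (assumption:
        # the current step is included).
        return episode[rng.randrange(t, len(episode))]
    if strategy is Strategy.FINAL:
        # The goal achieved by the last transition of the same episode.
        return episode[-1]
    if strategy is Strategy.EPISODE:
        # Any goal achieved during the same episode.
        return rng.choice(episode)
    if strategy is Strategy.RANDOM:
        # Any goal achieved in any stored episode.
        return rng.choice([goal for ep in episodes for goal in ep])
    raise ValueError(f"unknown strategy: {strategy!r}")


def sparse_reward(achieved, goal):
    """Assumed sparse reward: 0 when the goal is reached, -1 otherwise."""
    return 0.0 if achieved == goal else -1.0


def relabel(episodes, ep_idx, t, strategy, rng):
    """Return the substituted goal and the reward recomputed against it."""
    goal = sample_hindsight_goal(episodes, ep_idx, t, strategy, rng)
    achieved = episodes[ep_idx][t]
    return {"goal": goal, "reward": sparse_reward(achieved, goal)}


# Two toy episodes of 2D achieved goals.
episodes = [[(0, 1), (1, 1), (2, 1)], [(5, 5), (5, 6)]]
rng = random.Random(0)
relabeled = relabel(episodes, ep_idx=0, t=1, strategy=Strategy.FUTURE, rng=rng)
```

In this toy setup, relabeling the last transition of an episode with `FINAL` or `FUTURE` always yields a reward of 0, which is how a failed trajectory still produces a positive training signal.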