[Doc] Memory-efficient RL training tutorial + cross-refs #3745
Open
vmoens wants to merge 1 commit into
Conversation
New tutorial under `tutorials/sphinx-tutorials/memory_efficient_rl.py` that ties together the three recent memory-efficiency PRs:

- `compact_obs` flag on the collector (pytorch#3742)
- `NextStateReconstructor` RB transform (pytorch#3743)
- NaN-safe value-estimator forward (pytorch#3744)

The tutorial walks through:

- Where the observation memory goes, and why TorchRL keeps both `obs` and `("next", obs)` by default (bootstrap targets, `MultiStep` n-step fallback)
- Knob 1: `SyncDataCollector(compact_obs=True)` halves the obs footprint on the producer side
- Knob 2: `NextStateReconstructor` rebuilds `("next", obs)` at sampling time, with NaN at trajectory ends
- Knob 2.5: `ValueEstimatorBase._sanitize_next_obs_nan` keeps GAE/TD targets numerically defined
- When NOT to take this path: `MultiStepTransform`, and truncated transitions where the `V(obs[t]) ≈ V(real_next_obs)` approximation is unacceptable
- Knob 3: `LazyMemmapStorage` for buffers larger than VRAM
- Knob 4: `SliceSampler` + scan/Triton recurrent backends for padding-free sequence training
- An end-to-end pipeline snippet

The tutorial runs end-to-end on CPU (CartPole-v1, 200 frames) and reports concrete byte-level savings from `td.bytes()`.

Cross-references added to:

- `SyncDataCollector` / `MultiSyncCollector` / `MultiAsyncCollector` (`compact_obs` docstring): pointers to `NextStateReconstructor`, the value-estimator sanitizer, the `MultiStep` incompatibility note, and the new tutorial.
- `NextStateReconstructor`: `.. seealso::` block to `compact_obs`, the sanitizer, the `MultiStep` incompatibility, and the tutorial.
- `ValueEstimatorBase._sanitize_next_obs_nan`: `.. seealso::` to `compact_obs`, `NextStateReconstructor`, and the tutorial.
- `docs/source/index.rst`: register the new tutorial under "Basics".

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
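As a back-of-the-envelope illustration of the kind of saving the `td.bytes()` comparison measures (plain Python arithmetic only; the buffer size and observation shape below are made-up example numbers, not figures from the tutorial):

```python
# Hypothetical replay buffer: 1M transitions of an 84x84x4 uint8 observation.
# Storing both obs and ("next", obs) keeps two copies of every frame stack;
# a compact layout keeps one copy and rebuilds the next observation at
# sampling time.
frames = 1_000_000
obs_bytes = 84 * 84 * 4                # 28_224 bytes per uint8 observation

both_copies = frames * obs_bytes * 2   # obs + ("next", obs)
single_copy = frames * obs_bytes       # obs only

print(f"both copies : {both_copies / 2**30:.1f} GiB")
print(f"single copy : {single_copy / 2**30:.1f} GiB")
# The observation footprint is exactly halved; rewards, actions and done
# flags are untouched, so the total saving is somewhat below 2x in practice.
```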
🔗 Helpful Links: 🧪 see artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3745. Note: links to docs will display an error until the docs builds have completed.

❌ 2 New Failures, 2 Pending as of commit bf4a3f6 with merge base cc31dc3.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
Summary
Ties together the three recently-merged memory-efficiency PRs into a
single story:
- `compact_obs` collector flag ([Performance] Add compact_obs flag to DataCollector #3742)
- `NextStateReconstructor` RB transform ([Feature] NextStateReconstructor RB transform #3743)
- NaN-safe value-estimator forward (pytorch#3744)

Two parts:

1. Runnable Sphinx-gallery tutorial at `tutorials/sphinx-tutorials/memory_efficient_rl.py`. Sections:
   - Where the memory goes (`td.bytes()` numbers); why `("next", obs)` is kept by default (bootstrap target at trajectory ends, `MultiStepTransform` n-step fallback)
   - `SyncDataCollector(compact_obs=True)`
   - `NextStateReconstructor` with the traj_id + done contract
   - The value-estimator sanitizer (`_sanitize_next_obs_nan`), GAE finite everywhere
   - When not to: the `MultiStepTransform` incompatibility, the `V(obs[t]) ≈ V(real_next_obs)` approximation at truncated steps, and how `shifted=True` interacts
   - `LazyMemmapStorage` for buffers ≥ VRAM
   - `SliceSampler` + the new `"scan"`/`"triton"` recurrent backends for padding-free sequence training
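The reconstruction contract above can be sketched in plain Python (illustrative only; `reconstruct_next_obs` is a made-up helper, not the transform's real signature): within a trajectory, `next_obs[t]` is just `obs[t + 1]`, and at trajectory ends, where the true next observation was never stored, the rebuilt value is NaN.

```python
def reconstruct_next_obs(obs, done):
    """Rebuild next_obs[t] as obs[t + 1] within a trajectory; emit NaN at
    trajectory boundaries, where the next observation was never stored."""
    next_obs = []
    for t in range(len(obs)):
        if done[t] or t == len(obs) - 1:
            next_obs.append(float("nan"))   # trajectory boundary: unknown
        else:
            next_obs.append(obs[t + 1])     # within-trajectory shift
    return next_obs

obs = [0.1, 0.2, 0.3, 0.4]    # two trajectories: [0.1, 0.2] and [0.3, 0.4]
done = [False, True, False, True]
nxt = reconstruct_next_obs(obs, done)
assert nxt[0] == 0.2 and nxt[2] == 0.4
assert nxt[1] != nxt[1] and nxt[3] != nxt[3]  # NaN at trajectory ends
```

The NaNs are intentional: they are the marker that a downstream consumer (here, the value estimator's sanitizer) must handle before computing bootstrap targets.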
Runs end-to-end on CPU (CartPole-v1, 200 frames; <2 s wall) and reports the byte-level savings concretely from `td.bytes()`.

2. Docstring cross-references, so a reader landing on any of the three new APIs finds the other two:
- `Collector(compact_obs=…)` (and the multi-process collectors): pointers to `NextStateReconstructor`, the value-estimator sanitizer, the `MultiStepTransform` incompatibility note, and the new tutorial.
- `NextStateReconstructor`: `.. seealso::` block covering `compact_obs`, the sanitizer, `MultiStepTransform`, and the tutorial.
- `ValueEstimatorBase._sanitize_next_obs_nan`: `.. seealso::` block to `compact_obs`, `NextStateReconstructor`, and the tutorial.
`docs/source/index.rst` registers the new tutorial under "Basics".

Test plan
(… reported, NaN at slice boundaries confirmed to coincide with trajectory boundaries, GAE advantage finite everywhere, memmap roundtrip works).
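The "GAE advantage finite everywhere" check can be illustrated with a minimal plain-Python GAE that mimics a NaN-safe bootstrap (all names here are illustrative sketches, not the real `_sanitize_next_obs_nan` code path):

```python
import math

def gae_nan_safe(rewards, values, next_values, dones, gamma=0.99, lmbda=0.95):
    """Minimal GAE sketch with a NaN-safe bootstrap: where next_values[t] is
    NaN (a reconstructed trajectory boundary), fall back to values[t],
    mirroring the V(obs[t]) ~ V(real_next_obs) approximation. The done mask
    alone is not enough: 0.0 * nan is still nan in IEEE arithmetic."""
    adv = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        nv = next_values[t]
        if math.isnan(nv):
            nv = values[t]  # sanitize the reconstructed boundary
        not_done = 0.0 if dones[t] else 1.0
        delta = rewards[t] + gamma * not_done * nv - values[t]
        running = delta + gamma * lmbda * not_done * running
        adv[t] = running
    return adv

# Two 2-step trajectories; NaN marks reconstructed boundaries.
adv = gae_nan_safe(
    rewards=[1.0, 1.0, 1.0, 1.0],
    values=[0.5, 0.6, 0.7, 0.8],
    next_values=[0.6, float("nan"), 0.8, float("nan")],
    dones=[False, True, False, True],
)
assert all(math.isfinite(a) for a in adv)  # finite everywhere
```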
(verified by reading the rendered class docstrings via `Collector.__doc__` etc.).

🤖 Generated with Claude Code