Fix memory leak in ManagerBasedEnv.close#5925
Conversation
self.obs_buf and self.observation_space/action_space (plus single_*_space variants set by ManagerBasedRLEnv) were never released by close(). For image-observation envs at 720x1280x3 with num_envs=4 that retains tens of megabytes of cached observation tensors plus ~110 MB per gym.spaces.Box bounds-array set, accumulating one full env's worth on each construct/teardown cycle since gymnasium's wrapper chain (OrderEnforcing plus the make() registry) keeps the env reachable past close. Measured 440 MB host + 33 MB GPU retained per cycle on a typical three-image-camera setup; after the fix the same cycle releases the spaces (-370 MB) and obs tensors (-74 MB CUDA) at close-time, eliminating the per-cycle accumulation. Signed-off-by: Clemens Volk <cvolk@nvidia.com>
Builds Isaac-Cartpole-v0 via gym.make, runs reset + a step to populate obs_buf and exercise gym space construction, then asserts after close() that obs_buf is cleared and observation_space/action_space are nulled. Verified to fail on the pre-fix close() with AssertionError on the obs_buf assertion (the spaces assertion would also fail), and to pass with the fix applied. Parametrized over cuda:0 and cpu devices. Signed-off-by: Clemens Volk <cvolk@nvidia.com>
Greptile SummaryThis PR fixes a memory leak in
Confidence Score: 4/5The fix is narrowly scoped to the teardown path and does not alter any live-env behavior; safe to merge. Both fixes are correct and well-placed. The only gap is that the new test doesn't assert The test file would benefit from adding assertions for Important Files Changed
Sequence DiagramsequenceDiagram
participant W as Gymnasium Wrapper Chain
participant RL as ManagerBasedRLEnv
participant B as ManagerBasedEnv
W->>RL: close()
RL->>RL: del command_manager, reward_manager, termination_manager, curriculum_manager
RL->>RL: "single_observation_space = None"
RL->>RL: "single_action_space = None"
RL->>RL: "observation_space = None"
RL->>RL: "action_space = None"
Note over RL: Box bounds arrays (~110 MB ea.) released
RL->>B: super().close()
B->>B: sim.stop()
B->>B: obs_buf.clear()
Note over B: Tensor references dropped
B->>B: del viewport_camera_controller, action_manager, observation_manager, event_manager, recorder_manager, scene
B->>B: sim.clear_instance()
B->>B: "_is_closed = True"
Note over W: Wrapper chain still holds env ref, but large data is now freed
Reviews (1): Last reviewed commit: "Add regression test for ManagerBasedEnv...." | Re-trigger Greptile |
| assert env.unwrapped.action_space is not None, "precondition: action_space should be set" | ||
|
|
||
| env.close() | ||
|
|
||
| assert not env.unwrapped.obs_buf, "obs_buf was not cleared on close" | ||
| assert env.unwrapped.observation_space is None, "observation_space was not released on close" | ||
| assert env.unwrapped.action_space is None, "action_space was not released on close" |
There was a problem hiding this comment.
Missing assertions for
single_observation_space / single_action_space
The fix in ManagerBasedRLEnv.close() nulls all four space attributes (single_observation_space, single_action_space, observation_space, action_space), but the test only asserts the last two. The single_* variants are the per-environment gym.spaces.Box objects described in the PR description as holding ~110 MB of bounds arrays for image observations — they are exactly the attributes whose memory the fix aims to release. Leaving them unchecked means a regression that stops nulling single_observation_space would go undetected.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
There was a problem hiding this comment.
Code Review: Fix memory leak in ManagerBasedEnv.close
Good fix for a real and well-diagnosed issue. The PR description is thorough, the repro methodology is solid, and the fix is targeted. A few observations and suggestions below.
Findings
1. [Medium] self.extras dict not cleared in close()
File: source/isaaclab/isaaclab/envs/manager_based_env.py, line 557 area
self.extras is allocated at line 155 and accumulates log dicts with metrics each step. While the individual values are lightweight scalars/dicts (not large tensors), for consistency with the obs_buf cleanup it would be good to clear it:
self.extras.clear()This is a minor nit — the memory impact is negligible compared to obs_buf and spaces — but it makes the "release everything on close" contract more explicit.
2. [Low] Defensive guard style inconsistency
File: source/isaaclab/isaaclab/envs/manager_based_env.py, line 556
if isinstance(getattr(self, "obs_buf", None), dict):
self.obs_buf.clear()The defensive getattr + isinstance guard is wise for obs_buf (since close() could theoretically be called before __init__ fully completes), but the space nulling in manager_based_rl_env.py uses direct attribute assignment without a similar guard:
self.single_observation_space = None # could AttributeError if __init__ never ran _configure_gym_env_spacesIf close() is ever called on a partially-constructed env (e.g., if scene creation throws), these would raise AttributeError. Consider wrapping in a similar pattern:
for attr in ("single_observation_space", "single_action_space", "observation_space", "action_space"):
if hasattr(self, attr):
setattr(self, attr, None)3. [Medium] Same leak exists in DirectRLEnv.close()
File: source/isaaclab/isaaclab/envs/direct_rl_env.py, line 524
DirectRLEnv.close() (line 524) has the same pattern: it doesn't release self.obs_buf, self.observation_space, self.action_space, self.single_observation_space, or self.single_action_space. Since DirectRLEnv also inherits from gym.Env and uses gym.make(), it's subject to the same wrapper-chain retention.
This is out of scope for this PR but worth noting — a follow-up would complete the fix across the env hierarchy.
4. [Low] Test assumes cuda:0 availability
File: source/isaaclab/test/envs/test_manager_based_env_close_leak.py, line 24
@pytest.mark.parametrize("device", ["cuda:0", "cpu"])The cuda:0 parametrization will fail on CPU-only CI runners. Consider gating with:
import pytest
devices = ["cpu"]
if torch.cuda.is_available():
devices.append("cuda:0")
@pytest.mark.parametrize("device", devices)Or using pytest.param("cuda:0", marks=pytest.mark.skipif(not torch.cuda.is_available(), reason="CUDA not available")).
(This may already be handled by the test infrastructure's runner selection — but it's worth confirming.)
5. [Low] Test doesn't verify single_observation_space / single_action_space
File: source/isaaclab/test/envs/test_manager_based_env_close_leak.py, lines 46-48
The test asserts observation_space and action_space are None after close, but doesn't check the single_* variants that are also nulled in the fix. For completeness:
assert env.unwrapped.single_observation_space is None, "single_observation_space was not released on close"
assert env.unwrapped.single_action_space is None, "single_action_space was not released on close"Summary
| Severity | Count |
|---|---|
| Medium | 2 |
| Low | 3 |
The fix is correct, well-scoped, and addresses a genuine memory leak that compounds in evaluation loops. The implementation is clean. The two medium items (clearing extras and the parallel leak in DirectRLEnv) are nice-to-haves rather than blockers. Overall this is a solid contribution.
Description
ManagerBasedEnv.close()leaves two kinds of per-instance state alive after it returns. Gymnasium's wrapper chain (OrderEnforcingplus themake()registry) keeps the wrapped env reachable, so anything the env still references at close stays in memory:self.obs_buf: the dict of cached observation tensors from the laststep()/reset(). For image sensors this is tens of MB per camera.self.observation_space/self.action_space(plus thesingle_*_spacevariants onManagerBasedRLEnv). Each image-observationgym.spaces.Boxpins ~110 MB of bounds arrays (low,high,bounded_below,bounded_above) at shape(4, 720, 1280, 3).So roughly one env's worth of bounds arrays and the last observation buffer survive for the process lifetime. Evaluation loops that rebuild envs across tasks accumulate these and eventually OOM.
The fix clears
obs_bufand nulls the space attributes inclose().Found while debugging an evaluation-sweep OOM in a downstream project. No associated issue.
Fixes # (no associated issue)
Testing
Regression test at
source/isaaclab/test/envs/test_manager_based_env_close_leak.pybuildsIsaac-Cartpole-v0viagym.make, runs reset and one step, callsclose(), then assertsobs_bufis empty and the space attributes areNone. Parametrised overcuda:0andcpu. It fails on the pre-fixclose()and passes with the fix.Type of change
Checklist
pre-commitchecks with./isaaclab.sh --formatconfig/extension.tomlfile (happy to add a fragment if maintainers want a patch-bump entry)CONTRIBUTORS.md(happy to add if maintainers want)