[Feature] Introduce mujoco_playground environment wrapper. #3751

Open: itwasabhi wants to merge 4 commits into pytorch:main from itwasabhi:mujoco_playground0

@itwasabhi
Contributor

Introduces a TorchRL wrapper for mujoco_playground environments. The wrapper also supports multi-agent decomposition.

Motivation and Context

#3733

Types of changes

What types of changes does your code introduce? Remove all that do not apply:

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds core functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (update in the documentation)
  • Example (update in the folder of examples)

Checklist

Go over all the following points, and put an x in all the boxes that apply.
If you are unsure about any of these, don't hesitate to ask. We are here to help!

  • I have read the CONTRIBUTION guide (required)
  • My change requires a change to the documentation.
  • I have updated the tests accordingly (required for a bug fix or a new feature).
  • I have updated the documentation accordingly.

@pytorch-bot

pytorch-bot Bot commented May 14, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/3751

Note: Links to docs will display an error until the docs builds have been completed.

❌ 4 New Failures, 1 Unrelated Failure

As of commit 916baf0 with merge base 83f3d50 (image):

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following job failed but was also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed label May 14, 2026
@github-actions
Contributor

ghost commented May 14, 2026

⚠️ PR Title Label Error

PR title must start with a label prefix in brackets (e.g., [BugFix]).

Current title: Introduce mujoco_playground environment wrapper.

Supported Prefixes (case-sensitive)

Your PR title must start with exactly one of these prefixes:

| Prefix | Label Applied | Example |
|---|---|---|
| [BugFix] | BugFix | [BugFix] Fix memory leak in collector |
| [Feature] | Feature | [Feature] Add new optimizer |
| [Doc] or [Docs] | Documentation | [Doc] Update installation guide |
| [Refactor] | Refactoring | [Refactor] Clean up module imports |
| [CI] | CI | [CI] Fix workflow permissions |
| [Test] or [Tests] | Tests | [Tests] Add unit tests for buffer |
| [Environment] or [Environments] | Environments | [Environments] Add Gymnasium support |
| [Data] | Data | [Data] Fix replay buffer sampling |
| [Performance] or [Perf] | Performance | [Performance] Optimize tensor ops |
| [BC-Breaking] | bc breaking | [BC-Breaking] Remove deprecated API |
| [Deprecation] | Deprecation | [Deprecation] Mark old function |
| [Quality] | Quality | [Quality] Fix typos and add codespell |

Note: Common variations like singular/plural are supported (e.g., [Doc] or [Docs]).

@itwasabhi itwasabhi changed the title from "Introduce mujoco_playground environment wrapper." to "[Feature] Introduce mujoco_playground environment wrapper." May 14, 2026
@vmoens vmoens force-pushed the mujoco_playground0 branch from c7a92d5 to 15de4dd May 19, 2026 10:21
@itwasabhi
Contributor Author

Thanks for the fixes @vmoens

itwasabhi and others added 4 commits May 21, 2026 22:10
Introduces a torch-rl wrapper of the mujoco playground environment.
Wrapper also supports multi-agent decomposition.
- CI: install `playground` (PyPI name) instead of import name
  `mujoco_playground`; the previous `pip install mujoco_playground` was the
  root cause of the current `unittests-mujoco-playground` CI failure.
- CI: drop the broken try/except JAX-init fallback in run_test.sh — it ran
  in a subprocess that exited immediately, so `JAX_PLATFORM_NAME` never
  reached pytest and any real "GPU not visible" error was hidden.
- Wrapper: docstrings no longer advertise a `state` field that the env
  never emits; the JAX state is intentionally kept on
  `self._current_state` rather than round-tripped through TensorDict (this
  is now documented as a `.. note::` block).
- Wrapper: freeze `MujocoPlaygroundAgentSpec` /
  `MujocoPlaygroundAgentMapping` dataclasses and deep-copy
  `KNOWN_MARL_MAPPINGS` entries on string lookup so users cannot mutate
  the module-level mapping by accident.
- Wrapper: emit a `UserWarning` when resolving a string against
  `KNOWN_MARL_MAPPINGS`, since those indices target Brax's observation
  layout and may not be semantically equivalent for mujoco_playground envs.
- Wrapper: document the policy contract for `homogenization_mode='max'`
  and `'concat'` (which action/obs entries are real vs padding/discarded).
- Wrapper: align `_MujocoPlaygroundMeta` num_workers handling with
  `_BraxMeta`; make `agent_mapping` and `config`/`config_overrides`
  keyword-only; accept `seed=None` in `_set_seed` (defaults to 0, matching
  the `_reset` fallback) instead of raising bare `Exception`; document
  `_listerize`'s inclusive-range semantics; drop unused
  `pixels_only`/`camera_id`/`render_kwargs` from `_build_env`.
- Example: rewrite `save_visualization` in
  `profile_mujoco_playground_collector.py` to snapshot
  `env._current_state` during a manual rollout, removing dependencies on
  non-existent `env._state_example` and `td["state"]`.
- Config: drop stale `categorical_action_encoding` field on
  `MujocoPlaygroundEnvConfig`; add `agent_mapping` and `num_workers` so
  the MARL and parallel-process knobs are reachable from the config
  system.
- Tests: replace three duplicated `_setup_jax` fixtures with a single
  module-level autouse fixture; introduce a session-scoped
  `marl_env_sizes` fixture to avoid constructing a throwaway env in every
  MARL test; expect the new MABrax warning on string lookups; add a
  new `TestMujocoPlaygroundDictObs` class covering reset /
  `check_env_specs` and a negative test that
  `homogenization_mode != 'none'` raises `NotImplementedError` for
  dict-obs envs; make `test_no_mapping_regression` actually exercise the
  MARL env it constructs.
- Docs: keep the `MOGymEnv` / `MOGymWrapper` pair adjacent in
  `envs_libraries.rst`; `__repr__` now uses `env_name=` to match the
  constructor kwarg.
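
The freeze, deep-copy, and warning behaviour described in the wrapper bullets above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `AgentSpec`, `AgentMapping`, `KNOWN_MAPPINGS`, `resolve_mapping`, the field names, and the `"two_agent_example"` entry are all hypothetical stand-ins for `MujocoPlaygroundAgentSpec`, `MujocoPlaygroundAgentMapping`, and `KNOWN_MARL_MAPPINGS`, whose real definitions are not shown in this conversation.

```python
import copy
import warnings
from dataclasses import dataclass, FrozenInstanceError


# Hypothetical stand-in for MujocoPlaygroundAgentSpec (real fields not shown here).
@dataclass(frozen=True)
class AgentSpec:
    name: str
    action_indices: list
    obs_indices: list


# Hypothetical stand-in for MujocoPlaygroundAgentMapping.
@dataclass(frozen=True)
class AgentMapping:
    agents: tuple


# Hypothetical module-level registry playing the role of KNOWN_MARL_MAPPINGS.
KNOWN_MAPPINGS = {
    "two_agent_example": AgentMapping(
        agents=(
            AgentSpec("agent_0", [0, 1, 2], [0, 1, 2, 3]),
            AgentSpec("agent_1", [3, 4, 5], [4, 5, 6, 7]),
        )
    )
}


def resolve_mapping(mapping):
    """Return an AgentMapping; string names are looked up in the registry."""
    if isinstance(mapping, str):
        # Warn because the registered indices may not match the target env's layout.
        warnings.warn(
            f"Mapping {mapping!r} was resolved from a predefined table; "
            "verify that its indices match this environment's observation layout.",
            UserWarning,
            stacklevel=2,
        )
        # frozen=True blocks attribute reassignment but not in-place list edits,
        # so hand out a deep copy to keep the registry entry untouched.
        return copy.deepcopy(KNOWN_MAPPINGS[mapping])
    return mapping
```

With this shape, a caller who appends to `resolve_mapping("two_agent_example").agents[0].action_indices` only changes their own copy, and assigning to a field such as `.name` raises `FrozenInstanceError`.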
@itwasabhi itwasabhi force-pushed the mujoco_playground0 branch from 15de4dd to 916baf0 May 21, 2026 21:26
@itwasabhi
Contributor Author

The failing mujoco_playground tests should now pass.

@itwasabhi
Contributor Author

@vmoens any other requested changes? I believe the remaining test failures are unrelated.

Labels

CI, CLA Signed, Documentation, Environments, Examples, Feature, Trainers

2 participants