[MGPU] Tests: make newton-only tests kitless (drop AppLauncher boot)#5883
[MGPU] Tests: make newton-only tests kitless (drop AppLauncher boot)#5883hujc7 wants to merge 4 commits into
Conversation
Drop the module-level ``AppLauncher(headless=True).app`` boot in ``source/isaaclab_newton/test/assets/test_articulation.py``. The existing ``has_kit()`` gate in :class:`~isaaclab.sim.SimulationContext` carries the test through the kitless path: newton tests run in pure-python + warp, no Kit runtime, no Isaac Sim app process. This sidesteps the Kit/Isaac-Sim concurrency lifecycle bug (newton SIGHUP + physx shutdown-hang on test_articulation under sustained 3+ concurrent Kit instances on multi-GPU CI) by removing the Kit dependency entirely. Also shaves ~30 s off per-file boot. The kitless path exposed one unguarded ``omni.physx`` import in ``isaaclab/sim/schemas/schemas.py``: the fixed-base articulation root joint creation called ``omni.physx.scripts.utils.createJoint``. Inline it as ``_create_fixed_joint_to_world`` using :mod:`pxr.UsdPhysics` ops directly so the same code path works with or without Kit loaded. Scope intentionally limited to ``test_articulation.py`` + the one schemas import. The other four newton test files (``test_rigid_object.py``, ``test_rigid_object_collection.py``, ``test_newton_actuators_newton.py``, ``test_newton_schemas.py``) follow the same pattern and can be converted in a follow-up once this one is validated on CI.
Mirror test_articulation's pattern across five sibling tests that don't need Kit: remove the AppLauncher boot at module top, replace docstring with a rationale comment pointing back to test_articulation. Files: test_visualization_markers, test_deformable_object, test_rigid_deformable_coupling, test_rigid_object_collection, test_newton_schemas. SimulationContext's existing has_kit() gate already handles the kitless branch. Also: re-targets the changelog fragment to schemas.py only (the only user-visible change), demoting per-test changes to .skip in the newton and contrib packages.
There was a problem hiding this comment.
AI Review Summary
This PR makes newton-only tests "kitless" by removing the AppLauncher boot, reducing per-file test time by ~22% and avoiding Kit's SIGHUP lifecycle bug under concurrent multi-GPU CI.
Strengths
-
Clean architectural approach - The key insight that
SimulationContextalready gates Kit-specific paths viahas_kit()makes this change straightforward and minimally invasive. -
Well-documented
_create_fixed_joint_to_worldhelper - The new helper function inschemas.pyincludes comprehensive docstrings explaining its purpose as a kitless equivalent ofomni.physx.scripts.utils.createJoint. The logic mirrors the omni helper's "find first writable ancestor" walk for instanced/prototype prims. -
Consistent test file changes - All 6 test files follow the same pattern: remove
AppLauncherboilerplate, add a docstring referencingtest_articulation.pyfor rationale. This makes the codebase easier to maintain. -
Good changelog entry - The
.rstchangelog clearly explains what was fixed and why.
Suggestions & Questions
-
Unique joint naming: The
_create_fixed_joint_to_worldfunction uses a while loop to find a uniqueFixedJointname:while stage.GetPrimAtPath(f"{joint_base_path}/{joint_name}").IsValid(): joint_name = f"FixedJoint_{idx:02d}" idx += 1
Consider adding a reasonable upper bound (e.g., 1000 iterations) with a clear error message to prevent infinite loops in pathological cases, even if unlikely.
-
Import placement: In
test_articulation.py, the import ofget_current_stageis done conditionally inside theifblock:if any(name in usd_path for name in _REVERSED_JOINT_USD_FILES): from isaaclab.sim.utils.stage import get_current_stage
This is fine for lazy loading, but if this pattern appears elsewhere or
get_current_stageis used more broadly, consider moving it to module-level imports for consistency. -
Test coverage for the new helper: The PR mentions that kitless runs pass 224/0 tests cleanly, which is great validation. Is there (or should there be) a dedicated unit test for
_create_fixed_joint_to_worldthat exercises edge cases like:- Articulation prims that are instance proxies
- Prims at the root level (
/) - Name collisions requiring the
_01,_02suffix logic
-
Docstring wording: In
test_articulation.py, the docstring mentions "shaves ~30s off per-file boot" but the PR body says the delta is 1:20 (-22%). Minor inconsistency worth harmonizing. -
Empty
.skipfiles: The changelog.skipfiles (isaaclab_contrib/changelog.d/...skip,isaaclab_newton/changelog.d/...skip) are empty. Is this intentional to signal "no changelog needed for this package"? If so, a brief comment in the PR would help reviewers understand the convention.
Overall
This is a solid quality-of-life improvement for the CI pipeline. The changes are well-scoped, the helper function is correctly implemented using pxr.UsdPhysics primitives, and the validation numbers look good. The PR clearly explains the problem, scope, and non-scope.
This review was generated by an AI assistant. Please verify suggestions before implementing.
Update (commit d887cba): The new commit reverts the kitless conversion for 3 tests (test_newton_schemas, test_deformable_object, test_rigid_deformable_coupling) that need Kit because they resolve the physics backend via omni.physics. This is a sensible scope reduction — the commit message clearly explains why these files can't go kitless without switching to the newton-aware cfg pattern, which is out of scope.
The remaining 3 kitless files (test_articulation, test_rigid_object_collection, test_visualization_markers) are unchanged. No new issues introduced.
Previous inline comment status:
- Ambiguous cross-reference nit in
test_visualization_markers.py: not addressed in this commit (file untouched). Original comment stands — it's a minor nit and non-blocking.
Update (commit c2495f3): This commit reverts the kitless conversion for test_visualization_markers.py, restoring AppLauncher boot. The commit message clearly explains that 6 of 19 tests hit ModuleNotFoundError: No module named 'omni.physics' because the physx-backed visualizer path imports omni.physics.tensors. This is a sound scope reduction — keeping Kit boot for this file preserves all 19 tests of coverage.
Previous inline comment status:
- Cross-reference nit in
test_visualization_markers.py: ✅ Resolved (moot) — the kitless docstring containing the ambiguous reference has been removed entirely by this revert.
No new issues introduced. The PR now kitless-converts only test_articulation.py and test_rigid_object_collection.py, which is a well-scoped final state.
|
|
||
| """Rest everything follows.""" | ||
| """Kitless newton-only test: no AppLauncher boot. See test_articulation.py | ||
| for rationale (avoid Kit lifecycle SIGHUP/shutdown-hang under concurrent |
There was a problem hiding this comment.
Nit (ambiguous cross-reference): This file is in isaaclab/test/markers/, not in isaaclab_newton/. The bare test_articulation.py reference is ambiguous since there are multiple files with that name across packages.
Consider using the full path for clarity:
"""Kitless newton-only test: no AppLauncher boot. See
isaaclab_newton/test/assets/test_articulation.py for rationale ..."""The other cross-package references (test_deformable_object.py, test_newton_schemas.py) already use the full path — this one appears to be an oversight.
Three of the six newton-only tests instantiate SimulationContext with the default SimulationCfg, which resolves the physics backend through ``string_to_callable`` and binds to ``omni.physics``. Without Kit booted, setup raises ``ModuleNotFoundError: No module named 'omni.physics'``. Reverting test_newton_schemas, test_deformable_object, and test_rigid_deformable_coupling to keep their AppLauncher boot. The three files that work with the kitless path (test_articulation, test_rigid_object_collection, test_visualization_markers) remain converted. Making the failing three kitless requires switching them to the newton-aware cfg pattern test_articulation uses, which is out of scope here.
13 of 19 tests pass without Kit booted, but 6 (test_instantiation, test_usd_marker, test_rendering_context_authors_visible_usd_point_instancer, test_multiple_prototypes_marker, test_visualization_skips_updates_when_invisible, test_newton_marker_backend_registers_and_updates_state_without_frame_capture) ERROR at setup with ``ModuleNotFoundError: No module named 'omni.physics'``. Those tests construct a physx-backed visualizer path through ``string_to_callable``, which imports ``omni.physics.tensors`` from ``isaaclab_physx.physics.physx_manager``. Dropping AppLauncher costs the file 6 tests of coverage. Restoring Kit boot keeps all 19. Until those 6 paths are made kitless-aware (physics-manager selection should bypass the physx import when newton is the active backend), this file must keep its AppLauncher.
Restore the MULTI_GPU_SKIP_REASON marker on the physx variant only. Newton test_articulation drops AppLauncher entirely via PR isaac-sim#5883, so it runs cleanly under concurrent multi-GPU. The physx variant must still boot Kit for omni.physics; under 3-shard concurrent CI runners (shared GPU visibility) Kit's shutdown hangs >52s, causing SIGHUP cascade across sibling shards and "Stage already attached" errors. Cross-linked upstream at IsaacLab isaac-sim#3475 / OMPE-43816 (deferred past Isaac Sim 5.0 per the engineering thread).
Bundles the kitless conversion of newton test_articulation + test_rigid_object_collection into the dynamic-sharding branch so the multi-GPU CI workflow actually exercises a non-Kit-booted newton test_articulation alongside the physx skip. Will rebase away when isaac-sim#5883 lands. Includes the universal schemas.py fix (``_create_fixed_joint_to_world`` replaces unguarded ``omni.physx.scripts.utils.createJoint``) and the .skip changelog fragments for the test-only packages.
… together" This reverts commit 6137b1b.
After dropping the cherry-pick of the kitless newton conversion to keep this PR scoped to CI infra, the newton variant of test_articulation once again boots Kit at module level and is subject to the same concurrent-Kit shutdown hang / SIGHUP cascade as the physx variant. Restore the ``MULTI_GPU_SKIP_REASON`` marker on the newton variant so the multi-GPU discover-step filter excludes it. The marker comment points at isaac-sim#5883, which removes the AppLauncher boot from this file and lets the kitless SimulationContext path carry the test. After isaac-sim#5883 lands and this PR rebases on develop, the marker can be dropped in the same commit that re-enables it. Both newton and physx test_articulation are now consistently skipped from multi-GPU; both still run in single-GPU CI.
Per per-PR minimum-needed analysis: - isaac-sim#5886 (bounded shutdown) is closed (audit verdict nice-to-have; isaac-sim#5933 prevents the hang upstream so the force-exit timer is moot). Reverts SIGHUP handler + ISAACLAB_FORCE_EXIT_TIMEOUT timer in AppLauncher; drops the workflow env var. - isaac-sim#5883 (kitless newton) kept open as a separate PR but left out of this diagnostic bundle to test whether isaac-sim#5933 alone is enough for newton test_articulation (which calls build_simulation_context(sim_cfg=, device=) at line 2427, so still needs isaac-sim#5881 for the cross-device kwarg fix). Reverts the newton test_articulation kitless conversion and the schemas.py _create_fixed_joint_to_world helper. Bundle now contains: isaac-sim#5823 + isaac-sim#5875 base + isaac-sim#5881 + isaac-sim#5933 + the JUnit XML path-collision fix in conftest. If green, confirms only 4 PRs are needed for multi-GPU CI green (with test_articulation un-gated).
[MGPU] Tests: make newton-only tests kitless (drop AppLauncher boot)
Summary
AppLauncherboot from 2 newton-only test files that already use the newton-aware fixture; they now run viaSimulationContext's existinghas_kit()gate without booting Kit / Isaac-Sim.from omni.physx.scripts import utilsinschemas.pywith directpxr.UsdPhysics.FixedJointcalls (was the bulk blocker: 95 fixed-base test failures across kitless newton tests routed through this line).omni.usd.get_context().get_stage()in newtontest_articulationwithisaaclab.sim.utils.stage.get_current_stagefor kitless compatibility.test_articulationno longer boots Kit, so the upstream concurrent-Kit shutdown hang (#3475) does not apply to it. ItsMULTI_GPU_SKIP_REASONmarker in #5875 drops once this PR lands.1. Problem
Newton-only tests inherited an
AppLauncher(headless=True).appboot from the physx test template they were copy-pasted from (PR #4818, March 2026). Tests that drive newton through the newton-aware fixture don't actually need Kit — newton's SimulationContext path runs kitless whenisaaclab.utils.version.has_kit()returns False — but the boilerplate forced Kit boot anyway, costing ~1:20 per file single-shot and exposing the suite to the Kit lifecycle SIGHUP/shutdown-hang bug under concurrent multi-GPU CI.2. Scope
2.1 Files converted to kitless
source/isaaclab_newton/test/assets/test_articulation.pysource/isaaclab_newton/test/assets/test_rigid_object_collection.pyEach file: removes
AppLauncherimport +simulation_app = AppLauncher(headless=True).app, replaces docstring with a rationale comment.2.2 Files investigated but NOT converted
These were tested in kitless mode and kept Kit-booted because conversion would degrade coverage or hit deeper kitless gaps. Documented for follow-up; out of scope here.
source/isaaclab/test/markers/test_visualization_markers.pyModuleNotFoundError: omni.physics) because they construct a physx-backed visualizer path throughstring_to_callable. Conversion costs 6 tests of coverage.source/isaaclab_newton/test/sim/test_newton_schemas.pySimulationCfg(dt=0.1)which defaults to PhysxCfg. Addingphysics=NewtonCfg(...)unblocks 9/13 tests; the remaining 5 fail on schema-routing ('NewtonMeshCollisionAPI' in ['PhysicsMeshCollisionAPI']). Needs schema-routing investigation.source/isaaclab_contrib/test/deformable/test_deformable_object.pyNewtonCfg(solver_cfg=VBDSolverCfg(...)). Blocker is unguardedfrom omni.physx.scripts import deformableUtilsatschemas.py:1213andmeshes.py:291. Making deformable bodies kitless requires reimplementing the helpers in newton/pxr.source/isaaclab_contrib/test/deformable/test_rigid_deformable_coupling.py2.3 schemas.py kitless helper
source/isaaclab/isaaclab/sim/schemas/schemas.py: replaces unguardedfrom omni.physx.scripts import utilscall (inmodify_articulation_root_propertiesfixed-base joint creation path) with a local_create_fixed_joint_to_world()helper that usespxr.UsdPhysics.FixedJointdirectly. Was the single bulk blocker for kitless newton runs — all 95 fixed-base test failures routed through this line. Kept even though only 2 of 6 tests fully convert: the schemas.py path is universally invoked.2.4 omni.usd → stage util
source/isaaclab_newton/test/assets/test_articulation.py: replacesimport omni.usd; fix_reversed_joints(omni.usd.get_context().get_stage())withfrom isaaclab.sim.utils.stage import get_current_stage; fix_reversed_joints(get_current_stage()).3. Validation
3.1 Single-shot kitless
test_articulation.pyon cuda:0: 224 passed, 0 failed (4:39, Kit boot count = 0).3.2 3-MIG concurrent (Horde L40, hardware-isolated)
test_articulation: 14:19 wall per shard under 3-way MIG contention (slower than single-shot since 3 parallel pytest processes share host CPU for warp compile).test_rigid_object_collection: 2:25 wall per shard.4. Follow-ups (not this PR)
isaaclab_physx.physics.physx_managerso the 6 currently-Kit-bound tests can also drop AppLauncher.test_newton_schemas: investigate whyschemas.define_collision_propertiesapplies the physx-sidePhysicsMeshCollisionAPIinstead of the Newton-sideNewtonMeshCollisionAPIwhen the cfg is Newton-flavored.omni.physx.scripts.deformableUtilscalls inschemas.pyandmeshes.pywith kitless equivalents.5. Why this is
[MGPU]-taggedThe Kit-boot boilerplate doesn't matter for single-GPU CI (works fine, just slow). It matters for multi-GPU CI because concurrent Kit instances trigger the upstream shutdown hang on heavy tests like
test_articulation. Companion to #5875 (multi-GPU CI infra), and independent of #5881 / #5886.