Skip to content

[MGPU] Tests: make newton-only tests kitless (drop AppLauncher boot)#5883

Draft
hujc7 wants to merge 4 commits into
isaac-sim:developfrom
hujc7:jichuanh/kitless-newton-tests
Draft

[MGPU] Tests: make newton-only tests kitless (drop AppLauncher boot)#5883
hujc7 wants to merge 4 commits into
isaac-sim:developfrom
hujc7:jichuanh/kitless-newton-tests

Conversation

@hujc7
Copy link
Copy Markdown
Collaborator

@hujc7 hujc7 commented May 30, 2026

[MGPU] Tests: make newton-only tests kitless (drop AppLauncher boot)

Summary

  • Removes AppLauncher boot from 2 newton-only test files that already use the newton-aware fixture; they now run via SimulationContext's existing has_kit() gate without booting Kit / Isaac-Sim.
  • Replaces unguarded from omni.physx.scripts import utils in schemas.py with direct pxr.UsdPhysics.FixedJoint calls (was the bulk blocker: 95 fixed-base test failures across kitless newton tests routed through this line).
  • Replaces omni.usd.get_context().get_stage() in newton test_articulation with isaaclab.sim.utils.stage.get_current_stage for kitless compatibility.
  • Net effect for multi-GPU CI: newton test_articulation no longer boots Kit, so the upstream concurrent-Kit shutdown hang (#3475) does not apply to it. Its MULTI_GPU_SKIP_REASON marker in #5875 drops once this PR lands.

1. Problem

Newton-only tests inherited an AppLauncher(headless=True).app boot from the physx test template they were copy-pasted from (PR #4818, March 2026). Tests that drive newton through the newton-aware fixture don't actually need Kit — newton's SimulationContext path runs kitless when isaaclab.utils.version.has_kit() returns False — but the boilerplate forced Kit boot anyway, costing ~1:20 per file single-shot and exposing the suite to the Kit lifecycle SIGHUP/shutdown-hang bug under concurrent multi-GPU CI.

2. Scope

2.1 Files converted to kitless

File Lines changed Local 3-MIG result
source/isaaclab_newton/test/assets/test_articulation.py +7 -5 224 passed, 8 skipped, 1 xfailed
source/isaaclab_newton/test/assets/test_rigid_object_collection.py -8 158 passed, 8 skipped

Each file: removes AppLauncher import + simulation_app = AppLauncher(headless=True).app, replaces docstring with a rationale comment.

2.2 Files investigated but NOT converted

These were tested in kitless mode and kept Kit-booted because conversion would degrade coverage or hit deeper kitless gaps. Documented for follow-up; out of scope here.

File Why not
source/isaaclab/test/markers/test_visualization_markers.py 13 of 19 tests pass kitless; 6 ERROR at setup (ModuleNotFoundError: omni.physics) because they construct a physx-backed visualizer path through string_to_callable. Conversion costs 6 tests of coverage.
source/isaaclab_newton/test/sim/test_newton_schemas.py Fixture uses bare SimulationCfg(dt=0.1) which defaults to PhysxCfg. Adding physics=NewtonCfg(...) unblocks 9/13 tests; the remaining 5 fail on schema-routing ('NewtonMeshCollisionAPI' in ['PhysicsMeshCollisionAPI']). Needs schema-routing investigation.
source/isaaclab_contrib/test/deformable/test_deformable_object.py Fixture already uses NewtonCfg(solver_cfg=VBDSolverCfg(...)). Blocker is unguarded from omni.physx.scripts import deformableUtils at schemas.py:1213 and meshes.py:291. Making deformable bodies kitless requires reimplementing the helpers in newton/pxr.
source/isaaclab_contrib/test/deformable/test_rigid_deformable_coupling.py Same as above.

2.3 schemas.py kitless helper

source/isaaclab/isaaclab/sim/schemas/schemas.py: replaces unguarded from omni.physx.scripts import utils call (in modify_articulation_root_properties fixed-base joint creation path) with a local _create_fixed_joint_to_world() helper that uses pxr.UsdPhysics.FixedJoint directly. Was the single bulk blocker for kitless newton runs — all 95 fixed-base test failures routed through this line. Kept even though only 2 of 6 tests fully convert: the schemas.py path is universally invoked.

2.4 omni.usd → stage util

source/isaaclab_newton/test/assets/test_articulation.py: replaces import omni.usd; fix_reversed_joints(omni.usd.get_context().get_stage()) with from isaaclab.sim.utils.stage import get_current_stage; fix_reversed_joints(get_current_stage()).

3. Validation

3.1 Single-shot kitless

  • test_articulation.py on cuda:0: 224 passed, 0 failed (4:39, Kit boot count = 0).
  • Kit-booted baseline same file: 224/0 (5:59).
  • Delta: -1:20 (-22%) per file.

3.2 3-MIG concurrent (Horde L40, hardware-isolated)

  • 3 shards × 2 converted files: all PASSED, no SIGHUP, no Kit lifecycle bug.
  • test_articulation: 14:19 wall per shard under 3-way MIG contention (slower than single-shot since 3 parallel pytest processes share host CPU for warp compile).
  • test_rigid_object_collection: 2:25 wall per shard.

4. Follow-ups (not this PR)

  • Visualization markers kitless coverage: make the physx-backed visualizer path kitless-aware in isaaclab_physx.physics.physx_manager so the 6 currently-Kit-bound tests can also drop AppLauncher.
  • Schema-routing fix for test_newton_schemas: investigate why schemas.define_collision_properties applies the physx-side PhysicsMeshCollisionAPI instead of the Newton-side NewtonMeshCollisionAPI when the cfg is Newton-flavored.
  • Deformable kitless support: replace unguarded omni.physx.scripts.deformableUtils calls in schemas.py and meshes.py with kitless equivalents.

5. Why this is [MGPU]-tagged

The Kit-boot boilerplate doesn't matter for single-GPU CI (works fine, just slow). It matters for multi-GPU CI because concurrent Kit instances trigger the upstream shutdown hang on heavy tests like test_articulation. Companion to #5875 (multi-GPU CI infra), and independent of #5881 / #5886.

hujc7 added 2 commits May 30, 2026 22:39
Drop the module-level ``AppLauncher(headless=True).app`` boot in
``source/isaaclab_newton/test/assets/test_articulation.py``. The existing
``has_kit()`` gate in :class:`~isaaclab.sim.SimulationContext` carries the
test through the kitless path: newton tests run in pure-python + warp,
no Kit runtime, no Isaac Sim app process.

This sidesteps the Kit/Isaac-Sim concurrency lifecycle bug (newton SIGHUP
+ physx shutdown-hang on test_articulation under sustained 3+ concurrent
Kit instances on multi-GPU CI) by removing the Kit dependency entirely.
Also shaves ~30 s off per-file boot.

The kitless path exposed one unguarded ``omni.physx`` import in
``isaaclab/sim/schemas/schemas.py``: the fixed-base articulation root
joint creation called ``omni.physx.scripts.utils.createJoint``. Inline
it as ``_create_fixed_joint_to_world`` using :mod:`pxr.UsdPhysics` ops
directly so the same code path works with or without Kit loaded.

Scope intentionally limited to ``test_articulation.py`` + the one schemas
import. The other four newton test files
(``test_rigid_object.py``, ``test_rigid_object_collection.py``,
``test_newton_actuators_newton.py``, ``test_newton_schemas.py``) follow
the same pattern and can be converted in a follow-up once this one is
validated on CI.
Mirror test_articulation's pattern across five sibling tests that don't
need Kit: remove the AppLauncher boot at module top, replace docstring
with a rationale comment pointing back to test_articulation.

Files: test_visualization_markers, test_deformable_object,
test_rigid_deformable_coupling, test_rigid_object_collection,
test_newton_schemas. SimulationContext's existing has_kit() gate
already handles the kitless branch.

Also: re-targets the changelog fragment to schemas.py only (the only
user-visible change), demoting per-test changes to .skip in the newton
and contrib packages.
@github-actions github-actions Bot added the isaac-lab Related to Isaac Lab team label May 30, 2026
Copy link
Copy Markdown

@isaaclab-review-bot isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AI Review Summary

This PR makes newton-only tests "kitless" by removing the AppLauncher boot, reducing per-file test time by ~22% and avoiding Kit's SIGHUP lifecycle bug under concurrent multi-GPU CI.

Strengths

  1. Clean architectural approach - The key insight that SimulationContext already gates Kit-specific paths via has_kit() makes this change straightforward and minimally invasive.

  2. Well-documented _create_fixed_joint_to_world helper - The new helper function in schemas.py includes comprehensive docstrings explaining its purpose as a kitless equivalent of omni.physx.scripts.utils.createJoint. The logic mirrors the omni helper's "find first writable ancestor" walk for instanced/prototype prims.

  3. Consistent test file changes - All 6 test files follow the same pattern: remove AppLauncher boilerplate, add a docstring referencing test_articulation.py for rationale. This makes the codebase easier to maintain.

  4. Good changelog entry - The .rst changelog clearly explains what was fixed and why.

Suggestions & Questions

  1. Unique joint naming: The _create_fixed_joint_to_world function uses a while loop to find a unique FixedJoint name:

    while stage.GetPrimAtPath(f"{joint_base_path}/{joint_name}").IsValid():
        joint_name = f"FixedJoint_{idx:02d}"
        idx += 1

    Consider adding a reasonable upper bound (e.g., 1000 iterations) with a clear error message to prevent infinite loops in pathological cases, even if unlikely.

  2. Import placement: In test_articulation.py, the import of get_current_stage is done conditionally inside the if block:

    if any(name in usd_path for name in _REVERSED_JOINT_USD_FILES):
        from isaaclab.sim.utils.stage import get_current_stage

    This is fine for lazy loading, but if this pattern appears elsewhere or get_current_stage is used more broadly, consider moving it to module-level imports for consistency.

  3. Test coverage for the new helper: The PR mentions that kitless runs pass 224/0 tests cleanly, which is great validation. Is there (or should there be) a dedicated unit test for _create_fixed_joint_to_world that exercises edge cases like:

    • Articulation prims that are instance proxies
    • Prims at the root level (/)
    • Name collisions requiring the _01, _02 suffix logic
  4. Docstring wording: In test_articulation.py, the docstring mentions "shaves ~30s off per-file boot" but the PR body says the delta is 1:20 (-22%). Minor inconsistency worth harmonizing.

  5. Empty .skip files: The changelog .skip files (isaaclab_contrib/changelog.d/...skip, isaaclab_newton/changelog.d/...skip) are empty. Is this intentional to signal "no changelog needed for this package"? If so, a brief comment in the PR would help reviewers understand the convention.

Overall

This is a solid quality-of-life improvement for the CI pipeline. The changes are well-scoped, the helper function is correctly implemented using pxr.UsdPhysics primitives, and the validation numbers look good. The PR clearly explains the problem, scope, and non-scope.


This review was generated by an AI assistant. Please verify suggestions before implementing.


Update (commit d887cba): The new commit reverts the kitless conversion for 3 tests (test_newton_schemas, test_deformable_object, test_rigid_deformable_coupling) that need Kit because they resolve the physics backend via omni.physics. This is a sensible scope reduction — the commit message clearly explains why these files can't go kitless without switching to the newton-aware cfg pattern, which is out of scope.

The remaining 3 kitless files (test_articulation, test_rigid_object_collection, test_visualization_markers) are unchanged. No new issues introduced.

Previous inline comment status:

  • Ambiguous cross-reference nit in test_visualization_markers.py: not addressed in this commit (file untouched). Original comment stands — it's a minor nit and non-blocking.

Update (commit c2495f3): This commit reverts the kitless conversion for test_visualization_markers.py, restoring AppLauncher boot. The commit message clearly explains that 6 of 19 tests hit ModuleNotFoundError: No module named 'omni.physics' because the physx-backed visualizer path imports omni.physics.tensors. This is a sound scope reduction — keeping Kit boot for this file preserves all 19 tests of coverage.

Previous inline comment status:

  • Cross-reference nit in test_visualization_markers.py: ✅ Resolved (moot) — the kitless docstring containing the ambiguous reference has been removed entirely by this revert.

No new issues introduced. The PR now kitless-converts only test_articulation.py and test_rigid_object_collection.py, which is a well-scoped final state.


"""Rest everything follows."""
"""Kitless newton-only test: no AppLauncher boot. See test_articulation.py
for rationale (avoid Kit lifecycle SIGHUP/shutdown-hang under concurrent
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit (ambiguous cross-reference): This file is in isaaclab/test/markers/, not in isaaclab_newton/. The bare test_articulation.py reference is ambiguous since there are multiple files with that name across packages.

Consider using the full path for clarity:

"""Kitless newton-only test: no AppLauncher boot. See
isaaclab_newton/test/assets/test_articulation.py for rationale ..."""

The other cross-package references (test_deformable_object.py, test_newton_schemas.py) already use the full path — this one appears to be an oversight.

hujc7 added 2 commits May 30, 2026 23:32
Three of the six newton-only tests instantiate SimulationContext with the
default SimulationCfg, which resolves the physics backend through
``string_to_callable`` and binds to ``omni.physics``. Without Kit booted,
setup raises ``ModuleNotFoundError: No module named 'omni.physics'``.

Reverting test_newton_schemas, test_deformable_object, and
test_rigid_deformable_coupling to keep their AppLauncher boot. The three
files that work with the kitless path (test_articulation,
test_rigid_object_collection, test_visualization_markers) remain
converted. Making the failing three kitless requires switching them to
the newton-aware cfg pattern test_articulation uses, which is out of
scope here.
13 of 19 tests pass without Kit booted, but 6 (test_instantiation,
test_usd_marker, test_rendering_context_authors_visible_usd_point_instancer,
test_multiple_prototypes_marker, test_visualization_skips_updates_when_invisible,
test_newton_marker_backend_registers_and_updates_state_without_frame_capture)
ERROR at setup with ``ModuleNotFoundError: No module named 'omni.physics'``.
Those tests construct a physx-backed visualizer path through
``string_to_callable``, which imports ``omni.physics.tensors`` from
``isaaclab_physx.physics.physx_manager``.

Dropping AppLauncher costs the file 6 tests of coverage. Restoring Kit
boot keeps all 19. Until those 6 paths are made kitless-aware
(physics-manager selection should bypass the physx import when newton is
the active backend), this file must keep its AppLauncher.
hujc7 added a commit to hujc7/IsaacLab that referenced this pull request May 31, 2026
Restore the MULTI_GPU_SKIP_REASON marker on the physx variant only.
Newton test_articulation drops AppLauncher entirely via PR isaac-sim#5883, so it
runs cleanly under concurrent multi-GPU. The physx variant must still
boot Kit for omni.physics; under 3-shard concurrent CI runners (shared
GPU visibility) Kit's shutdown hangs >52s, causing SIGHUP cascade across
sibling shards and "Stage already attached" errors.

Cross-linked upstream at IsaacLab isaac-sim#3475 / OMPE-43816 (deferred past
Isaac Sim 5.0 per the engineering thread).
hujc7 added a commit to hujc7/IsaacLab that referenced this pull request May 31, 2026
Bundles the kitless conversion of newton test_articulation +
test_rigid_object_collection into the dynamic-sharding branch so the
multi-GPU CI workflow actually exercises a non-Kit-booted newton
test_articulation alongside the physx skip. Will rebase away when isaac-sim#5883
lands.

Includes the universal schemas.py fix
(``_create_fixed_joint_to_world`` replaces unguarded
``omni.physx.scripts.utils.createJoint``) and the .skip changelog
fragments for the test-only packages.
@hujc7 hujc7 changed the title [Tests] Make newton-only tests kitless (drop AppLauncher boot) [MGPU] Tests: make newton-only tests kitless (drop AppLauncher boot) Jun 2, 2026
hujc7 added a commit to hujc7/IsaacLab that referenced this pull request Jun 3, 2026
hujc7 added a commit to hujc7/IsaacLab that referenced this pull request Jun 3, 2026
After dropping the cherry-pick of the kitless newton conversion to keep
this PR scoped to CI infra, the newton variant of test_articulation
once again boots Kit at module level and is subject to the same
concurrent-Kit shutdown hang / SIGHUP cascade as the physx variant.

Restore the ``MULTI_GPU_SKIP_REASON`` marker on the newton variant so
the multi-GPU discover-step filter excludes it. The marker comment
points at isaac-sim#5883, which
removes the AppLauncher boot from this file and lets the kitless
SimulationContext path carry the test. After isaac-sim#5883 lands and this PR
rebases on develop, the marker can be dropped in the same commit that
re-enables it.

Both newton and physx test_articulation are now consistently skipped
from multi-GPU; both still run in single-GPU CI.
hujc7 added a commit to hujc7/IsaacLab that referenced this pull request Jun 3, 2026
Per per-PR minimum-needed analysis:
- isaac-sim#5886 (bounded shutdown) is closed (audit verdict nice-to-have;
  isaac-sim#5933 prevents the hang upstream so the force-exit timer is moot).
  Reverts SIGHUP handler + ISAACLAB_FORCE_EXIT_TIMEOUT timer in
  AppLauncher; drops the workflow env var.
- isaac-sim#5883 (kitless newton) kept open as a separate PR but left out of
  this diagnostic bundle to test whether isaac-sim#5933 alone is enough for
  newton test_articulation (which calls
  build_simulation_context(sim_cfg=, device=) at line 2427, so still
  needs isaac-sim#5881 for the cross-device kwarg fix). Reverts the newton
  test_articulation kitless conversion and the schemas.py
  _create_fixed_joint_to_world helper.

Bundle now contains: isaac-sim#5823 + isaac-sim#5875 base + isaac-sim#5881 + isaac-sim#5933 + the JUnit
XML path-collision fix in conftest. If green, confirms only 4 PRs
are needed for multi-GPU CI green (with test_articulation un-gated).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

isaac-lab Related to Isaac Lab team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant