refactor(cli): reduce onboard shell implementation by cv · Pull Request #3919 · NVIDIA/NemoClaw

cv · 2026-05-20T18:38:54Z

Summary

Continue the #3802 onboarding FSM cleanup by moving low-risk helper groups out of src/lib/onboard.ts into focused onboard modules. This keeps the live onboarding behavior unchanged while making src/lib/onboard.ts more clearly a CLI shell/orchestration layer.

Related Issue

Refs #3802
Stacked on #3883

Changes

Extracted Model Router lifecycle/profile helpers into src/lib/onboard/model-router.ts.
Extracted dashboard forwarding, gateway-token download, and dashboard printing helpers into src/lib/onboard/dashboard.ts.
Extracted runtime boundary glue, session-update normalization, sandbox-agent/name helpers, messaging config helpers, resume conflict checks, OpenShell version helpers, known-hosts pruning, and gateway reuse helpers into focused modules under src/lib/onboard/.
Reduced src/lib/onboard.ts by more than 1,300 lines while preserving existing exports and orchestration wiring.
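
One way to read "preserving existing exports and orchestration wiring" is as a factory-plus-destructure arrangement, which resembles the `const { … } =` lines visible in the review diff. The sketch below is a minimal illustration under that assumption. `GatewayDeps`, `createGatewayHelpers`, `listSandboxes`, and the argument values are hypothetical, and the body of `sandboxExistsInGateway` is invented for the example rather than taken from the PR.

```typescript
// Hypothetical dependency surface that the shell would hand to an extracted module.
interface GatewayDeps {
  // Stand-in for a command runner; returns captured stdout as a string.
  runCapture: (args: string[]) => string;
}

// Hypothetical factory that could live in a module under src/lib/onboard/.
// It closes over its dependencies instead of importing the shell.
function createGatewayHelpers(deps: GatewayDeps) {
  function listSandboxes(): string[] {
    // The argument list is a placeholder, not a real OpenShell subcommand.
    return deps
      .runCapture(["placeholder-list-command"])
      .split("\n")
      .map((line) => line.trim())
      .filter((line) => line.length > 0);
  }

  function sandboxExistsInGateway(name: string): boolean {
    return listSandboxes().some((sandbox) => sandbox === name);
  }

  return { listSandboxes, sandboxExistsInGateway };
}

// Shell side: wire dependencies once, then destructure the helpers under
// their existing names so call sites in the shell do not change.
const { listSandboxes, sandboxExistsInGateway } = createGatewayHelpers({
  runCapture: () => "alpha\nbeta\n", // fake runner, for illustration only
});
```

With the fake runner, `sandboxExistsInGateway("alpha")` returns `true` and `sandboxExistsInGateway("gamma")` returns `false`. Injecting the runner also lets a unit test exercise a helper without a live gateway.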

Type of Change

Code change (feature, bug fix, or refactor)
Code change with doc updates
Doc only (prose changes, no code sample modifications)
Doc only (includes code sample changes)

Verification

npx prek run --all-files passes
npm test passes
Tests added or updated for new or changed behavior
No secrets, API keys, or credentials committed
Docs updated for user-facing behavior changes
make docs builds without warnings (doc changes only)
Doc pages follow the style guide (doc changes only)
New doc pages include SPDX header and frontmatter (new pages only)

Signed-off-by: Carlos Villela <cvillela@nvidia.com>

copy-pr-bot · 2026-05-20T18:38:58Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.


coderabbitai · 2026-05-20T18:39:12Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 85bee941-d9ec-4c6a-9025-55eb6f90f4ab

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


github-actions · 2026-05-20T18:43:17Z

E2E Advisor Recommendation

Required E2E: cloud-e2e, double-onboard-e2e, onboard-resume-e2e, onboard-negative-paths-e2e, messaging-providers-e2e, credential-migration-e2e, inference-routing-e2e, hermes-e2e, model-router-provider-routed-inference-e2e
Optional E2E: cloud-onboard-e2e, brave-search-e2e, gpu-e2e, openclaw-inference-switch-e2e, macos-e2e, wsl-e2e

Dispatch hint: cloud-e2e,double-onboard-e2e,onboard-resume-e2e,onboard-negative-paths-e2e,messaging-providers-e2e,credential-migration-e2e,inference-routing-e2e,hermes-e2e

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/refactor/3802-14-final-transitions
Head: HEAD
Confidence: high

Required E2E

cloud-e2e (high; live cloud inference and Docker/OpenShell sandbox, timeout 45 minutes): Required high-signal coverage for the complete OpenClaw user journey after the onboarding refactor: install, non-interactive onboard, sandbox creation, gateway route configuration, dashboard/sandbox health, and live NVIDIA Endpoint inference.
double-onboard-e2e (high; multiple sequential sandbox creations, timeout 90 minutes): Required because sandbox/gateway reuse, sandbox lifecycle, registry reconciliation, stale state repair, and repeated onboarding behavior were refactored into new modules.
onboard-resume-e2e (medium; Docker/OpenShell sandbox with live credential, timeout 60 minutes): Required because resume-config, session-updates, sandbox reuse, and lifecycle helpers changed. This validates interrupted onboard state can resume without corrupting gateway/sandbox/credential state.
onboard-negative-paths-e2e (medium-high; several non-interactive edge cases, timeout 75 minutes): Required because prompt helpers, provider recovery, validation recovery, remediation, requireValue, and non-interactive validation paths were moved. This catches regressions in friendly failures and policy/provider/model validation behavior.
messaging-providers-e2e (high; full sandbox plus messaging provider chain, timeout 75 minutes): Required because messaging config and messaging credential rotation/reuse code changed. This validates provider creation, placeholder rewriting, channel config, and credential isolation for Telegram/Discord/Slack/WeChat/WhatsApp flows.
credential-migration-e2e (medium; credential-state focused, timeout 30 minutes): Required because onboarding still contains credential migration state and the PR touches credential-adjacent modules. This validates legacy credential migration, secure cleanup, allowlisting, and symlink-safe deletion.
inference-routing-e2e (medium; PR-safe cases plus optional real-provider checks, timeout 30 minutes): Required because inference selection validation and model-router/provider recovery paths were refactored. This validates provider route setup, credential isolation, invalid-key classification, transport classification, and compatible endpoint behavior.
hermes-e2e (high; live cloud inference and Hermes sandbox, timeout 60 minutes): Required because agent selection, Hermes auth, Hermes managed tools, sandbox agent, and OpenClaw setup modules changed. This validates the Hermes install/onboard/health/live-inference user flow.
model-router-provider-routed-inference-e2e (medium-high; routed provider sandbox with live NVIDIA key, timeout 45 minutes): Required because src/lib/onboard/model-router.ts is newly extracted and wired into onboard. This regression test specifically validates provider-routed Model Router onboarding produces a working inference.local route instead of a post-onboard 503.

Optional E2E

cloud-onboard-e2e (high; public install path and live cloud sandbox, timeout 45 minutes): Optional but useful because the public installer plus onboard flow also verifies Landlock/read-only enforcement, API key leak checks, inference.local HTTPS, and policy presets after broad onboard rewiring.
brave-search-e2e (medium-high; requires BRAVE_API_KEY and NVIDIA_API_KEY, timeout 45 minutes): Optional because web-search-flow changed. Run if Brave Search secrets are available to validate credential validation, network policy preset application, no secret leakage, and in-sandbox web search.
gpu-e2e (high; GPU runner and Ollama model pull, timeout 30 minutes): Optional because sandbox-gpu-preflight changed. Run on a GPU runner if the PR needs confidence in local Ollama/GPU onboarding and sandbox GPU preflight behavior.
openclaw-inference-switch-e2e (high; sandbox plus route switch validation, timeout 45 minutes): Optional adjacent coverage for inference route persistence and OpenClaw runtime config after provider/model routing helpers changed.
macos-e2e (high; macOS runner with Docker availability gate, timeout 30 minutes): Optional cross-platform confidence because OpenShell CLI resolution, version checks, dashboard/gateway lifecycle, and path/runtime helpers can behave differently on macOS.
wsl-e2e (very high; Windows/WSL setup and Docker availability gate, timeout 90 minutes): Optional cross-platform confidence because onboarding, OpenShell install/resolution, Docker gateway lifecycle, and path handling can differ inside WSL.

New E2E recommendations

onboarding-module-extraction-parity (medium): Existing E2E covers many happy/negative paths, but there is no focused scenario that exercises interactive navigation/back/exit choices across the newly extracted prompt-helper, provider-recovery, validation-recovery, and agent-selection modules.
- Suggested test: Add a hermetic expect-style onboarding scenario that drives interactive back/exit/retry navigation through provider selection, model validation recovery, Hermes auth choice, and web-search prompts without requiring real external services.
hermes-auth-method-matrix (medium): Hermes E2E validates the Hermes user flow, but the extracted hermes-auth module introduces distinct OAuth vs Nous API-key selection paths that are not clearly covered as a matrix.
- Suggested test: Add Hermes onboarding E2E or scenario-suite coverage for NEMOCLAW_HERMES_AUTH_METHOD=oauth and api-key, including invalid method failure and API-key staging without credential leakage.
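
The suggested auth-method matrix could be expressed as a small table-driven check. The sketch below is illustrative only: `parseHermesAuthMethod`, `outcome`, and `matrix` are hypothetical names, and the fallback to `oauth` when the variable is unset and the case-insensitive matching are assumptions, not behavior confirmed by the extracted hermes-auth module.

```typescript
type HermesAuthMethod = "oauth" | "api-key";

// Hypothetical parser for NEMOCLAW_HERMES_AUTH_METHOD.
// Assumptions: unset or empty falls back to "oauth"; matching ignores case.
function parseHermesAuthMethod(raw: string | undefined): HermesAuthMethod {
  const value = (raw ?? "").trim().toLowerCase();
  if (value === "") return "oauth";
  if (value === "oauth" || value === "api-key") return value;
  throw new Error(`Unsupported NEMOCLAW_HERMES_AUTH_METHOD: ${JSON.stringify(raw)}`);
}

// Collapse a parse attempt into either the method or the marker "error".
function outcome(raw: string | undefined): HermesAuthMethod | "error" {
  try {
    return parseHermesAuthMethod(raw);
  } catch {
    return "error";
  }
}

// Matrix rows: input value, then the expected method or "error".
const matrix: Array<[string | undefined, HermesAuthMethod | "error"]> = [
  ["oauth", "oauth"],
  ["api-key", "api-key"],
  ["API-KEY", "api-key"],
  ["password", "error"],
  [undefined, "oauth"],
];

// Rows whose actual outcome differs from the expectation.
const failures = matrix.filter(([input, expected]) => outcome(input) !== expected);
```

An empty `failures` array means every row behaved as expected. A real scenario would also need to assert that API-key staging leaks no credential, which this sketch does not cover.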

Dispatch hint

Workflow: nightly-e2e.yaml
jobs input: cloud-e2e,double-onboard-e2e,onboard-resume-e2e,onboard-negative-paths-e2e,messaging-providers-e2e,credential-migration-e2e,inference-routing-e2e,hermes-e2e
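
Comparing the Required E2E list with the `jobs` input shows that one required job is absent from the hint. The thread does not say whether that omission is intentional. A minimal sketch of the comparison, using only the job names quoted above (`requiredE2E`, `dispatchHint`, and `missingFromDispatch` are illustrative names):

```typescript
// Required E2E jobs as listed by the advisor.
const requiredE2E: string[] = [
  "cloud-e2e",
  "double-onboard-e2e",
  "onboard-resume-e2e",
  "onboard-negative-paths-e2e",
  "messaging-providers-e2e",
  "credential-migration-e2e",
  "inference-routing-e2e",
  "hermes-e2e",
  "model-router-provider-routed-inference-e2e",
];

// The comma-separated jobs input from the dispatch hint.
const dispatchHint: string =
  "cloud-e2e,double-onboard-e2e,onboard-resume-e2e,onboard-negative-paths-e2e," +
  "messaging-providers-e2e,credential-migration-e2e,inference-routing-e2e,hermes-e2e";

// Required jobs that the hint would not dispatch.
const dispatched = new Set(dispatchHint.split(","));
const missingFromDispatch = requiredE2E.filter((job) => !dispatched.has(job));
```

Here `missingFromDispatch` contains the single entry `model-router-provider-routed-inference-e2e`, so anyone dispatching from the hint may want to add that job by hand.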

wscurran · 2026-05-21T14:41:38Z

✨

Related open PRs:

#3883 refactor(cli): route final machine transitions

Related open issues:

#3802 Umbrella: refactor onboarding into a serializable FSM

cv added 7 commits May 20, 2026 11:05

refactor(cli): extract onboard shell helpers

c90747b

refactor(cli): extract sandbox agent helpers

ce1a645

refactor(cli): extract messaging config helpers

7a07d8c

refactor(cli): extract resume conflict helpers

9d92891

refactor(cli): extract openshell version helpers

df8a52e

refactor(cli): extract known hosts pruning

fcb3e36

refactor(cli): extract gateway reuse helpers

cdd19fd

cv self-assigned this May 20, 2026

github-advanced-security AI found potential problems May 20, 2026

Comment thread src/lib/onboard.ts Fixed

Comment thread src/lib/onboard.ts Fixed

Comment thread src/lib/onboard.ts Fixed

Potential fix for pull request finding 'CodeQL / Unused variable, import, function or class'

9ba83f7

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

cv and others added 9 commits May 20, 2026 11:43

Apply suggestions from code review

0201b4d

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

refactor(cli): extract sandbox reuse helpers

8d74472

refactor(cli): extract messaging credential helpers

c47e5b8

refactor(cli): extract sandbox registry metadata helpers

6668902

refactor(cli): extract openclaw setup helper

a485eec

refactor(cli): extract sandbox name prompt

46039e1

refactor(cli): move telegram mention helper

3f6e041

refactor(cli): extract onboard base image helpers

534f0d8

refactor(cli): extract prompt helpers

cd29f01

cv added the v0.0.47 Release target label May 20, 2026

refactor(cli): extract sandbox gpu preflight helpers

3fe2205

github-actions Bot mentioned this pull request May 20, 2026

feat(onboard): show managed vLLM by default on DGX Spark and Station #3921

Merged

12 tasks

cv added 5 commits May 20, 2026 14:06

refactor(cli): extract remediation helpers

5545222

refactor(cli): extract provider recovery helpers

b0734c5

refactor(cli): move Hermes tool gateway normalization

5afd680

refactor(cli): move affirmative prompt helper

ae593a8

refactor(cli): extract sandbox lifecycle helpers

3b270a0

github-advanced-security AI found potential problems May 20, 2026

Comment thread src/lib/onboard.ts

-  }
-  return null;
-}
+const { readLiveInference, readRecordedProvider, readRecordedNimContainer, readRecordedModel } =

refactor(cli): extract openshell CLI helpers

e5503b4

github-advanced-security AI found potential problems May 20, 2026

Comment thread src/lib/onboard.ts

+  openshellArgv,
+  runOpenshell,
+  runCaptureOpenshell,
+  safeOpenShellArgument,

Comment thread src/lib/onboard.ts

-  return Boolean(fetchGatewayAuthTokenFromSandbox(sandboxName));
-}
+const {
+  sandboxExistsInGateway,

github-actions Bot mentioned this pull request May 20, 2026

fix(onboard): refresh provider state on agent changes #3857

Merged

cv added 6 commits May 20, 2026 17:41

refactor(cli): move prompt navigation helpers

bceab13

refactor(cli): extract Hermes auth method helpers

973ff5b

refactor(cli): extract Hermes auth flow helpers

48e8053

refactor(cli): extract onboard agent selection

b217981

refactor(cli): extract require value helper

9e442d1

refactor(cli): move onboard step banner helper

58f38f7

This was referenced May 21, 2026

fix(onboard): reject host.docker.internal inference URLs #3804

Open

fix(onboard): early-validate NEMOCLAW_POLICY_TIER before preflight (#3741) #3788

Open

refactor(cli): extract validation recovery prompts

7b2d0de

github-actions Bot mentioned this pull request May 21, 2026

fix(onboard): detect Windows-host Ollama via process probe #3969

Merged

12 tasks

cv added v0.0.49 Release target and removed v0.0.47 Release target labels May 21, 2026

This was referenced May 21, 2026

fix(onboard): fail fast in preflight when all dashboard ports are occupied (#3953) #3980

Open

fix(onboard): suppress 'No active forward found' from best-effort forward stop #3997

Merged

wscurran added the NemoClaw CLI and refactor labels May 21, 2026

github-actions Bot mentioned this pull request May 21, 2026

fix(onboard): use NVIDIA runtime for Jetson sandbox GPU #4008

Open

refactor(cli): remove duplicate onboard sleep helper

eef5254

cv added v0.0.50 Release target and removed v0.0.49 Release target labels May 21, 2026

cv added 3 commits May 21, 2026 18:57

refactor(cli): extract web search flow helpers

5bfa612

refactor(cli): extract inference selection validation

826d82a

refactor(cli): move direct sandbox gpu verifier

a7fe203

cv mentioned this pull request May 22, 2026

Umbrella: refactor onboarding into a serializable FSM #3802

Open

cv wants to merge 35 commits into refactor/3802-14-final-transitions from refactor/3802-15-model-router-module

3 participants
