
ci: nightly job for inference test using Ollama on GPU runners #839

Merged
jayavenkatesh19 merged 1 commit into NVIDIA:main from jayavenkatesh19:ci/nightly-gpu-e2e
Mar 25, 2026

Conversation

jayavenkatesh19 (Contributor) commented Mar 24, 2026

Summary

Add a gpu-e2e job to the nightly E2E workflow that tests the Ollama local inference path on a GPU-enabled self-hosted runner, and rename the existing job from full-e2e to cloud-e2e to clearly distinguish the two inference paths.

What it tests

The test follows the real user flow documented in the README:

  1. Install Ollama binary
  2. Run install.sh --non-interactive with NEMOCLAW_PROVIDER=ollama
  3. Onboard starts Ollama (OLLAMA_HOST=0.0.0.0:11434), pulls model, creates sandbox
  4. Verify inference (sketched after this list): direct Ollama + sandbox → OpenShell gateway → host.openshell.internal:11434 → Ollama on GPU
  5. Destroy sandbox + uninstall with --delete-models
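
A hedged sketch of the verification in step 4, assuming Ollama's OpenAI-compatible endpoint; the exact payload and assertion in the real script may differ, and "PONG" plus the endpoints are taken from the walkthrough below:

# Direct check against Ollama on the host via its OpenAI-compatible endpoint.
curl -sf http://localhost:11434/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model":"nemotron-3-nano:30b","messages":[{"role":"user","content":"Reply with exactly PONG"}]}' \
  | grep -q 'PONG'

# The in-sandbox check sends the same request over HTTPS through the OpenShell
# gateway to host.openshell.internal:11434, which routes back to Ollama on the GPU.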

Design

  • cloud-e2e and gpu-e2e run in parallel within nightly-e2e.yaml
  • gpu-e2e is gated by the GPU_E2E_ENABLED repository variable and silently skipped when unset
  • Does not require NVIDIA_API_KEY, since this tests local-only inference
  • Handles reasoning models (nemotron-3-nano) that return output in the reasoning field (see the sketch after this list)
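
For the reasoning-model bullet, a minimal parsing sketch; the assumption that the field lives at choices[0].message.reasoning is mine, based on this PR's description:

# $RESPONSE holds the JSON body from a /v1/chat/completions call (see above).
# Accept the answer from either `content` or the `reasoning` field, since
# reasoning models such as nemotron-3-nano may place output in `reasoning`.
ANSWER=$(printf '%s' "$RESPONSE" | python3 -c '
import json, sys
msg = json.load(sys.stdin)["choices"][0]["message"]
print(msg.get("content") or msg.get("reasoning") or "")
')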

Files

File | Description
.github/workflows/nightly-e2e.yaml | Rename full-e2e → cloud-e2e; add gpu-e2e job with repo variable gate
test/e2e/test-gpu-e2e.sh | GPU E2E test script (Ollama local inference lifecycle)

Enabling the GPU job

After merging, a repo admin needs to:

  1. Register a self-hosted runner with the self-hosted label and an NVIDIA GPU (instructions are documented on an internal site)
  2. Set the repository variable GPU_E2E_ENABLED to true (Settings → Secrets and variables → Actions → Variables), or use the gh CLI as sketched below
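
For step 2, the variable can also be set from the command line; a sketch assuming the GitHub CLI is installed and authenticated with admin access to the repo:

# Set the repository variable that gates the gpu-e2e job.
gh variable set GPU_E2E_ENABLED --body true --repo NVIDIA/NemoClaw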

Testing

I tested this by creating my own self-hosted runner with an L40S GPU. I used the nemotron-3-nano model, though depending on the GPU available it could be swapped for a smaller model (such as qwen2.5:0.5b). I verified all of the following (a standalone-run sketch follows this list):

  • Ollama installs and starts via onboard (OLLAMA_HOST=0.0.0.0:11434)
  • nemotron-3-nano:30b model pulls and responds to inference
  • Sandbox inference routes through OpenShell gateway to local Ollama
  • Destroy and uninstall --delete-models clean up correctly
  • Job skips gracefully when GPU_E2E_ENABLED is not set
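
For reference, a sketch of a standalone run on a GPU box, using the env vars the workflow's gpu-e2e job sets (quoted later in this thread); NEMOCLAW_MODEL is optional because the script can auto-detect a pulled model:

# Assumes Docker, an NVIDIA GPU (nvidia-smi), and a checkout of the repo.
export NEMOCLAW_NON_INTERACTIVE=1
export NEMOCLAW_SANDBOX_NAME=e2e-gpu-ollama
export NEMOCLAW_PROVIDER=ollama
export NEMOCLAW_MODEL=nemotron-3-nano:30b   # optional; auto-detected if unset
./test/e2e/test-gpu-e2e.sh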

Summary by CodeRabbit

  • Tests
    • Nightly E2E split into cloud and GPU jobs; cloud job preserved and renamed for clarity.
    • New conditional GPU job runs on self-hosted runners, verifies GPU availability, and uploads separate install/test failure artifacts.
    • Added a comprehensive GPU end-to-end test that installs prerequisites, validates sandbox and local GPU inference provider, runs inference checks, and performs cleanup.

jayavenkatesh19 requested a review from cv on Mar 24, 2026 at 23:20
jayavenkatesh19 self-assigned this on Mar 24, 2026
coderabbitai (Bot) commented Mar 24, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Split nightly E2E into cloud and conditional self-hosted GPU jobs; renamed cloud job, added a gated gpu-e2e job, and introduced a new GPU E2E script that installs NemoClaw, verifies Ollama/GPU, performs inference checks, and tears down resources.

Changes

Cohort / File(s) | Summary

Workflow: .github/workflows/nightly-e2e.yaml
Renamed jobs.full-e2e → jobs.cloud-e2e, updated the step name, scoped NVIDIA_API_KEY to cloud jobs, and added a new conditional jobs.gpu-e2e (runs on self-hosted, gated by github.repository == 'NVIDIA/NemoClaw' && vars.GPU_E2E_ENABLED == 'true'). The GPU job verifies the GPU (nvidia-smi, docker info), runs the GPU test script, and uploads gpu-e2e-install-log and gpu-e2e-test-log artifacts on failure.

GPU E2E Script: test/e2e/test-gpu-e2e.sh
Added a new executable Bash script: strict mode, logging to /tmp/nemoclaw-gpu-e2e-test.log, cleanup traps, prereq checks (Docker, NVIDIA GPU, ports, env), installs/stops Ollama if needed, runs install.sh --non-interactive (writes /tmp/nemoclaw-gpu-e2e-install.log), validates the sandbox/gateway and Ollama, performs two "PONG" inference checks (direct Ollama HTTP and in-sandbox HTTPS), tears down/uninstalls, and exits nonzero on failures. A skeleton is sketched below.
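
A minimal sketch of that skeleton (strict mode, log tee, cleanup trap); the actual checks and teardown commands are elided:

#!/usr/bin/env bash
set -uo pipefail   # strict-ish mode; see the later review note about adding -e

TEST_LOG=/tmp/nemoclaw-gpu-e2e-test.log
exec > >(tee -a "$TEST_LOG") 2>&1   # mirror all output into the test log

cleanup() {
  # Best-effort teardown of sandbox/gateway; errors are tolerated so cleanup
  # never masks the test result. Real commands elided.
  true
}
trap cleanup EXIT

# ... prereq checks (Docker, NVIDIA GPU, ports, env), install, inference checks ...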

Sequence Diagram(s)

sequenceDiagram
  participant GH as GitHub Actions
  participant Runner as Self-hosted GPU Runner
  participant Docker as Docker Engine
  participant Ollama as Ollama Service
  participant NemoClaw as NemoClaw CLI
  participant Openshell as Openshell Gateway

  GH->>Runner: trigger gpu-e2e (conditional)
  Runner->>Docker: verify Docker & GPU (nvidia-smi, docker info)
  Runner->>Ollama: install/stop/start Ollama
  Runner->>NemoClaw: run install.sh --non-interactive
  NemoClaw->>Openshell: create sandbox & gateway
  Runner->>Ollama: POST /v1/chat/completions -> expect "PONG"
  Runner->>Openshell: ssh to sandbox -> HTTPS /v1/chat/completions -> expect "PONG"
  Runner->>NemoClaw: destroy sandbox
  Runner->>Openshell: destroy gateway
  Runner->>GH: upload gpu-e2e logs on failure

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hopped into CI with a curious grin,

sandboxes hummed and Ollama did spin,
GPUs blinked, tests shouted "PONG" with glee,
logs curled up neat beneath the old tree,
I bounded off happy — green as can be.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 28.57%, below the required threshold of 80.00%. Resolution: write docstrings for the functions that are missing them.

✅ Passed checks (2 passed)
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title accurately summarizes the main change (adding a nightly GPU E2E CI job for Ollama inference testing), which aligns with the core objectives of adding the gpu-e2e job and renaming full-e2e to cloud-e2e.


cv (Contributor) left a comment:

good to go after lint fixes

coderabbitai (Bot) left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/nightly-e2e.yaml:
- Around line 104-110: The GitHub Actions step "Upload Ollama log on failure"
currently uploads artifact name "gpu-e2e-ollama-log" from path
"/tmp/ollama-serve.log" which is never created by the test script or onboarding
flow; either remove that upload step from .github/workflows/nightly-e2e.yaml
(the entire step with name "Upload Ollama log on failure") or modify the E2E
setup so the Ollama process logs are captured to /tmp/ollama-serve.log (or
change the artifact path) by updating test/e2e/test-gpu-e2e.sh or the onboarding
flow to redirect Ollama stdout/stderr into that file before tests run.

In `@test/e2e/test-gpu-e2e.sh`:
- Around line 88-92: The workflow expects an Ollama log artifact named
"gpu-e2e-ollama-log" but this script (variables SANDBOX_NAME, TEST_LOG,
INSTALL_LOG and the global redirection via exec > >(tee -a "$TEST_LOG") 2>&1)
never creates that Ollama-specific log file, so either remove the
gpu-e2e-ollama-log artifact from the workflow or add an Ollama log capture here:
introduce an OLLAMA_LOG variable and ensure the onboarding/start command for
Ollama redirects its stdout/stderr into that file (or tee it alongside TEST_LOG)
so the artifact exists for upload.
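
Either fix amounts to capturing Ollama's server output before the tests run, sketched here under the assumption that the script (rather than the onboarding flow) starts Ollama:

# Capture Ollama server output so the gpu-e2e-ollama-log artifact exists.
OLLAMA_LOG=/tmp/ollama-serve.log
OLLAMA_HOST=0.0.0.0:11434 ollama serve >"$OLLAMA_LOG" 2>&1 &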

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5e0f0c20-8e86-4c53-a646-652e2421ea12

📥 Commits

Reviewing files that changed from the base of the PR and between bb8ba78 and 8a56688.

📒 Files selected for processing (2)
  • .github/workflows/nightly-e2e.yaml
  • test/e2e/test-gpu-e2e.sh

coderabbitai (Bot) left a comment

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/nightly-e2e.yaml:
- Around line 61-64: The gpu-e2e job currently uses a generic runs-on:
self-hosted which can land on any self-hosted runner; update the gpu-e2e job
(and the other GPU job block referenced at lines 74-83) to use an explicit label
array such as runs-on: [self-hosted, linux, x64, gpu] (or your organization’s
equivalent labels) so the scheduler only selects runners with Linux + NVIDIA GPU
capabilities; locate the gpu-e2e job definition and replace the scalar runs-on
value with the label array to enforce the required runner properties.

In `@test/e2e/test-gpu-e2e.sh`:
- Around line 274-282: The test currently infers the model from the first entry
in the /api/tags response (CONFIGURED_MODEL) which can pick up leftover models;
instead require the job to set NEMOCLAW_MODEL and validate that tag exists in
/api/tags before proceeding: read NEMOCLAW_MODEL (fall back only to a
conservative no-op), call the same curl+python check to search the models list
for an entry whose 'name' equals "$NEMOCLAW_MODEL", fail the test (with an
error/info message) if not found, and only then use that exact tag for
subsequent inference requests (update references to CONFIGURED_MODEL to this
validated value).
- Around line 23-31: The script currently allows running with only
NEMOCLAW_NON_INTERACTIVE set and can silently use the default provider; enforce
the Ollama/GPU path by ensuring NEMOCLAW_PROVIDER is set to "ollama" (or
validating it) early in the script: in the initialization block that reads
environment variables (refer to NEMOCLAW_NON_INTERACTIVE and the later provider
handling around the onboarding/uninstall logic), either set
NEMOCLAW_PROVIDER="ollama" when absent or exit with a clear error if
NEMOCLAW_PROVIDER is not "ollama"; update any code that later branches on
NEMOCLAW_PROVIDER so the script cannot proceed under a non-ollama provider when
run standalone.
- Around line 193-218: The script exits early on install failure and never
performs best-effort teardown, leaving sandboxes/models/gateways behind; add a
cleanup function (e.g., cleanup) that performs best-effort teardown of
sandboxes, gateways, and temporary artifacts and register it with trap 'cleanup'
EXIT before launching install.sh so it runs on any exit; ensure cleanup can
safely run even if partial setup didn't complete and make it tolerant (ignore
errors), and reference existing symbols like install_pid and tail_pid to
kill/cleanup background processes within that cleanup routine.
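
For the provider-enforcement finding, a minimal guard near the top of the script could look like this (a sketch; the script's actual variable handling may differ):

# Default to ollama when unset; refuse to run under any other provider.
: "${NEMOCLAW_PROVIDER:=ollama}"
if [ "$NEMOCLAW_PROVIDER" != "ollama" ]; then
  echo "ERROR: this test requires NEMOCLAW_PROVIDER=ollama (got: $NEMOCLAW_PROVIDER)" >&2
  exit 1
fi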

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 74782911-b3b5-4f1a-81cc-d3ccda8c3ee0

📥 Commits

Reviewing files that changed from the base of the PR and between 8a56688 and 6a1a55f.

📒 Files selected for processing (2)
  • .github/workflows/nightly-e2e.yaml
  • test/e2e/test-gpu-e2e.sh

Comment on lines +61 to +64
  gpu-e2e:
    if: github.repository == 'NVIDIA/NemoClaw' && vars.GPU_E2E_ENABLED == 'true'
    runs-on: self-hosted
    timeout-minutes: 60
coderabbitai (Bot) commented Mar 24, 2026

⚠️ Potential issue | 🟠 Major

Pin this job to a GPU runner label set.

runs-on: self-hosted can match any self-hosted runner. This job assumes Linux plus an NVIDIA GPU, so add explicit labels such as [self-hosted, linux, x64, gpu] or an equivalent runner group to make the scheduler enforce those capabilities instead of failing at runtime in the GPU verification step. GitHub's docs recommend routing self-hosted jobs with labels, and a label array queues the job only on runners that match all specified labels. (docs.github.com)

🎯 Suggested change
-    runs-on: self-hosted
+    runs-on: [self-hosted, linux, x64, gpu]

Also applies to: 74-83

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/nightly-e2e.yaml around lines 61 - 64, The gpu-e2e job
currently uses a generic runs-on: self-hosted which can land on any self-hosted
runner; update the gpu-e2e job (and the other GPU job block referenced at lines
74-83) to use an explicit label array such as runs-on: [self-hosted, linux, x64,
gpu] (or your organization’s equivalent labels) so the scheduler only selects
runners with Linux + NVIDIA GPU capabilities; locate the gpu-e2e job definition
and replace the scalar runs-on value with the label array to enforce the
required runner properties.

jayavenkatesh19 (Contributor, Author) replied:

We don't have the exact runner configuration yet, so I added a comment explaining what needs to change once the runners are online.

coderabbitai (Bot) replied:

@jayavenkatesh19, understood — the inline comment already documents the intent clearly:

# Runner labels: using 'self-hosted' for now. Refine to
# [self-hosted, linux, x64, gpu] once NVIDIA runner labels are confirmed.

That's a reasonable interim approach. Would you like me to open a follow-up GitHub issue to track pinning the runner labels once the self-hosted GPU runner registration is finalized, so it doesn't get lost?

coderabbitai (Bot) left a comment

♻️ Duplicate comments (1)
test/e2e/test-gpu-e2e.sh (1)

309-317: ⚠️ Potential issue | 🟡 Minor

Fallback to first model still risks using leftover models on reused runners.

While the NEMOCLAW_MODEL verification is good, the fallback to m[0] (first model from /api/tags) can still pick up leftover models from previous runs on reused runners. Consider logging a warning when falling back, or verifying the auto-selected model matches expected patterns.

💡 Suggested improvement
 if [ -z "$CONFIGURED_MODEL" ]; then
   CONFIGURED_MODEL=$(curl -sf http://localhost:11434/api/tags 2>/dev/null \
     | python3 -c "import json,sys; m=json.load(sys.stdin).get('models',[]); print(m[0]['name'] if m else '')" 2>/dev/null || echo "")
   if [ -n "$CONFIGURED_MODEL" ]; then
     info "Auto-detected Ollama model: $CONFIGURED_MODEL"
+    info "WARNING: Model auto-detection may pick up leftover models on reused runners."
+    info "Consider setting NEMOCLAW_MODEL explicitly in the workflow."
   else
     fail "No models found in Ollama"
   fi
 fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/test-gpu-e2e.sh` around lines 309 - 317, The fallback that sets
CONFIGURED_MODEL from the first model (the Python snippet that prints
m[0]['name']) can pick up leftover models on reused runners; update the block
around CONFIGURED_MODEL so that when the auto-detect path uses m[0] you emit a
clear warning via info (or warn) stating this is a fallback, and validate the
selected name against expected patterns or the NEMOCLAW_MODEL convention (e.g.,
required prefix/suffix or regex); if it doesn't match the expected pattern, fail
instead of silently accepting it (use the existing info/fail helpers and the
CONFIGURED_MODEL variable and the Python/curl auto-detect logic).
🧹 Nitpick comments (2)
test/e2e/test-gpu-e2e.sh (2)

175-175: Piping curl to shell is a known pattern but carries inherent risk.

This is the official Ollama installation method, but piping remote scripts directly to sh is inherently risky. Since this is a controlled CI environment and follows the official Ollama docs, it's acceptable, but consider adding a comment noting this is the official installation method.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/test-gpu-e2e.sh` at line 175, Add a brief inline comment above the
curl pipeline line that clarifies this is the official Ollama installation
method and its use here is intentional for the controlled CI environment;
reference the specific invocation "curl -fsSL https://ollama.com/install.sh |
sh" so reviewers understand this is an approved, documented installation step
and not an accidental unsafe pattern.
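
Applying that nitpick is a one-line comment above the invocation:

# Official Ollama installation method (https://ollama.com/install.sh);
# piping to sh is intentional in this controlled CI environment.
curl -fsSL https://ollama.com/install.sh | sh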

33-33: Consider adding set -e to fail fast on errors.

The script uses set -uo pipefail but omits -e. While explicit error handling exists at critical points, commands between checks (e.g., lines 207-210, 225-232) could silently fail. Adding -e or using explicit || exit 1 on all critical commands would be more defensive.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/e2e/test-gpu-e2e.sh` at line 33, The script currently uses the line "set
-uo pipefail" which omits the -e option; update that line to "set -euo pipefail"
so the script fails fast on any command error, or alternatively add explicit "||
exit 1" to the critical commands around the GPU sanity and deployment checks
(the commands between the existing checks referenced in the review) so failures
cannot be silently ignored; locate and modify the "set -uo pipefail" line or
append "|| exit 1" to the critical commands to implement this change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: fd613d86-bd1a-4a3e-9fe8-eada3179c3a2

📥 Commits

Reviewing files that changed from the base of the PR and between 9344e18 and f783d10.

📒 Files selected for processing (2)
  • .github/workflows/nightly-e2e.yaml
  • test/e2e/test-gpu-e2e.sh

jayavenkatesh19 added the enhancement, CI/CD, and Local Models labels on Mar 25, 2026
coderabbitai (Bot) left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/nightly-e2e.yaml:
- Around line 97-101: Remove the unused environment variable
NEMOCLAW_RECREATE_SANDBOX from the GPU E2E job definition: locate the job that
sets env variables including NEMOCLAW_NON_INTERACTIVE, NEMOCLAW_SANDBOX_NAME,
NEMOCLAW_RECREATE_SANDBOX and NEMOCLAW_PROVIDER and delete the
NEMOCLAW_RECREATE_SANDBOX line so only the actually consumed variables
(NEMOCLAW_NON_INTERACTIVE, NEMOCLAW_SANDBOX_NAME, NEMOCLAW_PROVIDER, and
NEMOCLAW_MODEL where applicable) remain.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f156d7e4-9cbe-4a19-b7ee-301ae45ab1b8

📥 Commits

Reviewing files that changed from the base of the PR and between f783d10 and 5cd438f.

📒 Files selected for processing (2)
  • .github/workflows/nightly-e2e.yaml
  • test/e2e/test-gpu-e2e.sh
🚧 Files skipped from review as they are similar to previous changes (1)
  • test/e2e/test-gpu-e2e.sh

Comment on lines +97 to +101
      env:
        NEMOCLAW_NON_INTERACTIVE: "1"
        NEMOCLAW_SANDBOX_NAME: "e2e-gpu-ollama"
        NEMOCLAW_RECREATE_SANDBOX: "1"
        NEMOCLAW_PROVIDER: "ollama"
coderabbitai (Bot) commented Mar 25, 2026

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify NEMOCLAW_RECREATE_SANDBOX is not used in test-gpu-e2e.sh

echo "=== Searching for NEMOCLAW_RECREATE_SANDBOX usage in test-gpu-e2e.sh ==="
rg -n 'NEMOCLAW_RECREATE_SANDBOX' test/e2e/test-gpu-e2e.sh || echo "No usage found"

echo ""
echo "=== All env vars actually read by the script ==="
rg -n '\$\{?NEMOCLAW_' test/e2e/test-gpu-e2e.sh

Repository: NVIDIA/NemoClaw

Length of output: 607


Remove the unused NEMOCLAW_RECREATE_SANDBOX environment variable from the GPU E2E job.

The test script only consumes NEMOCLAW_SANDBOX_NAME, NEMOCLAW_PROVIDER, NEMOCLAW_NON_INTERACTIVE, and NEMOCLAW_MODEL. Although NEMOCLAW_RECREATE_SANDBOX is documented in the script's header comments, it is never read or used in the actual code, making it dead configuration that misleads future maintainers.

Diff
     env:
       NEMOCLAW_NON_INTERACTIVE: "1"
       NEMOCLAW_SANDBOX_NAME: "e2e-gpu-ollama"
-      NEMOCLAW_RECREATE_SANDBOX: "1"
       NEMOCLAW_PROVIDER: "ollama"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/nightly-e2e.yaml around lines 97 - 101, Remove the unused
environment variable NEMOCLAW_RECREATE_SANDBOX from the GPU E2E job definition:
locate the job that sets env variables including NEMOCLAW_NON_INTERACTIVE,
NEMOCLAW_SANDBOX_NAME, NEMOCLAW_RECREATE_SANDBOX and NEMOCLAW_PROVIDER and
delete the NEMOCLAW_RECREATE_SANDBOX line so only the actually consumed
variables (NEMOCLAW_NON_INTERACTIVE, NEMOCLAW_SANDBOX_NAME, NEMOCLAW_PROVIDER,
and NEMOCLAW_MODEL where applicable) remain.

jayavenkatesh19 (Contributor, Author) replied:

It is used in install.sh --non-interactive, and is consumed directly.

coderabbitai (Bot) replied:

🧩 Analysis chain

Let me verify this by checking install.sh directly.


🏁 Script executed:

#!/bin/bash
echo "=== Searching for NEMOCLAW_RECREATE_SANDBOX in install.sh ==="
rg -n 'NEMOCLAW_RECREATE_SANDBOX' install.sh || echo "No usage found in install.sh"

echo ""
echo "=== Searching across all shell scripts ==="
rg -rn 'NEMOCLAW_RECREATE_SANDBOX' --include='*.sh' -l

Length of output: 627


🏁 Script executed:

#!/bin/bash
echo "=== All occurrences of NEMOCLAW_RECREATE_SANDBOX in install.sh with context ==="
rg -n 'NEMOCLAW_RECREATE_SANDBOX' install.sh -A2 -B2

echo ""
echo "=== Searching across all shell scripts (fd + rg) ==="
fd -e sh -x rg -l 'NEMOCLAW_RECREATE_SANDBOX' {}

Length of output: 689


🏁 Script executed:

#!/bin/bash
echo "=== install.sh - actual variable dereference (not just printf/docs) ==="
rg -n '\$\{?NEMOCLAW_RECREATE_SANDBOX' install.sh -A2 -B2

echo ""
echo "=== test-full-e2e.sh ==="
rg -n 'NEMOCLAW_RECREATE_SANDBOX' test/e2e/test-full-e2e.sh -A2 -B2

echo ""
echo "=== test-double-onboard.sh ==="
rg -n 'NEMOCLAW_RECREATE_SANDBOX' test/e2e/test-double-onboard.sh -A2 -B2

echo ""
echo "=== test-gpu-e2e.sh ==="
rg -n 'NEMOCLAW_RECREATE_SANDBOX' test/e2e/test-gpu-e2e.sh -A2 -B2

echo ""
echo "=== test-port8080-conflict.sh ==="
rg -n 'NEMOCLAW_RECREATE_SANDBOX' test/e2e/e2e-cloud-experimental/test-port8080-conflict.sh -A2 -B2

echo ""
echo "=== Full install.sh search for any dereference ==="
grep -n 'NEMOCLAW_RECREATE_SANDBOX' install.sh

Length of output: 1837


@jayavenkatesh19, you're right — I stand corrected. The search confirms that install.sh explicitly documents NEMOCLAW_RECREATE_SANDBOX=1 in its help/usage text (line 134), meaning it is a supported environment variable that install.sh passes through to the nemoclaw onboard invocation it orchestrates internally. The variable is legitimately consumed via install.sh --non-interactive → nemoclaw onboard, so setting it in the job's env: block is correct. My earlier comment was wrong to flag it as dead configuration. No change needed here.


✏️ Learnings added
Learnt from: jayavenkatesh19
Repo: NVIDIA/NemoClaw PR: 839
File: .github/workflows/nightly-e2e.yaml:97-101
Timestamp: 2026-03-25T17:41:58.209Z
Learning: In NVIDIA/NemoClaw, `NEMOCLAW_RECREATE_SANDBOX=1` is a legitimate environment variable supported by `install.sh --non-interactive`. It is documented in install.sh's help/usage printf output and is consumed indirectly: install.sh orchestrates `nemoclaw onboard --non-interactive`, which reads the variable from the environment. Setting it in a GitHub Actions `env:` block for a job that calls install.sh is correct and intentional.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

@jayavenkatesh19 jayavenkatesh19 merged commit b789cc2 into NVIDIA:main Mar 25, 2026
9 checks passed
lakamsani pushed a commit to lakamsani/NemoClaw that referenced this pull request Apr 4, 2026