Skip to content

fix: Set VIRTUAL_ENV and UV_PROJECT_ENVIRONMENT to venv dir#2149

Open
terrykong wants to merge 2 commits intomainfrom
guyueh/fix-uv-project-environment
Open

fix: Set VIRTUAL_ENV and UV_PROJECT_ENVIRONMENT to venv dir#2149
terrykong wants to merge 2 commits intomainfrom
guyueh/fix-uv-project-environment

Conversation

@terrykong
Copy link
Collaborator

@terrykong terrykong commented Mar 25, 2026

Summary

  • VIRTUAL_ENV and UV_PROJECT_ENVIRONMENT were incorrectly set to the python executable path (e.g. /path/to/.venv/bin/python) instead of the venv directory (e.g. /path/to/.venv)
  • Fixes this by stripping the bin/python suffix via os.path.dirname twice to get the correct venv root
  • Applied across grpo.py (async replay buffer + trajectory collector), worker_groups.py, and environments/utils.py

Test plan

  • Verify venv-based Ray workers start correctly with the fixed env vars
  • Run async GRPO training to confirm replay buffer and trajectory collector initialize properly

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Chores
    • Updated virtual environment path configuration for distributed training workers.

@terrykong terrykong requested review from a team as code owners March 25, 2026 01:05
@copy-pr-bot
Copy link

copy-pr-bot bot commented Mar 25, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@terrykong terrykong requested a review from guyueh1 March 25, 2026 01:06
@terrykong terrykong added the CI:Lfast Runs a fast test suite and re-use nightly `main` container (but sync dependencies to PRs version) label Mar 25, 2026
@terrykong terrykong changed the title fix: Set VIRTUAL_ENV and UV_PROJECT_ENVIRONMENT to venv dir, not python executable fix: Set VIRTUAL_ENV and UV_PROJECT_ENVIRONMENT to venv dir Mar 25, 2026
…on executable

VIRTUAL_ENV and UV_PROJECT_ENVIRONMENT should point to the virtual environment
directory (e.g. /path/to/.venv), not the python executable path
(e.g. /path/to/.venv/bin/python). Strip the "bin/python" suffix by applying
os.path.dirname twice to get the correct venv root.

Signed-off-by: Guyue Huang <guyueh@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Terry Kong <terryk@nvidia.com>
@terrykong terrykong force-pushed the guyueh/fix-uv-project-environment branch from 86739ff to b0c29c7 Compare March 25, 2026 01:06
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 25, 2026

📝 Walkthrough

Walkthrough

Three modules were modified to normalize environment variable configuration for Ray actors. Instead of using full Python executable paths, the code now derives and uses virtual environment root directories by computing the parent directory of the executable path.

Changes

Cohort / File(s) Summary
Virtual environment path derivation for Ray actors
nemo_rl/algorithms/grpo.py, nemo_rl/distributed/worker_groups.py, nemo_rl/environments/utils.py
Modified VIRTUAL_ENV and UV_PROJECT_ENVIRONMENT environment variable assignments in runtime configurations. Changed from using full Python executable paths to computing virtual environment root directories by removing the bin/python suffix from executable paths when configuring Ray actor runtime environments.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Test Results For Major Changes ⚠️ Warning PR description provides only test plan intentions without actual test results or evidence; implementation bugs identified in review still present in grpo.py and worker_groups.py. Fix implementation bugs by moving directory derivation logic inside conditional blocks in grpo.py and worker_groups.py; document actual test results demonstrating venv-based Ray workers start correctly.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title accurately and concisely describes the main change: fixing how VIRTUAL_ENV and UV_PROJECT_ENVIRONMENT are set to use the venv directory instead of the executable path.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch guyueh/fix-uv-project-environment

Comment @coderabbitai help to get the list of available commands and usage tips.

@terrykong
Copy link
Collaborator Author

/ok to test b0c29c7

@terrykong terrykong enabled auto-merge (squash) March 25, 2026 01:08
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@nemo_rl/algorithms/grpo.py`:
- Around line 2476-2483: _replay_py_venv and _tc_py_venv are computed
unconditionally using dirname(dirname(<exec>)), which produces wrong venv roots
when NEMO_RL_PY_EXECUTABLES_SYSTEM=1; fix by only computing the venv root after
you decide the executable is NOT a system executable—i.e., move the
dirname(dirname(_replay_py_exec)) computation inside the branch that handles
non-system executables (same for dirname(dirname(_tc_py_exec)) inside the
AsyncTrajectoryCollector branch), and when the system-executables flag is set,
set the corresponding _*_py_venv to None or leave it out of the env mapping so
the env_vars use the system paths correctly; update references in the
_replay_runtime_env and _tc_runtime_env creation to use the conditionally-set
_replay_py_venv/_tc_py_venv.

In `@nemo_rl/distributed/worker_groups.py`:
- Around line 534-536: The py_venv computation and the assignments to
runtime_env["env_vars"]["VIRTUAL_ENV"] and
runtime_env["env_vars"]["UV_PROJECT_ENVIRONMENT"] are being done unconditionally
using two dirname calls on py_executable (producing /usr for system Python);
move the py_venv derivation and those env var assignments inside the
actor_python_env.startswith("uv") conditional (the same block that handles the
"uv" virtualenv case) so the double-dirname is only applied for uv-managed
executables and not for system sys.executable paths.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2525c16e-1844-4820-a354-ddf460ce2166

📥 Commits

Reviewing files that changed from the base of the PR and between 29f5880 and b0c29c7.

📒 Files selected for processing (3)
  • nemo_rl/algorithms/grpo.py
  • nemo_rl/distributed/worker_groups.py
  • nemo_rl/environments/utils.py

Comment on lines +2476 to +2483
_replay_py_venv = os.path.dirname(os.path.dirname(_replay_py_exec)) # to remove the "bin/python" suffix

_replay_runtime_env = {
"py_executable": _replay_py_exec,
"env_vars": {
**os.environ,
"VIRTUAL_ENV": _replay_py_exec,
"UV_PROJECT_ENVIRONMENT": _replay_py_exec,
"VIRTUAL_ENV": _replay_py_venv,
"UV_PROJECT_ENVIRONMENT": _replay_py_venv,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Bug: Same unconditional double-dirname issue as worker_groups.py.

Both _replay_py_venv (line 2476) and _tc_py_venv (line 2509) are computed outside their respective if blocks. When NEMO_RL_PY_EXECUTABLES_SYSTEM=1 is set, the paths will be system Python paths, and the double-dirname will yield incorrect venv roots.

🐛 Proposed fix for ReplayBuffer section
     _replay_py_exec = get_actor_python_env(
         "nemo_rl.algorithms.async_utils.ReplayBuffer"
     )
+    _replay_py_venv = None
     if _replay_py_exec.startswith("uv"):
         # Lazily build a dedicated venv across all Ray nodes on-demand.
         _replay_py_exec = create_local_venv_on_each_node(
             _replay_py_exec,
             "nemo_rl.algorithms.async_utils.ReplayBuffer",
         )
+        _replay_py_venv = os.path.dirname(os.path.dirname(_replay_py_exec))  # to remove the "bin/python" suffix
 
-    _replay_py_venv = os.path.dirname(os.path.dirname(_replay_py_exec)) # to remove the "bin/python" suffix
-
     _replay_runtime_env = {
         "py_executable": _replay_py_exec,
         "env_vars": {
             **os.environ,
-            "VIRTUAL_ENV": _replay_py_venv,
-            "UV_PROJECT_ENVIRONMENT": _replay_py_venv,
+            **({
+                "VIRTUAL_ENV": _replay_py_venv,
+                "UV_PROJECT_ENVIRONMENT": _replay_py_venv,
+            } if _replay_py_venv else {}),
         },
     }
🐛 Proposed fix for AsyncTrajectoryCollector section

Apply the same pattern to the _tc_py_exec / _tc_py_venv section (lines 2500-2518).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nemo_rl/algorithms/grpo.py` around lines 2476 - 2483, _replay_py_venv and
_tc_py_venv are computed unconditionally using dirname(dirname(<exec>)), which
produces wrong venv roots when NEMO_RL_PY_EXECUTABLES_SYSTEM=1; fix by only
computing the venv root after you decide the executable is NOT a system
executable—i.e., move the dirname(dirname(_replay_py_exec)) computation inside
the branch that handles non-system executables (same for
dirname(dirname(_tc_py_exec)) inside the AsyncTrajectoryCollector branch), and
when the system-executables flag is set, set the corresponding _*_py_venv to
None or leave it out of the env mapping so the env_vars use the system paths
correctly; update references in the _replay_runtime_env and _tc_runtime_env
creation to use the conditionally-set _replay_py_venv/_tc_py_venv.

Comment on lines +534 to +536
py_venv = os.path.dirname(os.path.dirname(py_executable)) # to remove the "bin/python" suffix
runtime_env["env_vars"]["VIRTUAL_ENV"] = py_venv
runtime_env["env_vars"]["UV_PROJECT_ENVIRONMENT"] = py_venv
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Bug: Double-dirname applied unconditionally, breaks for system Python paths.

The py_venv derivation is outside the if actor_python_env.startswith("uv") block (lines 441-451). When NEMO_RL_PY_EXECUTABLES_SYSTEM=1 is set, py_executable becomes sys.executable (e.g., /usr/bin/python3), and the double-dirname yields /usr instead of a valid venv root.

Compare with nemo_rl/environments/utils.py where the same logic is correctly placed inside the if block.

🐛 Proposed fix: Move logic inside the if block
         if actor_python_env.startswith("uv"):
             # If the py_executable begins with uv it signals that we need to create a
             #  local venv first and then replace the py_executable with the local venv's python.
             #  The directory the venv will be created in is controlled by the env var
             #  NEMO_RL_VENV_DIR and defaults to $GIT_ROOT/venvs/.
             py_executable = create_local_venv_on_each_node(
                 py_executable=actor_python_env,
                 venv_name=remote_worker_builder.ray_actor_class_fqn,
             )
+            py_venv = os.path.dirname(os.path.dirname(py_executable))  # to remove the "bin/python" suffix
         else:
             py_executable = actor_python_env
+            py_venv = None
 
         # ... (inside the worker creation loop)
 
                 runtime_env = {
                     "env_vars": worker_env_vars,
                     "py_executable": py_executable,
                 }
-                py_venv = os.path.dirname(os.path.dirname(py_executable)) # to remove the "bin/python" suffix
-                runtime_env["env_vars"]["VIRTUAL_ENV"] = py_venv
-                runtime_env["env_vars"]["UV_PROJECT_ENVIRONMENT"] = py_venv
+                if py_venv is not None:
+                    runtime_env["env_vars"]["VIRTUAL_ENV"] = py_venv
+                    runtime_env["env_vars"]["UV_PROJECT_ENVIRONMENT"] = py_venv
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nemo_rl/distributed/worker_groups.py` around lines 534 - 536, The py_venv
computation and the assignments to runtime_env["env_vars"]["VIRTUAL_ENV"] and
runtime_env["env_vars"]["UV_PROJECT_ENVIRONMENT"] are being done unconditionally
using two dirname calls on py_executable (producing /usr for system Python);
move the py_venv derivation and those env var assignments inside the
actor_python_env.startswith("uv") conditional (the same block that handles the
"uv" virtualenv case) so the double-dirname is only applied for uv-managed
executables and not for system sys.executable paths.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CI:Lfast Runs a fast test suite and re-use nightly `main` container (but sync dependencies to PRs version)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants