Skip to content

[benchmarking] Error if uv cannot be used for dumping env#1425

Open
rlratzel wants to merge 9 commits intoNVIDIA-NeMo:mainfrom
rlratzel:26.02-benchmark_bugfixes2
Open

[benchmarking] Error if uv cannot be used for dumping env#1425
rlratzel wants to merge 9 commits intoNVIDIA-NeMo:mainfrom
rlratzel:26.02-benchmark_bugfixes2

Conversation

@rlratzel
Copy link
Copy Markdown
Contributor

@rlratzel rlratzel commented Jan 25, 2026

  • Moves code to compute session output dir to Session obj.
  • Simplifies dump_env by requiring uv and will raise if it cannot be used.

…session output dir to Session obj, adds failover to pip freeze for env capture.

Signed-off-by: rlratzel <rratzel@nvidia.com>
@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented Jan 25, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Signed-off-by: rlratzel <rratzel@nvidia.com>
…fixes2

Signed-off-by: rlratzel <rratzel@nvidia.com>
@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented Feb 12, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@rlratzel rlratzel changed the title [benchmarking] Adds Slack JSON dump on error, adds failover to pip freeze for env capture [benchmarking] Adds failover to pip freeze for env capture Feb 12, 2026
@rlratzel rlratzel marked this pull request as ready for review February 12, 2026 22:56
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Feb 26, 2026

Additional Comments (1)

benchmarking/run.py, line 309
Validation will fail because Session.assert_valid_config_dict now requires "name" in config_dict (line 89 in session.py), but "name" isn't added until line 309. Move lines 308-309 before line 301.

    session_name = args.session_name or time.strftime("benchmark-run__%Y-%m-%d_%H-%M-%S_UTC")
    config_dict["name"] = session_name
    try:
        Session.assert_valid_config_dict(config_dict)
        config_dict = remove_disabled_blocks(config_dict)
        config_dict = resolve_env_vars(config_dict)
    except ValueError as e:
        logger.error(f"Invalid configuration: {e}")
        return 1

    session = Session.from_dict(config_dict, args.entries)

Signed-off-by: rlratzel <rratzel@nvidia.com>
Copy link
Copy Markdown
Contributor

@praateekmahajan praateekmahajan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, but i can't recall why pip freeze should be run ever? recommendation for everyone is to use uv to run curator..

Signed-off-by: rlratzel <rratzel@nvidia.com>
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Feb 26, 2026

Additional Comments (1)

benchmarking/runner/env_capture.py, line 45
Missing FileNotFoundError exception handling. If uv isn't installed, subprocess.check_output() raises FileNotFoundError, which isn't caught here.

    try:
        out = subprocess.check_output(cmd.split(" "), text=True, timeout=timeout)  # noqa: S603
    except FileNotFoundError:
        logger.error(f"Command '{cmd}' not found. Ensure 'uv' is installed and in PATH.")
        raise
    except subprocess.CalledProcessError as e:
        logger.error(f"Command '{cmd}' failed with exit code {e.returncode}. Environment capture failed.")
        raise
    except subprocess.TimeoutExpired:
        logger.error(f"Command '{cmd}' timed out after {timeout} seconds. Environment capture failed.")
        raise

Signed-off-by: rlratzel <rratzel@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants