fix: allow wandb config value changes on resume #2137

Open
gkaplun-nvidia wants to merge 1 commit into NVIDIA-NeMo:main from gkaplun-nvidia:fix/wandb-allow-val-change

@gkaplun-nvidia gkaplun-nvidia commented Mar 22, 2026

Summary

Add allow_val_change=True to the config.update() calls in both WandbLogger and SwanlabLogger.

Problem

When resuming a wandb run, config.update() raises an error if a hyperparameter key already exists with a different value. This is common when hyperparameters are re-logged on resume (e.g., if the config is passed again during setup()).

Fix

Pass allow_val_change=True so that config values can be overwritten without raising. This matches the expected behavior for resumed runs where the same config is re-applied.
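
The behavior described above can be sketched without a live wandb session. In the snippet below, `StubConfig` is a hypothetical stand-in that only imitates the failure reported in this PR (an existing key receiving a different value raises unless `allow_val_change=True` is passed). It is not wandb's implementation, and `log_hyperparams` here is an assumed shape for the logger method rather than the code in `nemo_rl/utils/logger.py`.

```python
from typing import Any, Dict, Mapping


class StubConfig:
    """Hypothetical stand-in for a run's config object (not wandb's code)."""

    def __init__(self) -> None:
        self._items: Dict[str, Any] = {}

    def update(self, params: Mapping[str, Any], allow_val_change: bool = False) -> None:
        # Reject any change to an existing key unless the caller opts in.
        for key, value in params.items():
            if not allow_val_change and key in self._items and self._items[key] != value:
                raise ValueError(f"attempted to change value of key {key!r}")
        self._items.update(params)

    def as_dict(self) -> Dict[str, Any]:
        return dict(self._items)


def log_hyperparams(config: StubConfig, params: Mapping[str, Any]) -> None:
    # Assumed shape of the fix: always permit overwriting existing values.
    config.update(params, allow_val_change=True)


config = StubConfig()
config.update({"lr": 1e-4})  # values logged by the original run

# Re-logging a changed value on resume fails without the flag.
try:
    config.update({"lr": 3e-4})
except ValueError as err:
    resume_error = str(err)

# With the flag, the same update succeeds and overwrites the old value.
log_hyperparams(config, {"lr": 3e-4})
```

With a real run, the equivalent call would be `run.config.update(params, allow_val_change=True)`. The trade-off is that a genuinely unintended config change on resume is no longer flagged.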

Test plan

  • Resume a wandb run and verify config updates don't crash
  • Verify new config values are reflected in the wandb dashboard
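
The first item of the plan could also be approximated offline with a mock. The sketch below assumes a logger that holds the run as `self.run`; `SketchLogger` and `check_resume_allows_value_change` are hypothetical names used only for illustration. It checks that the flag is forwarded to `config.update()`, not how wandb itself behaves, so the dashboard check remains a manual step.

```python
from unittest.mock import MagicMock


class SketchLogger:
    """Hypothetical logger, reduced to the one call this PR changes."""

    def __init__(self, run):
        self.run = run

    def log_hyperparams(self, params):
        self.run.config.update(params, allow_val_change=True)


def check_resume_allows_value_change():
    run = MagicMock()  # stands in for a resumed run object
    logger = SketchLogger(run)

    logger.log_hyperparams({"lr": 1e-4})  # original run
    logger.log_hyperparams({"lr": 3e-4})  # re-logged on resume with a new value

    # The most recent call must carry the flag alongside the new params.
    run.config.update.assert_called_with({"lr": 3e-4}, allow_val_change=True)
    return run.config.update.call_count
```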

Summary by CodeRabbit

  • Bug Fixes
    • Improved hyperparameter logging to correctly handle configuration updates when values change across experiment runs.

When resuming a wandb run, `config.update()` raises an error if a key
already exists with a different value. This is common when
hyperparameters are re-logged on resume. Adding `allow_val_change=True`
prevents the crash while still updating the config.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@gkaplun-nvidia gkaplun-nvidia requested a review from a team as a code owner March 22, 2026 09:41

copy-pr-bot bot commented Mar 22, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.


coderabbitai bot commented Mar 22, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 78131fa3-6f77-4557-8da4-cae39c6dd7f7

📥 Commits

Reviewing files that changed from the base of the PR and between 9feb4b0 and 57b9da1.

📒 Files selected for processing (1)
  • nemo_rl/utils/logger.py

📝 Walkthrough

Updated hyperparameter logging in both the WandbLogger and SwanlabLogger classes to pass allow_val_change=True to self.run.config.update(), so that hyperparameter values can change when the run configuration is updated repeatedly.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Hyperparameter Logging Configuration: nemo_rl/utils/logger.py | Updated log_hyperparams methods in WandbLogger and SwanlabLogger to pass allow_val_change=True to self.run.config.update() calls, allowing hyperparameter values to be updated when they differ on repeated configurations. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly addresses the main change: enabling wandb config value changes during run resume by passing allow_val_change=True to config.update() calls. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |
| Test Results For Major Changes | ✅ Passed | PR contains only minor changes (2-line bug fix) adding the allow_val_change=True parameter to config.update() calls without introducing features, breaking changes, or significant refactoring. |

@chtruong814 chtruong814 added the needs-follow-up label Mar 24, 2026