Conversation
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
|
Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually. Contributors can view more details about this message here. |
|
/ok to test |
📝 WalkthroughWalkthroughUpdated GitHub Actions workflow inputs in the CI/CD pipeline to use non-NVIDIA variable values for runner prefix, test data path, and registry. Changed the build-container workflow's build reference from a pre-flight output to a literal Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 3 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.github/workflows/cicd-main.yml (1)
289-302:⚠️ Potential issue | 🔴 CriticalHardcoded
build-ref: mainbreaks PR container builds.Changing
build-reffrom${{ needs.pre-flight.outputs.test_sha }}to the literalmainmeans the container will always be built from themainbranch, regardless of which PR triggered the workflow.This creates two problems:
- PR changes won't be tested: The container won't include code from the PR branch
- Inconsistency with build-arg: Line 302 still uses
NEMO_RL_COMMIT=${{ needs.pre-flight.outputs.test_sha }}, so the recorded commit won't match the actual code in the containerSuggested fix
- build-ref: main + build-ref: ${{ needs.pre-flight.outputs.test_sha }}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/cicd-main.yml around lines 289 - 302, The workflow hardcodes build-ref: main which prevents PR containers from being built from the PR commit; update the uses block so build-ref uses the pre-flight output (replace the literal "main" with ${{ needs.pre-flight.outputs.test_sha }}) to match the NEMO_RL_COMMIT build-arg and ensure the container is built from the same commit as the PR; verify the symbols affected are build-ref and NEMO_RL_COMMIT in the uses: NVIDIA-NeMo/FW-CI-templates invocation so both refer to ${{ needs.pre-flight.outputs.test_sha }} for consistency.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/cicd-main.yml:
- Around line 176-185: The workflow job org-member-pre-flight mistakenly sets
default_runner_prefix, default_test_data_path, and default_registry to the
NON_NVIDIA_* variables, which overrides the default (NVIDIA) infrastructure for
all CI runs; revert those three inputs to their intended default values (remove
assignment to NON_NVIDIA_*), or implement a safe toggle (workflow input or
separate debug workflow) so that only the non_nvidia_* inputs use the
NON_NVIDIA_* vars; ensure org-member-pre-flight continues to pass
non_nvidia_runner_prefix/non_nvidia_test_data_path/non_nvidia_registry as the
non-NVIDIA override while keeping default_* pointed to the original default
values.
---
Outside diff comments:
In @.github/workflows/cicd-main.yml:
- Around line 289-302: The workflow hardcodes build-ref: main which prevents PR
containers from being built from the PR commit; update the uses block so
build-ref uses the pre-flight output (replace the literal "main" with ${{
needs.pre-flight.outputs.test_sha }}) to match the NEMO_RL_COMMIT build-arg and
ensure the container is built from the same commit as the PR; verify the symbols
affected are build-ref and NEMO_RL_COMMIT in the uses:
NVIDIA-NeMo/FW-CI-templates invocation so both refer to ${{
needs.pre-flight.outputs.test_sha }} for consistency.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 8d5c072f-e2e2-4d6e-8684-3ddfda32532a
📒 Files selected for processing (1)
.github/workflows/cicd-main.yml
| org-member-pre-flight: | ||
| uses: NVIDIA-NeMo/FW-CI-templates/.github/workflows/_cicd_preflight.yml@v0.80.1 | ||
| with: | ||
| default_runner_prefix: ${{ vars.DEFAULT_RUNNER_PREFIX }} | ||
| default_runner_prefix: ${{ vars.NON_NVIDIA_RUNNER_PREFIX }} | ||
| non_nvidia_runner_prefix: ${{ vars.NON_NVIDIA_RUNNER_PREFIX }} | ||
| default_test_data_path: ${{ vars.DEFAULT_TEST_DATA_PATH }} | ||
| default_test_data_path: ${{ vars.NON_NVIDIA_TEST_DATA_PATH }} | ||
| non_nvidia_test_data_path: ${{ vars.NON_NVIDIA_TEST_DATA_PATH }} | ||
| default_registry: ${{ vars.DEFAULT_CONTAINER_REGISTRY }} | ||
| default_registry: ${{ vars.NON_NVIDIA_CONTAINER_REGISTRY }} | ||
| non_nvidia_registry: ${{ vars.NON_NVIDIA_CONTAINER_REGISTRY }} | ||
| sso_users_filename: ${{ vars.SSO_USERS_FILENAME }} |
There was a problem hiding this comment.
Debug configuration should not be merged to main.
Setting default_runner_prefix, default_test_data_path, and default_registry to NON_NVIDIA_* values makes them identical to the non-NVIDIA parameters. This forces all CI runs (including NVIDIA org members) to use Azure infrastructure, eliminating the distinction between default and non-NVIDIA paths.
Given the PR title includes "debug:", this appears to be a temporary testing configuration. If merged, this will permanently route all CI to non-NVIDIA infrastructure, which is likely unintended for production.
If this is intentional for debugging, consider:
- Using a separate workflow file or branch for Azure testing
- Adding a workflow input to toggle infrastructure selection
- Reverting these changes before merging
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/cicd-main.yml around lines 176 - 185, The workflow job
org-member-pre-flight mistakenly sets default_runner_prefix,
default_test_data_path, and default_registry to the NON_NVIDIA_* variables,
which overrides the default (NVIDIA) infrastructure for all CI runs; revert
those three inputs to their intended default values (remove assignment to
NON_NVIDIA_*), or implement a safe toggle (workflow input or separate debug
workflow) so that only the non_nvidia_* inputs use the NON_NVIDIA_* vars; ensure
org-member-pre-flight continues to pass
non_nvidia_runner_prefix/non_nvidia_test_data_path/non_nvidia_registry as the
non-NVIDIA override while keeping default_* pointed to the original default
values.
Signed-off-by: Charlie Truong <chtruong@nvidia.com>
|
/ok to test |
What does this PR do ?
Add a one line overview of what this PR aims to accomplish.
Issues
List issues that this PR closes (syntax):
Usage
# Add a code snippet demonstrating how to use thisBefore your PR is "Ready for review"
Pre checks:
Additional Information
Summary by CodeRabbit