@hlinsen hlinsen commented Jan 23, 2026

This PR fixes an out-of-bounds access observed on MIPLIB on an H100 during the concurrent root solve.
The number of nonzeros (nnz) in a column can be equal to m, so a histogram indexed by nnz needs m + 1 bins.

Summary by CodeRabbit

  • Bug Fixes
    • Enhanced internal safety checks to prevent potential overflow conditions in barrier algorithm computations.


@hlinsen hlinsen added this to the 26.02 milestone Jan 23, 2026
@hlinsen hlinsen requested a review from a team as a code owner January 23, 2026 23:59
@hlinsen hlinsen added the bug Something isn't working label Jan 23, 2026
@hlinsen hlinsen requested a review from akifcorduk January 23, 2026 23:59
@hlinsen hlinsen added the non-breaking Introduces a non-breaking change label Jan 23, 2026
@hlinsen hlinsen requested a review from Kh4ster January 23, 2026 23:59
copy-pr-bot bot commented Jan 23, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai bot commented Jan 24, 2026

Walkthrough

The change increases histogram array sizes by one slot (columns: m+1, rows: n+1) in the dual simplex CUDA barrier code and adds runtime assertions to prevent histogram bin index overflow. No public APIs or exported entities were changed.
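The sizing rule described above can be sketched on the host side as follows. This is a minimal illustration, not the code in `barrier.cu`: the function name `column_nnz_histogram` and the use of `std::vector` and plain `assert` (standing in for `cuopt_assert`) are assumptions made for the example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not the cuOpt implementation.
// col_nz[j] is the number of nonzeros in column j of a matrix with m rows,
// so every valid count lies in the closed range [0, m].
std::vector<int> column_nnz_histogram(const std::vector<int>& col_nz, int m)
{
  // m + 1 bins: one for each possible count 0, 1, ..., m.
  // With only m bins, a fully dense column (nnz == m) would index past the end.
  std::vector<int> histogram(m + 1, 0);
  for (int nz : col_nz) {
    assert(nz >= 0 && nz <= m);  // validate the bin index before using it
    histogram[nz]++;
  }
  return histogram;
}
```

For a matrix with m = 3 rows, a column with three nonzeros lands in bin 3, which exists only because the histogram has four bins. The row histogram follows the same pattern with n + 1 bins.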

Changes

| Cohort / File(s) | Summary |
|---|---|
| Histogram Array Sizing & Overflow Guards: `cpp/src/dual_simplex/barrier.cu` | Increased histogram array allocation by one slot for column and row histograms and added runtime assertions to guard against out-of-bounds increments when updating histogram bins. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly describes the main bug fix: preventing out-of-bounds access in the find dense columns logic when nnz equals m. |


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@cpp/src/dual_simplex/barrier.cu`:
- Around line 1147-1153: The loop that builds histogram_row accesses
histogram_row[row_nz[k]] before validating the index; move the cuopt_assert that
checks row_nz[k] <= n to immediately before the histogram_row[row_nz[k]]++ line
(in the same loop that iterates k) so the bound is validated prior to the array
access; keep updating max_row_nz and incrementing histogram_row only after the
assertion.

@chris-maes chris-maes left a comment

LGTM. Thanks for the fix, @hlinsen. Apologies for my buggy code.

hlinsen commented Jan 24, 2026

/ok to test 640a33f

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cpp/src/dual_simplex/barrier.cu (1)

1147-1164: Fix histogram_row loop bounds to avoid OOB when m > n.
histogram_row is sized n + 1, but subsequent loops use k < m. If m > n, those loops can access past the end of histogram_row.

🐛 Proposed fix
 #ifdef HISTOGRAM
-    for (i_t k = 0; k < m; k++) {
+    for (i_t k = 0; k <= n; k++) {
       if (histogram_row[k] > 0) { settings.log.printf("%6d %6d\n", k, histogram_row[k]); }
     }
 #endif

     n_dense_rows = 0;
-    for (i_t k = 0; k < m; k++) {
+    for (i_t k = 0; k <= n; k++) {
       if (histogram_row[k] > .1 * n) { n_dense_rows++; }
     }
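
One way to keep such scans from drifting out of sync with the allocation is to bound the loop by the container's own size rather than by `m` or `n`. The sketch below assumes the histogram is held in a `std::vector`; the helper name `count_bins_above` is hypothetical and does not appear in `barrier.cu`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not the cuOpt implementation.
// Counts the histogram bins whose population exceeds `threshold`.
// The loop bound is histogram.size(), so it stays correct whether the
// histogram was allocated with m + 1 or n + 1 bins.
int count_bins_above(const std::vector<int>& histogram, double threshold)
{
  assert(threshold >= 0.0);  // precondition assumed for this sketch only
  int count = 0;
  for (std::size_t k = 0; k < histogram.size(); k++) {
    if (histogram[k] > threshold) { count++; }
  }
  return count;
}
```

With a row histogram of n + 1 bins, `count_bins_above(histogram_row, 0.1 * n)` performs the same comparison as the second loop in the proposed fix while visiting exactly the allocated bins, regardless of whether m is larger than n.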

hlinsen commented Jan 24, 2026

/merge

@rapids-bot rapids-bot bot merged commit d72865d into NVIDIA:release/26.02 Jan 24, 2026
196 of 197 checks passed