Skip to content

Fix IP Allocation Bug: Reserved Range Not Detected#2657

Open
neddp wants to merge 5 commits intomainfrom
fix_ip_allocation_from_reserved_range_in_dynamic_networks
Open

Fix IP Allocation Bug: Reserved Range Not Detected#2657
neddp wants to merge 5 commits intomainfrom
fix_ip_allocation_from_reserved_range_in_dynamic_networks

Conversation

@neddp
Copy link
Member

@neddp neddp commented Jan 29, 2026

What is this change about?

This PR fixes a bug where BOSH allocates IPs from reserved CIDR ranges, causing CPI failures with "Address is in subnet's reserved address range" errors.

Root Cause: The overlap check in find_next_available_ip was unidirectional. It only checked if the candidate IP includes a blocking IP, but not if a blocking IP includes the candidate.

When allocating a /32 (single IP) and the reserved range is a /30 (4 IPs), the check candidate.include?(blocking) always returns false because a smaller block cannot include a larger one. The reserved range was invisible to the algorithm.

The Bug

# Original (buggy) - only checks one direction
if filtered_ips.any? { |ip| current_prefix.include?(ip) }

Example:

  • Candidate: 10.0.11.32/32
  • Reserved: 10.0.11.32/30
  • Check: 10.0.11.32/32.include?(10.0.11.32/30)false
  • Result: Algorithm thinks IP is available → CPI rejects it

The Fix

# Fixed - checks both directions
blocking_ip = filtered_ips.find do |ip|
  current_prefix.include?(ip) || ip.include?(current_prefix)
end

Now 10.0.11.32/30.include?(10.0.11.32/32)true → correctly blocked.

Additional Improvement: Deterministic Deduplication

The deduplication logic that removes redundant IPs (e.g., /32 entries already covered by a /30 block) has been refactored to sort IPs before processing. This ensures deterministic behavior regardless of Set iteration order, making the code easier to test and debug.

What tests have you run?

  • All existing unit tests pass
  • Added edge case tests for overlapping CIDR blocks with different prefix sizes

Release notes

Fixed: IP allocation now correctly detects reserved CIDR ranges. Previously, when allocating single IPs (/32) from a subnet with reserved ranges specified as larger blocks (e.g., /30), the algorithm failed to detect the overlap and attempted to allocate reserved IPs, causing CPI errors.

Breaking change?

No. This is a bug fix.

@neddp neddp marked this pull request as ready for review January 29, 2026 11:34
@neddp neddp requested a review from fmoehler January 29, 2026 11:49
@aramprice aramprice requested review from a team and s4heid and removed request for a team January 29, 2026 16:03
@aramprice aramprice moved this from Inbox to Pending Review | Discussion in Foundational Infrastructure Working Group Jan 29, 2026
@neddp neddp changed the title Fix IP Allocation Bug: Non-Deterministic CIDR Deduplication Fix IP Allocation Bug: Reserved Range Not Detected Feb 6, 2026
@fmoehler
Copy link
Contributor

fmoehler commented Feb 11, 2026

Thanks for looking into this. Are you sure this is a real issue? I understand the issue from the description, but there were tests that already did cover the scenario in question e.g. in ip_repo_spec.rb:299 and they worked. I also checked out this branch, but used the logic from the current "main" branch in ip_repo.rb and all new tests were still green:

image

The scenario seems to be so basic that I think we should have stumbled across this before.

Comment on lines +99 to +107
# Sort by address first, then by prefix (smaller prefix = larger block = earlier)
sorted_ips = addresses_we_cant_allocate.sort_by { |ip| [ip.to_i, ip.prefix] }

# Remove IPs contained within larger CIDR blocks
sorted_ips = sorted_ips.reject.with_index do |ip, index|
sorted_ips[0...index].any? do |other_ip|
other_ip.prefix < ip.prefix && other_ip.include?(ip)
rescue StandardError
false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this whole change needed? For performance reasons? Logic wise it seems to get to the same result as the previous code or am I wrong?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see comment #2657 (comment)

Comment on lines +72 to +78
sorted_restricted_ips = restricted_ips.to_a.sort_by { |ip| [ip.to_i, ip.prefix] }

deduplicated_ips = sorted_restricted_ips.reject.with_index do |ip, index|
sorted_restricted_ips[0...index].any? do |other_ip|
other_ip.prefix < ip.prefix && other_ip.include?(ip)
rescue StandardError
false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above. Why is this needed?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Pending Review | Discussion

Development

Successfully merging this pull request may close these issues.

3 participants