
ci: pre-create HNS NAT network before CRI integration tests#2726

Closed
Copilot wants to merge 1 commit into main from copilot/investigate-hcsshim-failure

Conversation

Contributor

Copilot AI commented May 7, 2026

Summary

The Run containerd CRI integration tests step in the integration-tests job has been failing intermittently because no HNS nat network exists on the runner when CRI tests start, so hcnCreateNetwork fails downstream.

This adds a small Pre-create HNS NAT network step that runs immediately before the CRI integration tests. It:

  1. Restarts the hns service so HNS re-enumerates the host vNIC.
  2. Downloads HNS.psm1 from microsoft/SDN (the same module kube-proxy and containerd CI use). New-HnsNetwork and Get-HnsNetwork are not built-in cmdlets; they live in this module. The URL is pinned to commit 710cad6fc9025c86e04ef5daa6eb53f577802448, the latest commit touching this file (no changes since 2018), so the fetch is reproducible.
  3. Creates a nat HNS network with subnet 172.19.208.0/20 and gateway 172.19.208.1, but only if one doesn't already exist. This subnet matches the default that Windows containers / HNS pick for the built-in nat network on Server 2022, so it won't clash with anything the CNI plugin or containerd creates later.
  4. Prints Get-HnsNetwork for diagnostics.
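
For reviewers, the four steps above could look roughly like the following workflow step. This is a minimal sketch, not the exact diff: the step name, commit hash, subnet, and gateway come from this description, while the `shell` choice, the temporary file location, and the file path inside microsoft/SDN (`Kubernetes/windows/hns.psm1`) are assumptions that should be checked against the repository.

```yaml
# Sketch only. The raw URL path within microsoft/SDN is an assumption.
- name: Pre-create HNS NAT network
  shell: powershell
  run: |
    # 1. Restart HNS so it re-enumerates the host vNIC.
    Restart-Service -Name hns

    # 2. Fetch the pinned HNS.psm1 and import it (path in the repo is assumed).
    $commit = '710cad6fc9025c86e04ef5daa6eb53f577802448'
    $url = "https://raw.githubusercontent.com/microsoft/SDN/$commit/Kubernetes/windows/hns.psm1"
    $module = Join-Path $env:TEMP 'hns.psm1'
    Invoke-WebRequest -UseBasicParsing -Uri $url -OutFile $module
    Import-Module $module -Force

    # 3. Create the nat network only if one does not already exist.
    if (-not (Get-HnsNetwork | Where-Object { $_.Name -eq 'nat' })) {
      New-HnsNetwork -Type NAT -Name nat -AddressPrefix '172.19.208.0/20' -Gateway '172.19.208.1'
    }

    # 4. Print the resulting networks for diagnostics.
    Get-HnsNetwork | Format-List
```

Guarding the create call with the existence check keeps the step idempotent, so a runner image that already ships a nat network is left untouched.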

Why option A (vendor/fetch HNS.psm1)

New-HnsNetwork is not a built-in Windows cmdlet, which is why the manual reproduction in the issue produced The term 'New-HnsNetwork' is not recognized…. The other approaches considered (calling Invoke-HnsRequest directly, or relying on the in-box HostNetworkingService module) either don't exist out-of-box or don't expose a one-shot NAT-create API. Pulling HNS.psm1 from microsoft/SDN is the same pattern used by kube-proxy, calico, and the containerd CI itself, so it's the most reliable fix.

Scope

  • Single workflow file change: .github/workflows/ci.yml
  • The enclosing integration-tests job is windows-only (matrix.os: [windows-2022]), so no if: runner.os == 'Windows' guard is needed.
  • No production code, no test code, no dependencies changed.

Copilot AI requested a review from rawahars May 7, 2026 07:40
@rawahars rawahars marked this pull request as ready for review May 7, 2026 07:41
@rawahars rawahars requested a review from a team as a code owner May 7, 2026 07:41
Copilot AI requested a review from rawahars May 7, 2026 10:13
@rawahars rawahars closed this May 7, 2026
@rawahars rawahars deleted the copilot/investigate-hcsshim-failure branch May 7, 2026 11:09

2 participants