fix(cluster): add Jetson Linux platform compatibility#568

Open
elezar wants to merge 1 commit into main from fix/jetson-platform-compat

@elezar elezar commented Mar 24, 2026

Summary

Add Jetson Linux platform compatibility for OpenShell clusters. Two root-cause issues affect Jetson devices: the nf_tables iptables backend lacks the nft_compat bridge needed for xt extension modules, and the br_netfilter kernel module is not loaded by default, breaking pod-to-service DNS resolution.

Inspired by community testing on Jetson AGX Orin, Orin NX, Orin Super, and Nano reported in the NVIDIA Developer Forums.

Note: This PR addresses cluster startup and networking compatibility on Jetson. GPU/accelerator passthrough support for Jetson devices is not yet addressed and will be tracked separately.

Related Issue

Closes #467
Closes #407

Changes

  • cluster-entrypoint.sh: On startup, probe whether xt extension modules are usable via the current iptables backend. If not, switch to iptables-legacy via update-alternatives. Also check for br_netfilter and emit an actionable warning if it is absent (pods cannot reach ClusterIP services / kube-dns without it). Disable the k3s network policy controller when falling back to legacy iptables, since kube-router panics without xt_comment.
  • netns.rs: Update find_iptables() to return String instead of &'static str so that it can return a dynamically constructed iptables-legacy path. Add an xt_extensions_unavailable() probe that mirrors the shell probe, so the sandbox egress-policy engine can fall back to iptables-legacy on affected kernels.
  • debug-openshell-cluster/SKILL.md: Document the br_netfilter warning and the pod-to-service connectivity failure pattern, including remediation steps.
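
For reviewers who want a concrete picture of the startup probe described above, here is a minimal sketch. It is not the code in this PR: the function names, the scratch chain name, and the `/usr/sbin/iptables-legacy` path are assumptions for illustration. The idea is to try installing a rule that uses the xt `comment` match in a throwaway chain, and to treat a failure as a signal to switch backends.

```shell
#!/bin/sh
# Illustrative sketch only: function names, the probe chain name, and the
# legacy binary path are assumptions, not the code from this PR.

# Return 0 if the given iptables binary can install a rule that uses the
# xt "comment" match, 1 otherwise. The probe runs in a scratch chain so it
# never touches live rules.
xt_extensions_usable() {
    ipt="${1:-iptables}"
    chain="XT_PROBE_SKETCH"   # hypothetical chain name
    "$ipt" -t filter -N "$chain" 2>/dev/null || return 1
    rc=0
    "$ipt" -t filter -A "$chain" -m comment --comment "probe" -j RETURN \
        2>/dev/null || rc=1
    # Clean up the scratch chain regardless of the probe result.
    "$ipt" -t filter -F "$chain" 2>/dev/null
    "$ipt" -t filter -X "$chain" 2>/dev/null
    return "$rc"
}

# Print "default" when the current backend works, "legacy" when it does not.
select_iptables_backend() {
    if xt_extensions_usable "${1:-iptables}"; then
        echo "default"
    else
        echo "legacy"
    fi
}

# Switch the system-wide alternative to iptables-legacy (Debian-style
# alternatives; the path is an assumption and the call needs root).
switch_to_legacy() {
    update-alternatives --set iptables /usr/sbin/iptables-legacy
}
```

An entrypoint could then run `[ "$(select_iptables_backend)" = "legacy" ] && switch_to_legacy` before any other netfilter work. Note that this sketch also reports "legacy" when the scratch chain cannot be created at all (for example, without sufficient privileges), which a real probe may want to distinguish.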

Testing

  • mise run pre-commit passes
  • Unit tests added/updated
  • E2E tests added/updated — requires a physical Jetson device

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)
  • Architecture docs updated (if applicable)

@elezar elezar self-assigned this Mar 24, 2026
@elezar elezar marked this pull request as ready for review March 24, 2026 14:09
@elezar elezar requested a review from a team as a code owner March 24, 2026 14:09
Three issues prevent k3s from starting on kernels where the nf_tables
xt extension bridge (nft_compat) is unavailable:

1. kube-router's network policy controller uses the xt_comment iptables
   extension and panics on startup with "Extension comment revision 0
   not supported, missing kernel module?" Pass --disable-network-policy
   to k3s so the controller never runs. The NSSH1 HMAC handshake remains
   the primary sandbox SSH isolation boundary, so this does not weaken
   the effective security posture.

2. flannel and kube-proxy also fail to insert rules via the nf_tables
   iptables backend on the same kernels. Add an xt_comment probe at
   cluster-entrypoint startup; if the probe fails, switch to
   iptables-legacy via update-alternatives before any other netfilter
   work so that flannel, kube-proxy, and the DNS proxy all use a
   consistent backend.

3. The br_netfilter kernel module must be loaded on the host for
   iptables rules to apply to pod bridge traffic. Without it, ClusterIP
   DNAT (including kube-dns at 10.43.0.10) is never applied to pod
   packets, causing silent DNS timeouts deep in the health-check loop.
   Add an early check that fails fast with an actionable error message
   if the module is not present, instructing the user to run
   `sudo modprobe br_netfilter` on the host.

Signed-off-by: Evan Lezar <elezar@nvidia.com>
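
The br_netfilter check and the network-policy decision from the commit message can be sketched along the same lines. Again, this is a hedged illustration rather than the PR's implementation: the function names are hypothetical, and detecting the module via the `bridge-nf-call-iptables` sysctl file is an assumption about one reasonable way to do it. `--disable-network-policy` is the k3s flag named in the commit message.

```shell
#!/bin/sh
# Illustrative sketch only: function names and the detection method are
# assumptions, not the code from this PR.

# The bridge-nf sysctls appear under /proc/sys/net/bridge once br_netfilter
# is loaded (or built in), so the presence of this file is used as a proxy.
br_netfilter_loaded() {
    [ -e "${1:-/proc/sys/net/bridge/bridge-nf-call-iptables}" ]
}

# Print an actionable message on stderr and return 1 when the module
# appears to be missing; return 0 otherwise.
check_br_netfilter() {
    if br_netfilter_loaded "$1"; then
        return 0
    fi
    echo "br_netfilter is not loaded: pods may be unable to reach ClusterIP services such as kube-dns." >&2
    echo "Run 'sudo modprobe br_netfilter' on the host." >&2
    return 1
}

# Print the extra k3s arguments for the chosen iptables backend. With the
# legacy backend the network policy controller is disabled.
k3s_extra_args() {
    if [ "$1" = "legacy" ]; then
        echo "--disable-network-policy"
    fi
}
```

The optional path argument exists only so the check can be exercised without a real kernel; in normal use `check_br_netfilter` would be called with no arguments, and the caller decides whether a non-zero return is a warning or a hard failure.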
@elezar elezar force-pushed the fix/jetson-platform-compat branch from f94333b to fbf22fb Compare March 24, 2026 14:31
elezar commented Mar 24, 2026

cc @johnnynunez
