Replace actor eth0 move with veth networking #110
Eitan Yarmush (EItanya) wants to merge 5 commits into
b66b5d1 to 2e8c8fd
        Policy: &acceptPolicy,
    })

    c.AddRule(&nftables.Rule{
Nit: can you put the programs for each chain right below the corresponding chain definition?
    Type:     nftables.ChainTypeFilter,
    Hooknum:  nftables.ChainHookForward,
    Priority: nftables.ChainPriorityFilter,
    Policy:   &acceptPolicy,
Nit: You can use ptr.To: https://pkg.go.dev/k8s.io/utils/ptr#To
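To illustrate the suggestion, here is a minimal sketch. It defines a local generic helper, `to`, that mirrors what `ptr.To` from k8s.io/utils/ptr does (return a pointer to a copy of its argument), so the snippet runs with only the standard library. `chainPolicy` and `policyAccept` are hypothetical stand-ins for the nftables types used in the PR, not the real ones.

```go
package main

import "fmt"

// chainPolicy is a hypothetical stand-in for the nftables chain policy type.
type chainPolicy uint32

// policyAccept is a hypothetical stand-in for the accept policy constant.
const policyAccept chainPolicy = 1

// to mirrors k8s.io/utils/ptr.To: it returns a pointer to a copy of v.
func to[T any](v T) *T {
	return &v
}

func main() {
	// Without the helper, a named variable is needed just to take its address.
	acceptPolicy := policyAccept
	a := &acceptPolicy

	// With the helper, the pointer can be built inline, e.g. in a struct literal.
	b := to(policyAccept)

	fmt.Println(*a == *b)
}
```

In the real code the call would presumably be `ptr.To(nftables.ChainPolicyAccept)`, which removes the need for the `acceptPolicy` variable.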
        Chain: postrouting,
        Exprs: append(ipSourceEqual(actorVethIP), &expr.Masq{}),
    })
    preroutingExprs := append(ipDestinationEqual(podIP.String()), tcpDestinationPortEqual(80)...)
Can you leave a few TODOs here:
- We need to handle inbound UDP as well (actors may run QUIC servers).
- We need to handle multiple, configurable inbound ports. (I think maybe that will require multiple rules on the prerouting chain? Or maybe a more complicated rule that tests against multiple ports).
We could instead DNAT the IP without filtering on ports and avoid the port config, since we don't need network traffic except for the actor currently.
EDIT: Spoke with Taahir, let's not :-)
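As a rough sketch of the "multiple rules on the prerouting chain" option from the TODOs, the snippet below expands a configured set of protocols and ports into one match description per (protocol, port) pair. Everything here is an assumption for illustration: `inboundMatch`, `portBytes`, and `inboundMatches` are hypothetical names, not part of the PR or of the nftables library. The only facts relied on are the IP protocol numbers (6 for TCP, 17 for UDP) and that ports appear in the transport header in network byte order.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// inboundMatch is a hypothetical description of one prerouting match.
type inboundMatch struct {
	L4Proto  uint8  // IP protocol number: 6 = TCP, 17 = UDP
	PortData []byte // destination port in network byte order
}

// portBytes encodes a port as two big-endian bytes, the form in which it
// appears in the TCP/UDP header.
func portBytes(port uint16) []byte {
	return binary.BigEndian.AppendUint16(nil, port)
}

// inboundMatches expands the configuration into one match per
// (protocol, port) pair; each would become its own DNAT rule.
func inboundMatches(protos []uint8, ports []uint16) []inboundMatch {
	var out []inboundMatch
	for _, proto := range protos {
		for _, port := range ports {
			out = append(out, inboundMatch{L4Proto: proto, PortData: portBytes(port)})
		}
	}
	return out
}

func main() {
	const tcp, udp = 6, 17
	for _, m := range inboundMatches([]uint8{tcp, udp}, []uint16{80, 443}) {
		fmt.Printf("proto=%d port=%v\n", m.L4Proto, m.PortData)
	}
}
```

One rule per pair keeps each rule trivial at the cost of a longer chain; a single rule matching against a set of ports would be the alternative mentioned above.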
        return fmt.Errorf("while moving actor veth peer into interior netns: %w", err)
    }

    if err := netNSDo(ctx, s.interiorNetNS, configureActorVeth); err != nil {
This is really the only part that needs to be done on run / restore, I think, since gVisor wipes the routes from the links in the interior namespace. Everything else could be done on startup, at the same time we create the namespace.
(Except, eventually, I guess we will have per-actor egress rules we also program into nftables, so maybe wiping everything on each run makes more sense)
Agreed, but this also depends on the outcome of #123 and how the different ateom-* conversation plays out. Either way I think it's ok for now
    actorVethCIDR    = "10.200.0.2/30"
    actorVethGateway = "10.200.0.1"
    actorVethIP      = "10.200.0.2"
    defaultActorPort = "80"
This is unused.
    // current router assumptions: actor egress is masqueraded behind the worker
    // pod IP, and inbound traffic to the worker pod's HTTP port is DNAT'd to the
    // actor veth IP. Later transparent egress capture will replace the broad
    // egress NAT with AgentGateway-bound capture rules.
... Have we decided to use AgentGateway? cc Bowei Du (@bowei)
AFAICT this is the first reference in this project.
This was a mistaken comment, I will fix. My agent decided to hallucinate a bit based on previous stuff I was doing.
Benjamin Elder (BenTheElder) left a comment
needs a make update or the LICENSES script
thanks for working on this :-)
Fixes #122
Summary
This replaces the `ateom-gvisor` networking path that moved the worker pod's Kubernetes-provided `eth0` into the actor/gVisor network namespace.

Instead, the worker pod keeps its real `eth0`, and `ateom-gvisor` creates a point-to-point veth pair between the worker pod namespace and the actor namespace. The actor-side peer is renamed to `eth0`, receives the actor-side address, and uses the worker-side veth as its default gateway.

The PR also adds temporary nftables compatibility rules so existing inbound and outbound behavior continues to work while preserving the worker pod's own network connectivity.
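The address plan behind the veth pair can be sanity-checked with a small standard-library sketch. The three values are copied from the constants shown in the diff; `checkAddressPlan` is a hypothetical helper written for this illustration and is not part of the PR. It checks that the actor address matches the CIDR and that the gateway sits in the same /30, so the actor can reach it on-link.

```go
package main

import (
	"fmt"
	"net/netip"
)

// Values copied from the constants in the diff.
const (
	actorVethCIDR    = "10.200.0.2/30"
	actorVethGateway = "10.200.0.1"
	actorVethIP      = "10.200.0.2"
)

// checkAddressPlan (hypothetical) verifies that ip is the address carried by
// cidr and that gateway is a different address inside the same subnet.
func checkAddressPlan(cidr, gateway, ip string) error {
	prefix, err := netip.ParsePrefix(cidr)
	if err != nil {
		return fmt.Errorf("parsing CIDR: %w", err)
	}
	gw, err := netip.ParseAddr(gateway)
	if err != nil {
		return fmt.Errorf("parsing gateway: %w", err)
	}
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return fmt.Errorf("parsing actor IP: %w", err)
	}
	if prefix.Addr() != addr {
		return fmt.Errorf("actor IP %s does not match CIDR address %s", addr, prefix.Addr())
	}
	if !prefix.Masked().Contains(gw) {
		return fmt.Errorf("gateway %s is outside %s", gw, prefix.Masked())
	}
	if gw == addr {
		return fmt.Errorf("gateway and actor IP are both %s", addr)
	}
	return nil
}

func main() {
	if err := checkAddressPlan(actorVethCIDR, actorVethGateway, actorVethIP); err != nil {
		fmt.Println("address plan invalid:", err)
		return
	}
	fmt.Println("address plan is consistent")
}
```

A /30 leaves exactly two usable host addresses, which fits a point-to-point link with one worker-side and one actor-side endpoint.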
Why
Moving the pod's real `eth0` makes the worker pod lose normal Kubernetes network connectivity while an actor is active. That blocks pod-local networking components, including the planned transparent egress capture and AgentGateway integration, because those components remain in the worker pod namespace while actor traffic leaves through an interface that was moved elsewhere.

Keeping `eth0` in the worker pod namespace gives Substrate a stable worker-owned networking boundary for future transparent egress policy enforcement.

Validation
- `go test ./cmd/ateom-gvisor ./cmd/atelet ./internal/ateompath ./internal/controllers`
- `go test ./cmd/ateom-gvisor ./internal/serverboot`
- `NO_DEV_ENV=true BUCKET_NAME=ate-snapshots KO_DOCKER_REPO=localhost:5001 KUBECTL_CONTEXT=kind-kind ./hack/run-e2e.sh ./internal/e2e/suites/demo -run TestDemo3 -count=1`

Checklist