v1.0 magic#93
Open
totoCZ wants to merge 10 commits into
NUD probes (RFC 4861 §7.3.3) are sent unicast and may omit SLLAO, since the sender already knows the link-layer address. The previous code hard-returned when SLLAO was absent, so ndppd either dropped the probe entirely or responded with an unsolicited NA (SOLICITED=0) to ff02::1. An unsolicited NA does not satisfy a NUD PROBE transition (§7.3.3), causing the Juniper switch to declare the neighbor unreachable.

Fix: pass the Ethernet source MAC through ndL_handle_ns and use it as the fallback src_ll when SLLAO is absent. This allows a properly solicited NA to be sent back to the unicast probe source.

Also add NULL src_ll guards in session.c for the DAD (unspecified source) path, and fix the PID file write check, which compared the return value of write() to 0 instead of the expected byte count and therefore logged a spurious error on every successful daemonize.
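The fallback described above can be sketched as a small selection function. This is only an illustration under stated assumptions: `pick_src_ll` and its parameters are hypothetical names, not ndppd's actual code, and the DAD case is modelled simply as "no reply address".

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not ndppd's real function): choose the link-layer
 * address a solicited NA should be sent to.
 *   sllao           - address taken from the SLLAO, or NULL if absent
 *   eth_src         - Ethernet source MAC of the frame that carried the NS
 *   src_unspecified - nonzero when the IPv6 source is :: (DAD) */
static const uint8_t *pick_src_ll(const uint8_t *sllao,
                                  const uint8_t *eth_src,
                                  int src_unspecified)
{
    if (src_unspecified)
        return NULL;                 /* DAD: no unicast requester to answer */
    return sllao ? sllao : eth_src;  /* unicast probe without SLLAO: use frame source */
}
```

Callers would still need the NULL guard mentioned in the commit message, because the DAD path yields no address.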
The use-kernel config option conditionally called RTM_NEWNEIGH/RTM_DELNEIGH to maintain kernel neighbor proxy entries alongside ndppd's own proxying. It was never promoted beyond experimental and is not used in any deployed config. The underlying nd_rt_add_neigh/nd_rt_remove_neigh helpers in rt.c are retained.
iface.c:
- Reject NS with multicast target address (RFC 4861 §7.1.1 MUST)
- Discard ND packet containing any option with length zero (RFC 4861 §4.6 MUST)
- Iterate all NS options to find SLLAO instead of only inspecting the first
- BSD/BPF: fix msg pointer to use per-packet offset (was always buf base, corrupting all field reads and silently dropping all but first BPF record)
- BSD/BPF: fix plen sanity check to use bpf_hdr->bh_caplen not total read len

session.c:
- STALE state: queue subscriber and trigger fresh NUD probe rather than immediately responding with OVERRIDE NA against unconfirmed reachability (RFC 4861 §7.3.3: STALE means reachability is unknown)
- VALID state: refresh state_time on incoming NA so gratuitous NAs from the target extend session lifetime (RFC 4861 §7.2.5)
- STALE exponential backoff: guard nd_conf_retrans_limit == 0 modulo (config allows min=0, causing division-by-zero UB)
- STALE exponential backoff: cap shift at 20 to prevent signed int overflow

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
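The option-handling items for iface.c (walk every option, discard on a zero length) might look roughly like the sketch below. `find_sllao` and `SK_ND_OPT_SLLAO` are hypothetical names introduced for illustration; the code assumes Ethernet, where the SLLAO is 8 bytes (2-byte header plus 6-byte MAC), and is not taken from ndppd.

```c
#include <stddef.h>
#include <stdint.h>

#define SK_ND_OPT_SLLAO 1 /* Source Link-Layer Address option type */

/* Hypothetical helper: scan all ND options for the first SLLAO.
 * Returns 0 on success (*sllao points at the address, or is NULL if the
 * option is absent) and -1 if the packet must be discarded. */
static int find_sllao(const uint8_t *opts, size_t len, const uint8_t **sllao)
{
    size_t off = 0;

    *sllao = NULL;
    while (off < len) {
        if (len - off < 2)
            return -1;                        /* truncated option header */
        size_t optlen = (size_t)opts[off + 1] * 8; /* length is in 8-octet units */
        if (optlen == 0)
            return -1;                        /* zero length: discard the packet */
        if (optlen > len - off)
            return -1;                        /* option runs past the buffer */
        if (opts[off] == SK_ND_OPT_SLLAO && *sllao == NULL)
            *sllao = opts + off + 2;          /* address follows type and length */
        off += optlen;
    }
    return 0;
}
```

Rejecting a zero length before advancing also keeps the loop from spinning forever on a malformed packet.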
PACKET_MR_ALLMULTI only enables all-multicast reception; unicast NS frames addressed to a MAC other than the interface's own are dropped by the NIC. This silently breaks proxying whenever an external host holds a stale neighbour-cache entry with an old container veth MAC and sends its NUD probe as a unicast to that address: ndppd never sees the NS and never replies.

Running tcpdump worked around the problem because it sets PACKET_MR_PROMISC as a side-effect, making the NIC accept every frame.

Switch to PACKET_MR_PROMISC (a strict superset of ALLMULTI) so that all NS packets reach ndppd regardless of their Ethernet destination. The existing BPF filter still limits userspace delivery to ICMPv6 NS/NA only, so there is no meaningful performance impact.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
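For reference, requesting promiscuous reception on a Linux AF_PACKET socket could be sketched as follows. `enable_promisc` is a hypothetical wrapper, not ndppd's function; the `setsockopt` call and the `packet_mreq` fields are the standard Linux packet-socket interface.

```c
#include <string.h>
#include <sys/socket.h>
#include <linux/if_packet.h>

/* Hypothetical wrapper: ask the kernel to put the interface behind an
 * AF_PACKET socket into promiscuous mode for the lifetime of the membership.
 * Returns 0 on success, -1 on error (errno set by setsockopt). */
static int enable_promisc(int fd, int ifindex)
{
    struct packet_mreq mreq;

    memset(&mreq, 0, sizeof(mreq));
    mreq.mr_ifindex = ifindex;
    mreq.mr_type = PACKET_MR_PROMISC; /* previously PACKET_MR_ALLMULTI */

    return setsockopt(fd, SOL_PACKET, PACKET_ADD_MEMBERSHIP,
                      &mreq, sizeof(mreq));
}
```

The kernel reference-counts these memberships, so the mode is undone when the membership is dropped or the socket is closed.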
…ode)

In auto mode, if no route to the container exists when the first NS arrives, nd_session_create() sets state=INVALID immediately. All subsequent NS from the same requester add subscribers to this INVALID session, but nd_session_update() for INVALID only purges them after invalid_ttl; it never retries the probe. The external host gives up long before the TTL expires, producing the symptom of "no reply on new container start."

When an NS arrives for an INVALID session that has no interface (i.e. went invalid because of a missing route, not a failed probe), re-check the routing table. If the route has appeared in the meantime, open the downstream interface, transition to INCOMPLETE, and start probing so queued subscribers get an answer once the container replies with NA.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
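The revival rule can be modelled as a tiny state transition. Everything here is a hypothetical stand-in (`struct sk_session`, `sk_recheck_invalid`, the `route_exists` flag replacing a real routing-table lookup); it only illustrates the condition under which an INVALID session would move to INCOMPLETE.

```c
#include <stdbool.h>

enum sk_state { SK_INVALID, SK_INCOMPLETE, SK_VALID, SK_STALE };

/* Hypothetical, simplified session: only the fields the rule needs. */
struct sk_session {
    enum sk_state state;
    bool has_iface; /* false when the session went INVALID for lack of a route */
};

/* Called when an NS arrives for an existing session. Returns true if the
 * session was revived and probing should start. */
static bool sk_recheck_invalid(struct sk_session *s, bool route_exists)
{
    if (s->state != SK_INVALID || s->has_iface)
        return false;           /* not the "missing route" case */
    if (!route_exists)
        return false;           /* still no route: stay INVALID */
    s->has_iface = true;        /* stands in for opening the downstream interface */
    s->state = SK_INCOMPLETE;   /* probe so queued subscribers get an answer */
    return true;
}
```

A session that went INVALID after a failed probe keeps its interface and is therefore left alone until invalid_ttl expires.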
nd_iface_send_ns() was building the solicited-node Ethernet multicast destination as 33:33:tgt[12:15], i.e. taking all four last bytes of the TARGET address. For EUI-64-derived container addresses, byte 12 is always 0xfe (the lower byte of the ff:fe EUI-64 marker), so ndppd was sending to 33:33:fe:XX:XX:XX instead of the correct 33:33:ff:XX:XX:XX.

The solicited-node multicast Ethernet MAC is derived from the last four bytes of the IPv6 *destination* (ff02::1:ffXX:XXXX), not the target. That address has 0xff at byte 12, not 0xfe. The correct MAC is always 33:33:ff:tgt[13]:tgt[14]:tgt[15].

Practical impact: containers with EUI-64 addresses never received ndppd's NS probe, so they never replied with NA, and ndppd's iface-mode sessions stayed INCOMPLETE indefinitely. Running tcpdump happened to put the bridge/veth into promiscuous mode, which flooded the wrong-MAC frame to all ports and accidentally made it work.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
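The corrected derivation is short enough to show in full. `solicited_node_mac` is a hypothetical helper written for illustration, not the code in nd_iface_send_ns(); `tgt` is the 16-byte target address in network byte order.

```c
#include <stdint.h>

/* Hypothetical helper: Ethernet destination for an NS sent to the
 * solicited-node multicast address of tgt (ff02::1:ffXX:XXXX). */
static void solicited_node_mac(const uint8_t tgt[16], uint8_t mac[6])
{
    mac[0] = 0x33;
    mac[1] = 0x33;
    mac[2] = 0xff;    /* byte 12 of the multicast destination, never tgt[12] */
    mac[3] = tgt[13]; /* low 24 bits of the target address */
    mac[4] = tgt[14];
    mac[5] = tgt[15];
}
```

For an EUI-64 target such as 2001:db8::211:22ff:fe33:4455, this yields 33:33:ff:33:44:55, where the old code would have produced 33:33:fe:33:44:55.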
The -v verbosity flag was comparing against ND_LOG_ERROR (0) — always
false — so it never did anything. Fix the comparison to ND_LOG_TRACE.
Default verbosity changed from TRACE to INFO so -vvv reaches trace.
Add nd_log_debug/trace at every silent return path so packet drops are
visible in -vvv output:
- unknown ifindex, short frame, ethertype, plen mismatch in io handler
- hop-by-hop truncation, non-ICMPv6 nxt, ICMPv6 checksum mismatch,
hop-limit != 255 in ndL_handle_msg
- non-proxy iface, short NS, multicast target in ndL_handle_ns
- no rule match in nd_proxy_handle_ns
- no session match for incoming NA in ndL_handle_na
- no src_ll (DAD / no SLLAO) in nd_session_handle_ns
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
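The -v fix amounts to comparing against the highest level rather than the lowest. In the sketch below the enum and its values other than error = 0 are assumptions made for illustration (the real ND_LOG_* set may contain more levels), and `sk_bump_verbosity` is a hypothetical name.

```c
/* Hypothetical level set; only "error is 0" comes from the commit message. */
enum sk_log_level { SK_LOG_ERROR = 0, SK_LOG_INFO, SK_LOG_DEBUG, SK_LOG_TRACE };

/* Apply one -v flag: raise the level by one step, saturating at trace.
 * The buggy variant compared against SK_LOG_ERROR, so the condition was
 * never true and the level never changed. */
static int sk_bump_verbosity(int level)
{
    return level < SK_LOG_TRACE ? level + 1 : level;
}
```

Starting from the info default, repeated -v flags step through the levels and stop at trace.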
After fork(), both parent and child share the same AF_PACKET socket. The parent's atexit handler (nd_iface_cleanup → nd_iface_close) was calling PACKET_DROP_MEMBERSHIP on the shared socket before the parent exited, silently removing the PROMISC membership that the child daemon still needed. Result: the daemon started without promiscuous mode and missed multicast NS packets until tcpdump happened to re-enable promisc.

nd_iface_no_restore_flags already existed for exactly this purpose but was never set. Set it in the parent branch of ndL_daemonize() before exit(0), and guard the PACKET_DROP_MEMBERSHIP call in nd_iface_close() behind the flag so the parent's cleanup is a no-op on the socket.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
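The guard can be modelled without real sockets. In this sketch `sk_no_restore_flags` stands in for nd_iface_no_restore_flags, `sk_drop_membership` stands in for the PACKET_DROP_MEMBERSHIP setsockopt call, and `sk_iface_close` for nd_iface_close(); all three are hypothetical simplifications.

```c
#include <stdbool.h>

static bool sk_no_restore_flags; /* set in the parent just before exit(0) */
static int sk_drop_calls;        /* counts simulated membership drops */

/* Stand-in for setsockopt(..., PACKET_DROP_MEMBERSHIP, ...). */
static void sk_drop_membership(void)
{
    sk_drop_calls++;
}

/* Cleanup path shared by parent and child: only the process that still
 * owns the interface state may drop the membership. */
static void sk_iface_close(void)
{
    if (!sk_no_restore_flags)
        sk_drop_membership();
}
```

With the flag set, the parent's atexit cleanup leaves the shared socket's membership untouched, so the daemonized child keeps receiving frames.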
This makes v1 production ready.