web: configurable IP TTL / hop-limit / DSCP on listening socket #396

Open
randomizedcoder wants to merge 7 commits into prometheus:master from randomizedcoder:ttl

@randomizedcoder

Summary

Add three configurable IP-layer header fields on the exporter's listening socket: IPv4 TTL, IPv6 Hop Limit, and DSCP. Each can be set via CLI flag, environment variable, or the YAML web configuration file. Two motivations:

  1. Defense-in-depth via low TTL. With TTL=2, the first router forwards a scrape response with TTL 1 and the second router discards it, so responses cannot travel past the second router hop even if firewalls or ACLs ever fail open. The exporter becomes incapable of leaking metric data beyond the immediate L3 neighborhood.
  2. QoS via DSCP marking. Operators on networks with traffic shaping / differentiated services can mark scrape traffic with a specific codepoint (CS2, AF11, EF, …) so intermediate equipment classifies it correctly.

All three knobs touch the IP header on the same socket via similar mechanisms, so they ship together. Every toolkit-using exporter (node_exporter, blackbox_exporter, snmp_exporter, …) gains the capability with a single change here.

Configuration surface

| Knob | Flag | Env var | YAML field | Range |
| --- | --- | --- | --- | --- |
| IPv4 TTL | --web.ipv4-ttl | WEB_IPV4_TTL | ip_socket_config.ipv4_ttl | 1–255 |
| IPv6 Hop Limit | --web.ipv6-hop-limit | WEB_IPV6_HOP_LIMIT | ip_socket_config.ipv6_hop_limit | 1–255 |
| DSCP | --web.dscp | WEB_DSCP | ip_socket_config.dscp | 0–63 |

Precedence: flag > env > YAML > kernel default. Omitting all sources leaves kernel defaults — full backwards compatibility.

How it works

  • TTL / Hop Limit are set on the listening socket via net.ListenConfig.Control. On Linux these options are inherited by accepted connections including the SYN-ACK packet (accept(2), ip(7), ipv6(7)).
  • DSCP is NOT inherited (IP_TOS / IPV6_TCLASS reset on accept), so a thin ipSocketListener wraps the underlying net.Listener and applies the option to each accepted connection's FD via (*net.TCPConn).SyscallConn(). ECN bits in the ToS byte are deliberately left for the kernel to manage per-packet (RFC 3168).
  • Three listener flavors handled:
    • Regular TCP: net.ListenConfig.Control + ipSocketListener wrap.
    • VSOCK: logs "Ignoring IP socket options on VSOCK listener" and proceeds unchanged (no IP layer).
    • Systemd socket activation: options applied post-bind via (*net.TCPListener).SyscallConn(); same wrapper added if DSCP is configured.
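The listener-side half of the mechanism above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `listenWithTTL` and `listenerTTL` are hypothetical names, the sketch handles IPv4 only, and it uses the standard-library `syscall` package (rather than golang.org/x/sys/unix) so it stays dependency-free. It assumes a Unix-like platform.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"syscall"
)

// listenWithTTL (hypothetical helper) opens an IPv4 TCP listener and sets
// IP_TTL on the socket through net.ListenConfig.Control, which runs after
// the socket is created and before it is bound.
func listenWithTTL(addr string, ttl int) (net.Listener, error) {
	lc := net.ListenConfig{
		Control: func(network, address string, c syscall.RawConn) error {
			var sockErr error
			if err := c.Control(func(fd uintptr) {
				sockErr = syscall.SetsockoptInt(int(fd), syscall.IPPROTO_IP, syscall.IP_TTL, ttl)
			}); err != nil {
				return err
			}
			return sockErr
		},
	}
	return lc.Listen(context.Background(), "tcp4", addr)
}

// listenerTTL (hypothetical helper) reads IP_TTL back from the listener's
// file descriptor so the configured value can be verified.
func listenerTTL(ln net.Listener) (int, error) {
	sc, ok := ln.(syscall.Conn)
	if !ok {
		return 0, errors.New("listener does not expose a raw socket")
	}
	raw, err := sc.SyscallConn()
	if err != nil {
		return 0, err
	}
	var ttl int
	var sockErr error
	if err := raw.Control(func(fd uintptr) {
		ttl, sockErr = syscall.GetsockoptInt(int(fd), syscall.IPPROTO_IP, syscall.IP_TTL)
	}); err != nil {
		return 0, err
	}
	return ttl, sockErr
}

func main() {
	ln, err := listenWithTTL("127.0.0.1:0", 2)
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer ln.Close()
	ttl, err := listenerTTL(ln)
	fmt.Println("IP_TTL on listener:", ttl, "err:", err)
}
```

An IPv6 variant would presumably use IPPROTO_IPV6 with IPV6_UNICAST_HOPS in the same callback.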

Platform support

| Platform | Status |
| --- | --- |
| Linux | Fully supported; CI-tested. |
| FreeBSD / DragonFly / NetBSD / OpenBSD / Darwin | Compile-supported via golang.org/x/sys/unix; no CI run yet. |
| Windows / Plan 9 / JS+Wasm / others | No-op. First time any option is configured, a single warn-level log line is emitted and the values are ignored. |

Commit organization

Each commit builds cleanly on its own (go build, go vet, go test ./web/...) and is independently revertable.

| # | Commit | Notes |
| --- | --- | --- |
| 1 | docs: add design brief | Pre-impl design (docs/ip-socket-config-design.md). Drop if you don't want design docs in docs/. |
| 2 | web: add IP socket-options config plumbing | FlagConfig fields, IPSocketConfig YAML struct, kingpinflag registrations, validator. Pointer types so YAML-absent ≠ explicit-zero (load-bearing for DSCP=0, which is a valid configured value). |
| 3 | web: apply IP socket options on the listening socket and accepted conns | applyListenerOptions (TTL/Hop) + applyConnOptions (DSCP), split because Linux inherits the former but not the latter, plus the ipSocketListener wrapper and wiring in ListenAndServe. |
| 4 | web: tests for IP socket options | 5 new testdata YAMLs into existing TestYAMLFiles; Linux-gated TestApplySocketOptions_Inheritance with 9 subtests covering positive / boundary / corner cases; TestEffective for the precedence helper. |
| 5 | web: docs and example for ip_socket_config | docs/web-configuration.md section + commented example in web/web-config.yml. |
| 6 | [optional] web: refactor handler_test.go to table-driven | Optional, labeled in commit subject. Folds the 4 existing test functions in handler_test.go into one TestHandler table. No behavior change: same assertions, same outcomes. Justified on maintainability (adding a 5th test of the same shape becomes one row), not on enabling this feature. Drop via git rebase -i if you'd rather keep the existing structure; the feature commits before it are unaffected. |
| 7 | web: validate DSCP flag value at resolve time | Bug fix discovered during end-to-end testing: kingpin.Int() accepted --web.dscp=999 and the kernel silently took the low byte. Adds a flag-level range check + test. Lands after commit 6 only because reordering would have needed an interactive rebase; this commit should be kept regardless of whether commit 6 is dropped. |

Backwards compatibility

  • New fields on FlagConfig are pointer-typed (nil by default). Existing struct literals that name only the old fields continue to compile.
  • New YAML block ip_socket_config: is optional. Configs that don't include it behave identically to before.
  • No behavior change for any existing flag/option.

Test plan

  • go test ./web/... clean on Linux.
  • Cross-compile clean for GOOS=darwin,freebsd,windows (CGO disabled, as expected).
  • End-to-end smoke tests against node_exporter (built with a go.mod replace directive pointing at this branch): flag-only, env-only, YAML-only, flag-overrides-YAML, validation rejects bad YAML (ipv4_ttl: 0, dscp: 64), validation rejects out-of-range flag (--web.dscp=999).

Design doc

docs/ip-socket-config-design.md (commit 1) is the design brief — covers the rationale for the type asymmetry (uint8 for TTL/Hop, int for DSCP because 0 is valid CS0), the inheritance asymmetry that drove the split between applyListenerOptions and applyConnOptions, the platform matrix, and the test coverage matrix. Includes an Appendix A specifically discussing the optional refactor commit and why it's separable.

Companion doc on the operator-facing side (in node_exporter): https://github.com/randomizedcoder/node_exporter/blob/ttl/docs/IP_SOCKET_CONFIG.md

🤖 Generated with Claude Code

Pre-implementation design brief covering the configurable IP socket
options feature: IPv4 TTL, IPv6 Hop Limit, and DSCP. Documents the
config surface (flag/env/YAML), the socket-option mechanism
(net.ListenConfig.Control + setsockopt), kernel inheritance behavior
on accepted connections, platform support matrix, and the test plan.

Companion to the operator-facing spec in node_exporter/docs/IP_SOCKET_CONFIG.md.

Signed-off-by: randomizedcoder dave.seddon.ca@gmail.com <dave.seddon.ca@gmail.com>

Add the configuration surface for a forthcoming feature that clamps IP
TTL / IPv6 Hop Limit and sets DSCP on the listening socket:

  * IPSocketConfig YAML struct under Config (ip_socket_config: block)
    with *uint8 ipv4_ttl, *uint8 ipv6_hop_limit, *int dscp. Pointer
    types let an absent YAML field be distinguished from an explicit
    zero (load-bearing for DSCP, since DSCP=0 is the valid CS0
    codepoint).
  * FlagConfig fields WebIPv4TTL (*uint8), WebIPv6HopLimit (*uint8),
    WebDSCP (*int) wired through kingpinflag.AddFlags as the three
    new flags --web.ipv4-ttl, --web.ipv6-hop-limit, --web.dscp with
    matching env vars WEB_IPV4_TTL, WEB_IPV6_HOP_LIMIT, WEB_DSCP.
  * validateIPSocketConfig enforces the configured ranges (TTL/hop
    1-255, DSCP 0-63). The uint8 field type rejects negatives and
    overflow at YAML-parse time, so the validator only checks the
    minimum for TTL fields.
  * Promote golang.org/x/sys to a direct require (will be used by
    the platform-specific socket option code in the next commit).

No behavior change yet -- ListenAndServe still ignores the new fields.
The wiring lands in a follow-up commit.
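To illustrate why pointer-typed fields matter here, the following is a minimal sketch of a config struct and range validator. The field shapes follow the commit message, but `validateSketch` and `ptr` are hypothetical names, YAML tags are omitted to avoid a third-party dependency, and the real validateIPSocketConfig may differ in structure and error text.

```go
package main

import (
	"errors"
	"fmt"
)

// IPSocketConfig mirrors the field types named in the commit message.
// A nil pointer means "not present in the YAML"; a non-nil pointer to 0
// means "explicitly set to 0", which is valid for DSCP (CS0).
type IPSocketConfig struct {
	IPv4TTL      *uint8
	IPv6HopLimit *uint8
	DSCP         *int
}

// validateSketch is a hypothetical stand-in for the PR's validator.
// uint8 already excludes negatives and values above 255, so only the
// lower bound needs checking for the TTL and hop-limit fields.
func validateSketch(c IPSocketConfig) error {
	if c.IPv4TTL != nil && *c.IPv4TTL == 0 {
		return errors.New("ipv4_ttl must be in the range 1-255")
	}
	if c.IPv6HopLimit != nil && *c.IPv6HopLimit == 0 {
		return errors.New("ipv6_hop_limit must be in the range 1-255")
	}
	if c.DSCP != nil && (*c.DSCP < 0 || *c.DSCP > 63) {
		return fmt.Errorf("dscp must be in the range 0-63, got %d", *c.DSCP)
	}
	return nil
}

// ptr is a small helper for building pointer-typed test values.
func ptr[T any](v T) *T { return &v }

func main() {
	fmt.Println(validateSketch(IPSocketConfig{}))                        // nothing configured
	fmt.Println(validateSketch(IPSocketConfig{DSCP: ptr(0)}))            // explicit CS0
	fmt.Println(validateSketch(IPSocketConfig{DSCP: ptr(64)}))           // out of range
	fmt.Println(validateSketch(IPSocketConfig{IPv4TTL: ptr(uint8(0))})) // TTL 0 rejected
}
```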

Design: docs/ip-socket-config-design.md
Signed-off-by: randomizedcoder dave.seddon.ca@gmail.com <dave.seddon.ca@gmail.com>

Wire the configuration added in the previous commit through to the
listener and to each accepted connection. The two halves of the work
are split because Linux does NOT inherit all options equally:

  * IP_TTL and IPV6_UNICAST_HOPS ARE inherited by accepted connections
    (accept(2), ip(7), ipv6(7)). Set them on the listening socket via
    net.ListenConfig{Control: applyListenerOptions} so the SYN-ACK and
    every subsequent packet on accepted connections carries the value.
  * IP_TOS and IPV6_TCLASS (DSCP, upper 6 bits) are NOT inherited.
    Accepted connections get the kernel default unless we explicitly
    setsockopt per connection. We do that via an ipSocketListener
    wrapper around the underlying net.Listener that calls
    applyConnOptions on every Accept.
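The per-connection half could look something like the sketch below. `dscpListener`, `withFD`, and `acceptedDSCP` are hypothetical names standing in for the PR's ipSocketListener and applyConnOptions; the sketch covers IPv4 only (an IPv6 path would presumably use IPV6_TCLASS), uses the standard-library `syscall` package, and assumes a Unix-like platform. It writes the DSCP shifted into the upper six bits with the ECN bits as zero; per the commit message, Linux manages the ECN bits itself for TCP sockets.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"syscall"
)

// dscpListener wraps a net.Listener and applies a DSCP value (0-63)
// to every accepted TCP connection.
type dscpListener struct {
	net.Listener
	dscp int
}

// Accept delegates to the wrapped listener, then sets IP_TOS on the new
// connection's file descriptor before handing it to the caller.
func (l dscpListener) Accept() (net.Conn, error) {
	conn, err := l.Listener.Accept()
	if err != nil {
		return nil, err
	}
	if tc, ok := conn.(*net.TCPConn); ok {
		err := withFD(tc, func(fd int) error {
			return syscall.SetsockoptInt(fd, syscall.IPPROTO_IP, syscall.IP_TOS, l.dscp<<2)
		})
		if err != nil {
			conn.Close()
			return nil, err
		}
	}
	return conn, nil
}

// withFD runs fn with the raw file descriptor behind sc.
func withFD(sc syscall.Conn, fn func(fd int) error) error {
	raw, err := sc.SyscallConn()
	if err != nil {
		return err
	}
	var fnErr error
	if err := raw.Control(func(fd uintptr) { fnErr = fn(int(fd)) }); err != nil {
		return err
	}
	return fnErr
}

// acceptedDSCP listens on loopback, dials in, accepts through the wrapper,
// and reads the DSCP back from the accepted connection.
func acceptedDSCP(dscp int) (int, error) {
	base, err := net.Listen("tcp4", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	ln := dscpListener{Listener: base, dscp: dscp}
	defer ln.Close()

	client, err := net.Dial("tcp4", base.Addr().String())
	if err != nil {
		return 0, err
	}
	defer client.Close()

	server, err := ln.Accept()
	if err != nil {
		return 0, err
	}
	defer server.Close()

	sc, ok := server.(syscall.Conn)
	if !ok {
		return 0, errors.New("accepted connection does not expose a raw socket")
	}
	var tos int
	err = withFD(sc, func(fd int) error {
		var e error
		tos, e = syscall.GetsockoptInt(fd, syscall.IPPROTO_IP, syscall.IP_TOS)
		return e
	})
	return tos >> 2, err
}

func main() {
	got, err := acceptedDSCP(16) // 16 is CS2
	fmt.Println("DSCP on accepted connection:", got, "err:", err)
}
```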

Listener flavors:
  * Regular TCP: net.ListenConfig.Control sets listener options; the
    listener is wrapped with ipSocketListener if DSCP is configured.
  * Systemd socket activation: applyListenerOptions is called post-bind
    via (*net.TCPListener).SyscallConn(); the listener is then wrapped
    if DSCP is configured.
  * VSOCK: no IP layer; if any IP option is configured we log an info
    line and proceed unchanged.

Platform support:
  * Linux + BSD family: real setsockopt implementation in
    socket_options_unix.go using golang.org/x/sys/unix.
  * Other (Windows, Plan 9, JS/Wasm): socket_options_other.go is a
    no-op that emits a single warn-level slog message when any option
    is configured.

Implementation detail: applyListenerOptions and applyConnOptions try
both v4 and v6 options when configured and silently swallow ENOPROTOOPT
on the inapplicable family. This lets us safely set v4+v6 options on
a dual-stack listener (e.g. [::]:9100) without inspecting the socket
family up front. ECN bits (lower 2 bits of the ToS byte) are NOT
touched -- the kernel manages them per-packet for ECN-capable TCP
(RFC 3168).
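The "swallow ENOPROTOOPT" behaviour described above can be sketched as a small wrapper. The name setIfApplicable appears in the test commit message, but the signature used here is an assumption, and the two error variables are illustrative stand-ins for what a real setsockopt call might return.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

// Illustrative errors, shaped like the results of a failed setsockopt.
var (
	errWrongFamily = os.NewSyscallError("setsockopt", syscall.ENOPROTOOPT)
	errBadValue    = os.NewSyscallError("setsockopt", syscall.EINVAL)
)

// setIfApplicable runs set and treats ENOPROTOOPT as "this option does not
// apply to this socket's address family", returning nil in that case.
// Any other error is passed through unchanged.
func setIfApplicable(set func() error) error {
	if err := set(); err != nil && !errors.Is(err, syscall.ENOPROTOOPT) {
		return err
	}
	return nil
}

func main() {
	fmt.Println(setIfApplicable(func() error { return errWrongFamily }))
	fmt.Println(setIfApplicable(func() error { return errBadValue }))
}
```

With a helper of this shape, the caller can attempt both the v4 and the v6 option on every socket and only surface genuine failures.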

The effective[T] generic helper resolves flag > YAML > default
precedence; resolveSocketOptions composes flag values with YAML
values from the (optional) web-config file.
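As a rough sketch of the precedence logic, a generic helper might look like this. The signature is an assumption: it takes pointer-typed flag and YAML values plus a sentinel meaning "flag not set" (per the other commit messages, -1 for DSCP and 0 for TTL/Hop Limit); the real effective[T] may be shaped differently.

```go
package main

import "fmt"

// effective resolves flag > YAML > unset. A flag counts as configured when
// it is non-nil and differs from the sentinel; a YAML value counts as
// configured when it is non-nil. The bool reports whether any source set it.
func effective[T comparable](flag, yaml *T, sentinel T) (T, bool) {
	if flag != nil && *flag != sentinel {
		return *flag, true
	}
	if yaml != nil {
		return *yaml, true
	}
	return sentinel, false
}

// ptr is a small helper for building pointer-typed values.
func ptr[T any](v T) *T { return &v }

func main() {
	// DSCP=0 from a flag is a configured value because the sentinel is -1.
	fmt.Println(effective(ptr(0), ptr(16), -1))
	// Flag left at its sentinel: the YAML value is used.
	fmt.Println(effective(ptr(-1), ptr(16), -1))
	// Nothing configured: the kernel default stays in place.
	fmt.Println(effective[int](ptr(-1), nil, -1))
}
```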

Design: docs/ip-socket-config-design.md
Signed-off-by: randomizedcoder dave.seddon.ca@gmail.com <dave.seddon.ca@gmail.com>

Two new test surfaces matching the toolkit's existing light test style:

  * 5 testdata YAML files (4 bad + 1 good) wired into the existing
    TestYAMLFiles table-driven test in tls_config_test.go. Bad files
    cover the validation rules: ipv4_ttl=0, ipv4_ttl=256 (uint8
    overflow at parse time), dscp=-1, dscp=64. Good file exercises a
    full ip_socket_config: block.
  * TestApplySocketOptions_Inheritance in a new
    socket_options_linux_test.go (//go:build linux) -- the load-bearing
    test for this feature. Nine subtests covering positive, boundary
    and corner cases: TTL=1 (security extreme), TTL=255 (boundary),
    DSCP=0 (corner: explicit 0 IS configured), DSCP=63 (boundary),
    mid-range values, all-options-combined, and dual-stack [::]:0.
    Each subtest builds the full listener stack (ListenConfig.Control
    + ipSocketListener wrapper), dials in, and verifies via
    getsockopt on both the listener FD and the accepted connection FD
    that the right values are in place. The accepted-connection check
    is what catches a regression in either the listener-side
    inheritance or the per-connection DSCP application.
  * TestEffective table-driven test for the effective[T] precedence
    helper, including the corner case that DSCP=0 from a flag IS
    treated as configured (because the sentinel for DSCP is -1, not 0).

ENOPROTOOPT is swallowed inside setIfApplicable so v4 and v6 options
can be set unconditionally on dual-stack listeners; the test confirms
this works for the [::]:0 case.

No new mocking infrastructure, no new go.mod deps, no testify -- the
patterns match handler_test.go (real net.Listen + dial-in) and
TestYAMLFiles (testdata + regex error matching).

Design: docs/ip-socket-config-design.md
Signed-off-by: randomizedcoder dave.seddon.ca@gmail.com <dave.seddon.ca@gmail.com>

Add operator-facing documentation for the IP socket-options feature:

  * docs/web-configuration.md gains an ip_socket_config: block in the
    YAML schema (with placeholder fields per the toolkit's [<type>]
    convention) and an "About ip_socket_config" section describing
    listener-flavor support (TCP / systemd / VSOCK), platform support
    (Linux first-class, BSD/Darwin compile-only, Windows/others no-op
    with warning), and operator notes (why TTL=0 is rejected; why
    DSCP=0 is a valid configured value; dual-stack behavior).
  * web/web-config.yml gains a commented-out example block showing
    TTL=2, hop_limit=2, dscp=16 (CS2).

CHANGELOG is intentionally not touched: the toolkit doesn't track a
CHANGELOG file (release notes come from GitHub releases).

Design: docs/ip-socket-config-design.md
Signed-off-by: randomizedcoder dave.seddon.ca@gmail.com <dave.seddon.ca@gmail.com>

Fold the four separate test functions in handler_test.go
(TestBasicAuthCache, TestBasicAuthWithFakepassword, TestByPassBasicAuthVuln,
TestHTTPHeaders) into a single TestHandler driven by a small table. Each
test's distinct logic stays in its own per-case function; the shared
server lifecycle (build http.Server, launch ListenAndServe in a goroutine,
waitForPort, shutdown) is extracted into withHandlerServer so it isn't
duplicated four times.

No behavior change. Every assertion that ran before still runs, and the
set of passing test outcomes is identical. The subtest names (BasicAuthCache,
BasicAuthWithFakepassword, ByPassBasicAuthVuln, HTTPHeaders) map 1:1 to
the original function names so git log -p review is easy.

This commit is OPTIONAL. It is justified on maintainability (adding a
fifth test of the same shape becomes one table row instead of a new
~30-line function) rather than on enabling any feature. Reviewers who
prefer to keep the four separate test functions can drop this commit
via `git rebase -i upstream/master` and the feature commits before it
will be unaffected. See Appendix A of docs/ip-socket-config-design.md
for the full design rationale.

Signed-off-by: randomizedcoder dave.seddon.ca@gmail.com <dave.seddon.ca@gmail.com>

Plug a gap in DSCP validation. The YAML-path validator
validateIPSocketConfig was already running inside getConfig, but the
flag path bypassed it: kingpin.Int() accepts any integer, so
--web.dscp=999 silently flowed all the way into setsockopt where the
kernel took the low byte of (999 << 2) and produced an unrelated DSCP
value (e.g. 999 << 2 = 0xf9c -> low byte 0x9c -> DSCP 39).

Fix: add an explicit range check in resolveSocketOptions after the
effective[T] resolution, where flag- and YAML-sourced values converge.
TTL/Hop-Limit don't need this guard -- kingpin.Uint8() already rejects
negative and >255 at parse time, and the 0 sentinel means "not
configured".
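The truncation arithmetic and a resolve-time guard can be modelled in a few lines. This is only a model of the behaviour the commit message describes, not the kernel's or the PR's code; `tosLowByte`, `dscpFromTOS`, and `checkResolvedDSCP` are hypothetical names, and the -1 "unset" sentinel is taken from the commit messages.

```go
package main

import "fmt"

// tosLowByte shifts a DSCP value into the upper six bits of the ToS byte
// and keeps only the low 8 bits, as the commit message says happened.
func tosLowByte(dscp int) uint8 { return uint8(dscp << 2) }

// dscpFromTOS extracts the DSCP (upper six bits) from a ToS byte.
func dscpFromTOS(tos uint8) int { return int(tos >> 2) }

// checkResolvedDSCP rejects anything outside 0-63, except the -1 sentinel
// that means "not configured".
func checkResolvedDSCP(dscp int) error {
	if dscp == -1 {
		return nil
	}
	if dscp < 0 || dscp > 63 {
		return fmt.Errorf("dscp must be in the range 0-63, got %d", dscp)
	}
	return nil
}

func main() {
	tos := tosLowByte(999)
	fmt.Printf("999 -> ToS 0x%02x -> DSCP %d\n", tos, dscpFromTOS(tos))
	for _, v := range []int{0, 63, -1, 999, -5, 64} {
		fmt.Println(v, checkResolvedDSCP(v))
	}
}
```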

Test: TestResolveSocketOptions_FlagValidation covers the valid range
(0, 63), the sentinel case (-1 means unset), and the rejected cases
(999, -5, 64).

Signed-off-by: randomizedcoder dave.seddon.ca@gmail.com <dave.seddon.ca@gmail.com>