Skip to content

(8) ACLs#1565

Draft
daniel-noland wants to merge 33 commits into
pr/daniel-noland/threading-rewritefrom
pr/daniel-noland/cascade-acl
Draft

(8) ACLs#1565
daniel-noland wants to merge 33 commits into
pr/daniel-noland/threading-rewritefrom
pr/daniel-noland/cascade-acl

Conversation

@daniel-noland
Copy link
Copy Markdown
Collaborator

Introduces two new crates, scoped narrowly to "1Hz ACL updates" so we can develop and experiment with the cascade and update logic without the 6-10k-line LSM avalanche described in the rfc-lsm branch.

dataplane-cascade: head + frozen[] + tail with multi-writer head absorption via the Upsert trait (commutative+associative folding). Slot-published Cascade with rotate (freeze head, install new) and compact (fuse frozen into tail via MergeInto). Snapshot walks head -> frozen[] -> tail and stops at the first definitive answer. Generation newtype is in place; lookup_at / compact_through exist but aren't load-bearing yet at 1Hz update rate. DrainEvent / subscribe are behind a feature flag, off by default.

dataplane-acl: software ACL classifier wrapping the cascade. AclHead is a Mutex<BTreeMap<Priority, AclRule>>; AclFrozen is a priority-sorted Vec; AclTail is structurally identical (room to swap a richer representation later). Compaction merges sealed layers into the tail with priority dedup.

Deferred for follow-up PRs (already worked out on rfc-lsm but not shipping yet): mat facade crate, mat-runtime (PolicyGenAllocator, ManagedCascade), mat-state-sync (PeerDedup), pressure regimes / try_reset, FlowOrigin / HasOrigin, conntrack-shape cascade.

See .scratch/lsm.md and .scratch/mat-pipeline-rfc/0001-mat-pipeline.md for the broader design this is a thin slice of.

daniel-noland and others added 30 commits May 22, 2026 17:22
Sweep direct `use std::sync::{Arc, Mutex, RwLock, atomic::*}` imports
across the workspace to `concurrency::sync` so loom/shuttle test builds
can route through instrumented primitives via one feature flip.

Two enforcement layers:

  * `clippy.toml` extends `disallowed-types` for the lock primitives.
    parking_lot's lock types are distinct concrete types, so clippy
    sees through the `concurrency::sync` re-export without flagging
    legitimate uses.
  * `.semgrep/rules/no-std-sync-direct.yaml` covers the rest (`Arc`,
    `Weak`, atomics, `LazyLock`, `OnceLock`, `Once`, `Barrier`,
    `Condvar`) where clippy's alias resolution can't distinguish the
    facade re-export from `std::sync`. The `concurrency` crate and its
    tests are exempt by path.

`mgmt/tests/reconcile.rs` keeps a direct `std::sync::Mutex` because
bolero's `catch_unwind` needs `RefUnwindSafe`, which parking_lot's
`Mutex` doesn't impl. Documented inline with `clippy::disallowed_types`
allow + `nosemgrep` annotation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
`std::thread::Builder::spawn_scoped` is inherent on std but missing on
the loom and shuttle Builders; both ship `Scope::spawn` instead. Add a
`concurrency::thread::BuilderExt` trait with one method:

  * std: forwards to the inherent `Builder::spawn_scoped` via
    fully-qualified call (Rust's method resolution prefers the
    inherent, so the trait impl is dead but kept for symmetry).
  * shuttle / loom: discards advisory Builder config, delegates to
    `Scope::spawn`, wraps the infallible return in `Ok` to match
    std's `io::Result` signature.

`use concurrency::thread::BuilderExt;` lets call sites write
`builder.spawn_scoped(scope, f)` under every backend. Used by the
kernel driver's named scoped threads in a follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Collapse the chained `shuttle_dfs -> shuttle_pct -> shuttle` features
into a single `shuttle` feature backed by `shuttle::PortfolioRunner`.
The runner drives `RandomScheduler` and `PctScheduler` in parallel;
any scheduler finding a counterexample fails the test
(`stop_on_first_failure = true`). `shuttle_dfs` becomes an additive
opt-in that adds `DfsScheduler` to the same portfolio.

`stress.rs` now has one shuttle arm instead of three, and
`#[concurrency::test]` emits one leaf per backend (`loom` / `shuttle`)
instead of three shuttle variants. Workspace consumers (`nat`,
`flow-entry`) and CI (`dev.yml`) drop the `shuttle_pct` step; the
existing `shuttle` step covers Random + PCT in one pass.

Tests previously gated `not(feature = "shuttle_pct")` to opt out of
single-threaded bodies that PCT panics on are rewritten to
`not(feature = "shuttle")` since PCT now runs in every shuttle build.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Tier A of the std::thread sweep that complements the `concurrency::sync`
facade migration. Swap `use std::thread` to `use concurrency::thread`
in two test modules whose tests are candidates for
`#[concurrency::test]` conversion:

  * `routing/src/fib/test.rs` -- prerequisite for the FIB race-test
    conversion later in the stack.
  * `dpdk/src/acl/mod.rs::classify_concurrent_arc_shared` -- import
    swap only; the test runs under real DPDK EAL so the macro
    conversion is deferred.

Production threading sites (`dpdk/src/lcore.rs`, `mgmt/src/processor/
launch.rs`, `routing/src/router/rio.rs`, `dataplane/src/statistics`,
`test-utils`) are left alone -- they need real OS threads and never
compile under loom/shuttle. A later sweep adds clippy/semgrep
enforcement once production sites are also routed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Tier B of the std::thread sweep. `ThreadPortMap` keys a per-thread
`RwLock<HashMap<ThreadId, _>>` by `std::thread::current().id()`. Each
backend ships its own `ThreadId`, so a std-typed map would silently
work in production while loom/shuttle key the table by their own
thread identity. Route the import and call sites through
`concurrency::thread` so the key tracks the active backend.

No behavioural change under the default backend. Prerequisite for any
future loom/shuttle exercise of the NAT allocator.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Replace the hand-rolled `shuttle::check_random(... 100)` wrappers in
`shuttle_tests` with `#[concurrency::test]`, which routes bodies
through `concurrency::stress` (loom `model`, shuttle
`PortfolioRunner`).

The module is renamed `concurrency_tests` and gated to `cfg(any(feature
= "shuttle", feature = "loom"))`. `FlowTable::insert` spawns a tokio
task for the flow timer, which would panic without a running runtime;
the existing `start_timer` bypass under shuttle is extended to loom.
Tokio-driven coverage of `insert` stays in `std_tests`.

`test_flow_table_timeout` is dropped from the model-checker mod: it's
single-threaded (PCT rejects), and `std_tests` already has the
authoritative `#[tokio::test(start_paused = true)]` version.

Adds `loom = ["concurrency/loom"]` to `flow-entry/Cargo.toml` so the
macro-emitted cfg arm resolves to a known feature.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Move `test_fib_removals` and `test_leftright_destroy_race_simple` out
of the `#[concurrency_mode(std)]` block into a sibling
`concurrency_tests` module that runs through `#[concurrency::test]`:
default backend smoke run, `loom::model` under loom, shuttle's
PortfolioRunner under shuttle.

The heavy fuzz loops (`test_concurrency_fib` /
`test_concurrency_fibtable`) stay on std -- their 100k+ packet
iteration counts are TSAN-calibrated, not for per-iteration
model-checking cost.

Iteration counts are tuned per backend via `cfg_select!`: 5 rounds
under loom/shuttle (vs 1000 on std), with a fixed reader/worker budget
under the model checkers so unbounded poll loops don't trip shuttle's
`max_steps` ceiling. `test_packet` is inlined because the original
lives in the std-gated `mod tests` and is invisible under
loom/shuttle.

Add `loom`, `shuttle`, `shuttle_dfs` features to `routing/Cargo.toml`
so the macro-emitted cfg arms resolve to known features.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Bump `concurrency::stress`'s shuttle `Config::stack_size` from the
default 32 KiB to 4 MiB. Shuttle wraps each atomic/lock primitive
with bookkeeping that pushes per-instance size into the 100-byte
range (an `AtomicBool` is ~100 bytes under shuttle), so any
non-trivial atomic-heavy body blows through the default. The
historical workaround was per-call `shuttle::Config` overrides at
1 MiB (notably in NAT's allocator tests).

One number in the dispatcher kills the per-test knob. 4 MiB carries
the heaviest workspace consumer (NAT allocator's per-block atomic
arrays) with headroom; the cost is `N workers * 4 MiB` per stress
iteration, well below CI memory pressure.

`shuttle::Config` is `#[non_exhaustive]`, so written as a mutation
of `Config::default()`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Now that `concurrency::stress` carries a 4 MiB shuttle stack and the
PortfolioRunner runs Random + PCT in parallel, the custom
`shuttle_config` / `run_shuttle_random` / `run_shuttle_pct`
scaffolding in `tests_shuttle` is redundant. Replace with a single
`mod concurrency_tests` that flips each test to `#[concurrency::test]`:

  * `test_concurrent_allocations_two_ips` (was
    `..._without_shuttle`) -- two threads against distinct source IPs;
    smoke run on default backend, full coverage on
    `--features shuttle`.
  * `test_concurrent_allocations_three_workers` (was
    `..._shuttle_random` + `_pct`, collapsed) -- portfolio runs both
    schedulers in one invocation.
  * `test_ensure_shuttle_works` -- gated to model-checker backends
    only; the deliberate race only reaches the failing schedule under
    a real scheduler, the default backend's one-shot run is
    non-deterministic.

Drops the helpers plus the `Arc` / `thread` imports in `std_tests`
that they pulled in. Adds `loom = ["concurrency/loom"]` to
`nat/Cargo.toml` so the macro-emitted cfg arm resolves.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Make `just features=loom test` build and run end-to-end across the
workspace. Earlier commits had loom working only on the `concurrency`
crate; everything else failed to compile or crashed at runtime with
stack overflow, Arc leak, or DashMap destructor panics. Bundled into
one commit because each fix was discovered by running the previous
one.

## concurrency

  * `fn sleep(_: Duration)` shim under loom (loom 0.7 doesn't model
    time; yields to the scheduler so the call still acts as a
    schedule point).
  * Enumerate `loom::thread` re-exports and shadow `spawn` with a
    4 MiB default stack. Loom 0.7's default coroutine stack is 4 KiB,
    which overflows trivially under atomic-heavy `concurrency::sync`
    types.
  * `stress` under loom wraps the body in a 4 MiB
    `Builder::spawn` so the main 4 KiB coroutine just spawns and
    joins. Costs one of loom's five thread slots.
  * `loom_scope::Scope::spawn` routes through `super::spawn` for the
    same default.
  * `Slot::load()` / `SlotOption::load()` helpers used by
    `common::cliprovider` to drop a redundant `load_full()` clone.

## nat

  * `cfg_attr`-gate `#![feature(arbitrary_self_types)]`: loom wraps
    `Arc<T>` in a facade newtype that isn't a blessed self-receiver,
    so `self: Arc<Self>` methods on `AllocatedIp` /
    `AllocatedPortBlock` need the unstable feature there.
  * `#[concurrency_mode(loom)]` no-op `shuffle_slice` (loom needs
    determinism for replay; shuffle is allocation-order heuristic,
    not correctness).
  * Gate `concurrency_tests` off loom: the facade's `Weak` shim holds
    a strong clone of the `Arc`, so the allocator's
    `Weak::upgrade().is_none()` liveness signal never fires and
    loom's `Arc leaked` assertion catches it.

## flow-entry

  * Gate `concurrency_tests` to shuttle only: `FlowTable`'s internal
    `DashMap` panics in loom's end-of-execution cleanup (sharded
    `RwLock`s don't fit loom's strict lifecycle accounting).

## routing

  * Gate `fib::test::concurrency_tests` off loom: the `left_right`
    epoch state space is too large for exhaustive search to terminate
    in reasonable time.

## dataplane

  * The binary builds an `Arc<dyn Fn(...) ...>` trait-object closure
    in `packet_processor::setup_internal`, which needs `CoerceUnsized`
    on the concrete `Arc`. Loom 0.7's `Arc` doesn't carry that trait,
    and the facade newtype can't add it. Gate the bin out of loom
    builds: extract the body to `dataplane/src/runtime.rs` and leave
    `main.rs` with a stub `main` under loom that panics if invoked.
    Library crates still get loom coverage through feature
    propagation.
  * `dataplane/src/drivers/dpdk.rs` switches `use crate::CmdArgs` to
    `use args::CmdArgs;` to follow the new module layout.

CI's loom step in `.github/workflows/dev.yml` still scopes to
`--package=dataplane-concurrency` because the loom-incompatible tests
across the workspace are now cfg-gated rather than package-filtered;
the package scope is no longer load-bearing for the test invocation,
only for cargo's feature unification.

Default and `--features shuttle` builds unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Leftover from the superseded dis-guard exploration; nothing references
it. Public-API drift in `concurrency::slot` for no benefit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
The import-only enforcement let `static X: std::sync::LazyLock<T> =
std::sync::LazyLock::new(...)` slip through, and one such site already
existed in `config/src/external/overlay/vpcpeering.rs`. Extend the
rule with a multi-line regex backstop that matches the facade-managed
type names in any expression position, with a leading-comment
lookahead so rustdoc intra-doc links (`/// [std::sync::Arc]`) don't
false-positive. Also expand the grouped-import regex to span multiple
lines.

Convert the offending FQN to `concurrency::sync::LazyLock`. Move the
deliberate `mgmt/tests/reconcile.rs` `nosemgrep:` onto the same line
as the `std::sync::Mutex::new` it suppresses, and annotate the
intentional `std::sync::Arc` in `concurrency::stress` (shared *across*
`loom::model` invocations, so it must remain a std Arc).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
The previous loom step ran `just test concurrency` only, which left
the workspace-wide loom compile (the whole point of the facade's
local `Weak<T>` shim and `Arc::downgrade`) unprotected. Add a
`cargo check`-equivalent step ahead of the test run so a regression
in any consumer crate fails CI directly. Tests stay scoped to the
concurrency crate -- model-checking the whole workspace under loom
is intractable. Update the inline comment to reflect the new reality.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Add the plumbing that lets cargo execute cross-compiled tests under
qemu-user when the host architecture doesn't match the target:

  * `scripts/test-runner.sh` -- thin wrapper.  When `MIRI_SYSROOT` is
    set we delegate to `.cargo-miri-wrapped` so a miri run on a
    matching host arch still goes through the miri interpreter
    instead of running natively under qemu.  Otherwise: native exec
    when target == host, else `qemu-${target_machine}`.
  * `.cargo/config.toml` -- runner entries for the four cross triples
    we ship (x86_64 / aarch64 × gnu / musl).  Explicit triples (not
    cfg-patterns) because cargo miri injects a `cfg(all())` runner
    and refuses to disambiguate between two cfg-pattern matches; the
    explicit form wins method-resolution-style and the script's
    `MIRI_SYSROOT` branch handles the miri case from inside.
  * `default.nix` -- `qemu-user` joins the dev shell so the wrapper
    has `qemu-x86_64` / `qemu-aarch64` on PATH locally and in CI.

No CI behaviour change yet -- this just makes `cargo test --target
<cross-triple>` work via emulation when run by hand.  The CI step
that actually exercises the path lands in the next commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Add a `test` step to the cross job that runs the full nextest suite
against the cross-compiled archive.  Each cross binary dispatches
through `scripts/test-runner.sh` (registered as the cargo runner in
`.cargo/config.toml`) which delegates to `qemu-${target_machine}`
when the host architecture doesn't match.

Gates:

  * Only runs on `pull_request` runs with `ci:+cross` or
    `ci:+cross/full` on the labels.  push / merge_group cross legs
    stay build-only; ISA emulation pays a real wall-clock cost and
    we don't want to slow the merge queue.
  * Gated to `matrix.recipe.args == 'dataplane'` so the test pass
    isn't duplicated for the `frr.dataplane` row, which differs from
    the `dataplane` row only in the container recipe, not in the
    cross target.

`ci:+cross` (today's "run the cross job at all" label) implies
"include the qemu test pass" -- there is currently no way to
schedule cross/full builds without a label, so the same gate
serves both modes.  Splitting them out (e.g. opting bump.yml into
cross/full automatically) is a follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
The test hard-codes `3` as a "non-multiple of `RESULTS_MULTIPLIER`"
which assumes the multiplier is 4 (x86_64).  Under cross-compiled
aarch64 builds the wrapper's `RESULTS_MULTIPLIER` constant binds to
1 -- the validator's "not-a-multiple" branch becomes unreachable
because every positive integer is trivially a multiple of 1, so the
validator returns `Ok` and the test panics on the `matches!` assert.

Upstream DPDK defines `RTE_ACL_RESULTS_MULTIPLIER` as
`XMM_SIZE / sizeof(uint32_t)`, which should be 4 on every supported
ISA.  Either bindgen on the cross sysroot doesn't see the right
`XMM_SIZE` typedef on aarch64 or the cross headers ship a divergent
`rte_vect.h`; needs a follow-up investigation.

Skip on aarch64 with a TODO so the rest of the cross-aarch64 test
surface can run green under qemu-user.  This is the only known
target-specific divergence in the dpdk binding; the rest of the
sweep passes cleanly under musl+qemu.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Introduce a single `emulated` cfg as the umbrella over both
emulation backends in the test surface:

  * `miri.just`: append `--cfg=emulated` to `RUSTFLAGS` so miri
    invocations set it.
  * `nix/profiles.nix`: set `--cfg=emulated` whenever `for-tests` is
    true and the target arch differs from the build host's
    (`is-emulated-test`).  Today this covers cross-arch test builds
    that run under qemu-user on the lab runners.  `--check-cfg=cfg
    (emulated)` is set unconditionally so unused branches don't
    trip `unexpected_cfgs`.
  * `default.nix`: plumb the build host's arch
    (`stdenv.hostPlatform.parsed.cpu.name`) through to
    `profiles.nix` as `host-arch`, so `is-emulated-test` compares
    target against actual host rather than hard-coded \`"x86_64"\`.
  * `.cargo/config.toml`: same `--check-cfg=cfg(emulated)` for the
    native dev build path.

Sweep existing `cfg_attr(miri, ignore)` / `cfg(miri)` / `cfg_select!
{ miri => N }` sites that apply equally to qemu-user:

  * `routing/src/router/rio.rs`, `routing/src/frr/test.rs`,
    `routing/src/atable/resolver.rs`, `cli/src/cliproto.rs`: tests
    that bind Unix domain sockets / read kernel-state files now
    skip under any emulation backend, not just miri.  qemu-user's
    epoll readiness emulation has the same gaps that justified the
    original miri skips.
  * `routing/src/fib/test.rs`, `net/src/packet/hash.rs`,
    `flow-entry/src/flow_table/nf_lookup.rs`,
    `left-right-tlcache/src/lib.rs`, `k8s-intf/src/bolero/support
    .rs`, `config/src/utils/collapse.rs`: per-arm iteration counts
    in `cfg_select!` are now keyed by `emulated` rather than `miri`.
    qemu-user runs at ~5-10x slowdown vs native and the original
    miri count (which targeted miri's much steeper slowdown) is
    still a sensible upper bound for qemu-user too.

Pair this with two CI/runtime side changes:

  * `.config/nextest.toml`: new `cross-qemu` nextest profile sets a
    `slow-timeout = { period = "60s", terminate-after = 5 }` so a
    qemu hang gets killed and surfaces as a TIMEOUT in the report
    instead of wedging the whole run.  `fail-fast` is permissive
    so we collect the full list of qemu-affected tests in one go.
  * `.github/workflows/dev.yml`: cross-job `test` step is
    restricted to `matrix.libc == 'musl'` because gnu's libgcc_s
    unwinder mis-handles qemu-user's emulated signal frames on
    aarch64 (every panic-unwind path turns into SIGABRT).  musl's
    LLVM libunwind walks those frames correctly, so musl legs run
    the full suite clean; gnu legs stay build-only until/unless we
    fix the gnu unwinder story.
  * `.gitignore`: ignore `**/qemu_*.core` so the SIGABRT cores
    qemu drops when running gnu cross binaries locally don't show
    up in `git status`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
It is much too slow to print to stdout like this under
qemu-user or miri.

Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Two halves of one logical change:

  * `.github/workflows/dev.yml`: extend the cross job's `if:` to fire
    on `ci:+cross/full` in addition to `ci:+cross`, so a PR labeled
    only with `ci:+cross/full` (no `ci:+cross`) still gets cross.
  * `.github/workflows/bump.yml`: auto-apply `ci:+cross/full` to the
    weekly cargo-upgrades PR.

The weekly cargo-upgrades PR is the right place to catch cross-arch
regressions introduced by transitive dep churn (a crate dropping an
aarch64 target, changing alignment, etc.).  Without an opt-in label
the cross job stays build-only on PR runs, so we add
\`ci:+cross/full\` alongside the existing \`automated\` and
\`dependencies\` labels.  Today that label triggers the qemu-user
test step on the existing 4-leg cross matrix; once the matrix
scope split lands, the same label will expand the matrix to the
full hardware x libc sweep without further changes here.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Cross stays advisory (`continue-on-error: true`) on every PR, push,
and merge_group run, so leg flakes never block merge.  That's the
right default for interactive PRs and the merge queue, but it makes
genuine cross failures easy to miss on the weekly cargo-bump PR
(auto-labeled `ci:+cross/full`), where catching upstream-driven
aarch64 regressions is the whole point.

Add a sticky-comment surfacing step:

  * New composite action `.github/actions/sticky-pr-comment` --
    create-or-update a PR comment keyed by an HTML-comment marker
    (so subsequent runs find and update the same comment instead of
    spamming the thread).

  * New steps in the `summary` job that read the cross matrix's
    real per-leg conclusions from the Actions REST API (via
    `actions/github-script`) and, if any leg failed, post the
    sticky comment with a link back to the failing run.  The
    summary job grows `pull-requests: write` (for the comment) and
    `actions: read` (for the API call).

    We can't gate on `needs.cross.result` here: at the job level,
    `continue-on-error: true` makes that always report `success` to
    dependents, which is exactly the property we want for merge
    gating but which also hides the failure from this step.  Reading
    the underlying job results sidesteps the masking.

The alternative -- making cross blocking when `ci:+cross/full` is
present -- creates an attribution trap: a non-labeled PR that
silently regresses cross would land on main, and the next
cross/full-labeled bump PR would inherit the broken main and read
as "the bump caused the regression."  A sticky comment gives
reviewers a loud in-PR signal without that misattribution risk and
preserves the "leg-flake doesn't block merge" property uniformly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Signed-off-by: github-actions[bot] <224724778+hedgehog-dataplane-update[bot]@users.noreply.github.com>
Signed-off-by: github-actions[bot] <224724778+hedgehog-dataplane-update[bot]@users.noreply.github.com>
name             old req compatible latest new req
====             ======= ========== ====== =======
netgauze-bgp-pkt 0.12.0  0.12.1     0.12.1 0.12.1

Signed-off-by: github-actions[bot] <224724778+hedgehog-dataplane-update[bot]@users.noreply.github.com>
name             old req compatible latest new req
====             ======= ========== ====== =======
netgauze-bmp-pkt 0.12.0  0.12.1     0.12.1 0.12.1

Signed-off-by: github-actions[bot] <224724778+hedgehog-dataplane-update[bot]@users.noreply.github.com>
name       old req compatible latest  new req
====       ======= ========== ======  =======
serde_json 1.0.149 1.0.150    1.0.150 1.0.150

Signed-off-by: github-actions[bot] <224724778+hedgehog-dataplane-update[bot]@users.noreply.github.com>
Drop manual_assert_eq, redundant references in formatting, and
implicit format-arg captures flagged under `-D warnings`. `#[allow(dead_code)]` the unused egress helpers (kept for the future
egress NF rework). Precursor for the threading rewrite stack so
`just lint` is green throughout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
The DPDK driver was stubbed out behind `#![cfg(feature = "dpdk")]` and
the dispatch in `runtime.rs` already routes the "dpdk" argument through
`todo!()`. Remove the empty `drivers/dpdk` module, the `dpdk` cargo
feature, the `dpdk-sysroot-helper` build dependency, and the now-empty
`build.rs`.

The DPDK driver supervisor will be re-introduced from scratch when the
DPDK threading model lands; the present scaffolding is misleading
because it implies the driver exists in some form.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Add a `dataplane-lifecycle` crate with `Shutdown` and `Subsystem`
primitives, signal-handler installation, and a process-wide shutdown
watchdog.

`Shutdown` bundles a root `CancellationToken` and one `Subsystem` per
long-lived component (workers, router, mgmt, metrics). Each subsystem
exposes a per-subsystem cancel token, a `TaskTracker`, and a shared
fatal flag. Subsystems drain in topological order with per-subsystem
deadlines; the detached watchdog enforces an absolute upper bound on
total shutdown duration.

No consumers yet -- wired up in follow-on commits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Plumb lifecycle Subsystems into the routing crate as the first step of
the threading rewrite. The rest of `main`'s shutdown signaling (ctrlc +
mpsc<i32> + start_mgmt + MetricsServer + old DriverKernel::start) stays
in place; follow-on commits migrate each.

- `Router::new` takes `(mgmt, mgmt_handle, router)` Subsystems +
  runtime handle. Plumbed through `packet_processor::start_router`.
- `start_rio` takes `&Subsystem`; the IO loop observes its cancel
  between poll cycles (worst-case exit latency = 1s poll). Adds an
  ExitGuard so panic-unwind or unexpected loop exit reports fatal.
- `RouterCtlMsg::Finish` removed; `RioHandle::finish` becomes idempotent.
- `bmp::spawn_background` spawns onto the caller-provided runtime handle
  tracked under `mgmt`; no more leaked runtime.
- `runtime.rs` builds a multi-thread mgmt runtime (only BMP tenants it
  in this commit) and a `Shutdown` for plumbing into Router::new.
- `dataplane` and `mgmt` Cargo.toml gain `lifecycle` dep.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
daniel-noland and others added 3 commits May 23, 2026 15:17
Move mgmt + metrics from dedicated OS threads with private runtimes to
tenants of the multi-thread mgmt runtime introduced in the prior commit.
The kernel driver still uses the legacy ctrlc + mpsc<i32> path; the
final unification commit migrates that.

- `run_mgmt` (renamed from `start_mgmt`): synchronous init on the
  caller-provided handle, then spawns the three long-lived tasks (config
  processor, status updater, config watcher) via
  `Subsystem::spawn_fatal_on_exit` so their unexpected exit flips fatal.
  Init observes `mgmt.root_token()` so SIGINT during k8s retries returns
  `LaunchError::Cancelled` within cancel latency.
- `LaunchError::Cancelled` is a clean-shutdown signal; the call site in
  `runtime.rs` forwards the existing mpsc stop channel with code 0 so
  the legacy shutdown path stays consistent.
- `spawn_metrics` replaces `MetricsServer`: HTTP endpoint, upkeep ticker,
  and stats collector all spawn onto `mgmt_handle` tracked under
  `metrics`. Uses plain `spawn_on` (not `spawn_fatal_on_exit`) — a dead
  metrics endpoint should not take down the dataplane.
- Drop stale `LaunchError` variants no longer constructed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Replace the last legacy shutdown signaling (ctrlc handler + mpsc<i32>
exit code + dedicated controller thread) with the single
`lifecycle::Shutdown` path, and migrate the kernel driver to scoped
threads with cancellation observation. After this commit there is one
signaling path: SIGINT/SIGTERM -> shutdown.root, or any subsystem's
report_fatal -> shutdown.root, with the watchdog as the absolute
upper bound.

- `main`: install `spawn_signal_handler` and `spawn_shutdown_watchdog`,
  run everything inside `concurrency::thread::scope`, block on
  `root.cancelled()`, then drain subsystems in canonical order
  (workers -> router -> metrics -> mgmt). Exit code from `is_fatal()`.
- `DriverKernel::start`: takes `&Scope` and `&Subsystem`; workers spawn
  via `spawn_scoped` with an `ExitGuard` Drop pattern that reports
  fatal on panic-unwind, early `?`-return, and unexpected normal exit.
  Reader loops observe cancel between reads. Supervisor joins-and-logs.
- Drop `dataplane/src/drivers/tokio_util.rs` and its
  `run_in_local_tokio_runtime` helper (inlined where needed).
- Drop `ctrlc` and `mio` from `dataplane` dependencies; drop `ctrlc`
  from the workspace.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Introduces two new crates, scoped narrowly to "1Hz ACL updates" so we
can develop and experiment with the cascade and update logic without
the 6-10k-line LSM avalanche described in the rfc-lsm branch.

`dataplane-cascade`: head + frozen[] + tail with multi-writer head
absorption via the `Upsert` trait (commutative+associative folding).
Slot-published `Cascade` with `rotate` (freeze head, install new) and
`compact` (fuse frozen into tail via `MergeInto`). `Snapshot` walks
head -> frozen[] -> tail and stops at the first definitive answer.
`Generation` newtype is in place; `lookup_at` / `compact_through`
exist but aren't load-bearing yet at 1Hz update rate. `DrainEvent` /
`subscribe` are behind a feature flag, off by default.

`dataplane-acl`: software ACL classifier wrapping the cascade.
`AclHead` is a `Mutex<BTreeMap<Priority, AclRule>>`; `AclFrozen` is a
priority-sorted Vec; `AclTail` is structurally identical (room to
swap a richer representation later). Compaction merges sealed layers
into the tail with priority dedup.

Deferred for follow-up PRs (already worked out on rfc-lsm but not
shipping yet): mat facade crate, mat-runtime (PolicyGenAllocator,
ManagedCascade), mat-state-sync (PeerDedup), pressure regimes /
try_reset, FlowOrigin / HasOrigin, conntrack-shape cascade.

See `.scratch/lsm.md` and `.scratch/mat-pipeline-rfc/0001-mat-pipeline.md`
for the broader design this is a thin slice of.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@daniel-noland daniel-noland force-pushed the pr/daniel-noland/threading-rewrite branch 6 times, most recently from 800f5dc to 2021414 Compare May 31, 2026 03:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant