Speed up flow tables and reclaim their memory #995

Open
MorganaFuture wants to merge 2 commits into oxidecomputer:master from MorganaFuture:ft-hashbrown

MorganaFuture commented May 28, 2026

Two related improvements toward #779, both internal to FlowTable.

Faster lookups. Swaps the `BTreeMap` backing `FlowTable` for a `hashbrown::HashMap`, giving expected O(1) lookups at the flow counts front-facing guests reach in practice. The benchmarks gathered on #779 put `foldhash` gets at ~33ns versus ~316ns for `BTreeMap` at ~4M entries. Each table seeds its own `foldhash` state from the kernel PRNG (`random_get_pseudo_bytes`), so remote traffic can't predict the hash sequence and grind it into worst-case collisions; std/test builds use a fixed seed for reproducibility. `dump` now sorts its output so flow listings (and the `opteadm` view built from them) stay stable despite the unordered map.
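To make the seeding and sorted-dump ideas concrete, here is a minimal std-only sketch. It is not the code in this PR: `SeededState` and `ToyFlowTable` are hypothetical names, std's `HashMap` and `DefaultHasher` stand in for `hashbrown` and `foldhash`, and the seed is passed in by the caller rather than drawn from a kernel PRNG.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{BuildHasher, Hasher};

/// Hypothetical stand-in for a seeded foldhash state: feeds a per-table
/// seed into the hasher before any key bytes are hashed.
struct SeededState {
    seed: u64,
}

impl BuildHasher for SeededState {
    type Hasher = DefaultHasher;

    fn build_hasher(&self) -> DefaultHasher {
        let mut h = DefaultHasher::new();
        h.write_u64(self.seed);
        h
    }
}

/// Toy flow table keyed by (source port, destination port).
struct ToyFlowTable {
    map: HashMap<(u16, u16), u64, SeededState>,
}

impl ToyFlowTable {
    /// Each table gets its own seed, so bucket placement differs per table.
    fn new(seed: u64) -> Self {
        Self { map: HashMap::with_hasher(SeededState { seed }) }
    }

    fn add(&mut self, key: (u16, u16), hits: u64) {
        self.map.insert(key, hits);
    }

    /// Collect the entries and sort by key, so the listing does not depend
    /// on the map's (seed-dependent) iteration order.
    fn dump(&self) -> Vec<((u16, u16), u64)> {
        let mut out: Vec<_> = self.map.iter().map(|(k, v)| (*k, *v)).collect();
        out.sort_unstable_by_key(|(k, _)| *k);
        out
    }
}

fn main() {
    // Two tables with different seeds and identical contents.
    let mut a = ToyFlowTable::new(0x1234_5678_9abc_def0);
    let mut b = ToyFlowTable::new(42);
    let ports = [443u16, 22, 8080, 53];
    for (i, &port) in ports.iter().enumerate() {
        a.add((port, 1000), i as u64);
        b.add((port, 1000), i as u64);
    }
    // Sorted dumps agree even though internal ordering may not.
    assert_eq!(a.dump(), b.dump());
    println!("{:?}", a.dump());
}
```

Sorting at dump time keeps the cost off the lookup path: only the listing pays for ordering, while inserts and gets stay hash-based.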

Reclaim memory after churn. A `HashMap` keeps its high-water-mark allocation after entries are removed, so a churn spike would pin host memory long after the flows went away. The periodic expiry pass now shrinks a table toward `2 * len` (with a small floor) once it drains to under a quarter full. Firing only on a deep drain keeps tables that sit near capacity from thrashing and bounds how often the rehash runs under the port lock. This is the "scale down" half of #779; growth is already handled by the map's own resizing up to the table limit.
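The shrink rule can be sketched as a small pure function plus a call to `shrink_to`. This is an illustration under stated assumptions rather than the PR's implementation: `shrink_target`, `maybe_shrink`, and the `SHRINK_FLOOR` value of 32 are hypothetical (the description only says "a small floor"), and std's `HashMap` stands in for `hashbrown::HashMap`.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical floor; the PR mentions "a small floor" without a value.
const SHRINK_FLOOR: usize = 32;

/// Returns the capacity to shrink toward, or `None` when the table is at
/// least a quarter full or the target would not reduce the capacity.
fn shrink_target(len: usize, capacity: usize) -> Option<usize> {
    if len >= capacity / 4 {
        return None;
    }
    let target = len.saturating_mul(2).max(SHRINK_FLOOR);
    if target < capacity {
        Some(target)
    } else {
        None
    }
}

/// Stand-in for the tail of a periodic expiry pass: shrink only after a
/// deep drain. Returns true if a shrink was requested.
fn maybe_shrink<K: Eq + Hash, V>(map: &mut HashMap<K, V>) -> bool {
    match shrink_target(map.len(), map.capacity()) {
        Some(target) => {
            map.shrink_to(target);
            true
        }
        None => false,
    }
}

fn main() {
    let mut flows: HashMap<u32, u64> = HashMap::new();
    for id in 0..10_000u32 {
        flows.insert(id, 0);
    }
    let peak = flows.capacity();

    // Simulate expiry removing almost every flow after a churn spike.
    flows.retain(|id, _| *id < 100);

    assert!(maybe_shrink(&mut flows));
    assert!(flows.capacity() >= 2 * flows.len());
    assert!(flows.capacity() < peak);

    // Now roughly half full, so a second pass leaves the table alone.
    assert!(!maybe_shrink(&mut flows));
}
```

Because the post-shrink table is about half full, it has to lose roughly half its entries again before the next shrink fires, which is what keeps a table hovering near its capacity from rehashing on every pass.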

This deliberately leaves the capacity limits and eviction policy untouched, since eviction/lifecycle is being reworked separately.

Tested: `cargo test -p opte` and `-p oxide-vpc` pass; the `no_std` kernel build (`--features engine,kernel`) type-checks on the pinned nightly. I have not linked or run the xde kmod on a Helios host.

Replace the `BTreeMap` backing `FlowTable` with a `hashbrown::HashMap`
for O(1) lookups at the flow counts front-facing guests reach in
practice. At ~4M entries the benchmarks on oxidecomputer#779 show foldhash gets
around 33ns versus ~316ns for `BTreeMap`.

Each table seeds its own `foldhash` state from the kernel PRNG so a
remote party cannot predict the hash sequence and force pathological
collisions; under std/test we use a fixed seed for reproducibility.

`dump` now sorts its output so flow listings (and the `opteadm` output
built from them) stay stable despite the unordered map.

A `HashMap` keeps its high-water-mark allocation after entries are
removed, so a churn spike would pin host memory long after the flows
went away. Reclaim it on the periodic expiry pass: once a table drains
to under a quarter full, shrink toward `2 * len` (with a small floor).

Firing only on a deep drain keeps tables that sit near capacity from
thrashing and bounds how often the rehash runs under the port lock.

This is the "scale down" half of oxidecomputer#779; growth is already handled by the
map's own resizing up to the table limit.
MorganaFuture changed the title from "Back flow tables with a seeded hashbrown HashMap" to "Speed up flow tables and reclaim their memory" on May 28, 2026