
Add tbf (token bucket filter) qdisc to support egress traffic shaping / rate limiting#13104

Open
benldrmn wants to merge 1 commit into google:master from benldrmn:feat/network-traffic-shaping

Conversation

@benldrmn (Contributor) commented May 6, 2026

Resolves the egress part of #11109.

AI usage disclosure: I used AI to help generate some tests and cleanups after the manual implementation, which I wrote by hand modeled on Linux's net/sched/sch_tbf.c TBF implementation, and for guided documentation writing. I understand every line of the code, wrote the core logic manually, and am happy to answer questions or revisit parts of the implementation as necessary.

Also tested and benchmarked manually against a local Kubernetes kind cluster to verify that it resolves isola-run/isola#290.

Implements a single-rate TBF qdisc modeled on Linux's net/sched/sch_tbf.c
and exposes it via --qdisc=tbf, with required --qdisc-tbf-rate and
--qdisc-tbf-burst flags. OCI annotations can lower the configured rate
and burst ceilings but not raise them without --allow-flag-override.

The fifo qdisc's circular packet-buffer list moves into a shared
pkg/tcpip/link/qdisc package so both qdiscs share one implementation.
Loopback and ingress traffic are not shaped.
@benldrmn force-pushed the feat/network-traffic-shaping branch from 996d5fb to b1a787e on May 6, 2026, 18:17
@parth-opensrc parth-opensrc self-assigned this May 6, 2026
@EtiennePerot (Collaborator)

Can you show benchmark results?

@benldrmn (Contributor, Author) commented May 8, 2026

@EtiennePerot sure.
Ran them on my laptop (so YMMV) with an Intel(R) Core(TM) Ultra 7 155H CPU, 20 iterations per setup; I controlled the power governor and thermals (waiting between runs until the CPU temperature returned to baseline) so throttling would not skew the measurements.

Used an iperf client inside the sandbox and the server outside (in the same kind cluster), with host GSO enabled, qdisc-tbf-rate = 100000000000 (100 Gbps), qdisc-tbf-burst = 134217728 (128 MiB), and 20 iterations for each setup:

| qdisc | streams | throughput (Gbps) | sandbox CPU % |
|-------|---------|-------------------|---------------|
| fifo  | 1       | 44.06 ± 0.31      | 394.7 ± 1.1   |
| tbf   | 1       | 43.98 ± 0.44      | 393.7 ± 1.7   |
| none  | 1       | 23.67 ± 0.33      | 223.0 ± 0.6   |
| fifo  | 4       | 68.38 ± 3.42      | 758.1 ± 58.3  |
| tbf   | 4       | 43.78 ± 0.76      | 414.7 ± 4.0   |
| none  | 4       | 46.33 ± 0.08      | 396.8 ± 1.7   |

Similar setup, but 12 iterations per configuration, scaling up to 32 concurrent streams:

| streams | fifo (Gbps) | none (Gbps) | tbf (Gbps) |
|---------|-------------|-------------|------------|
| 1       | 44.7 ± 0.8  | 23.8 ± 0.1  | 44.5 ± 0.8 |
| 2       | 73.5 ± 0.4  | 39.6 ± 0.2  | 46.9 ± 0.2 |
| 4       | 74.3 ± 3.0  | 46.4 ± 0.1  | 43.7 ± 0.6 |
| 8       | 53.3 ± 1.5  | 57.8 ± 4.2  | 39.7 ± 0.6 |
| 16      | 52.6 ± 3.0  | 51.9 ± 1.6  | 31.1 ± 0.3 |
| 32      | 54.5 ± 2.0  | 56.2 ± 1.8  | 32.2 ± 0.2 |

Bounding the client to more realistic target rates (the iperf client sends at the target pace; 5 iterations for each setup):

| target rate | qdisc | sandbox CPU % |
|-------------|-------|---------------|
| 10 Mbps     | tbf   | 3.47 ± 0.13   |
| 10 Mbps     | fifo  | 3.61 ± 0.12   |
| 10 Mbps     | none  | 3.40 ± 0.09   |
| 100 Mbps    | tbf   | 7.23 ± 0.10   |
| 100 Mbps    | fifo  | 7.23 ± 0.06   |
| 100 Mbps    | none  | 7.48 ± 0.07   |
| 250 Mbps    | tbf   | 13.17 ± 0.48  |
| 250 Mbps    | fifo  | 13.08 ± 1.09  |
| 250 Mbps    | none  | 13.18 ± 0.72  |
| 500 Mbps    | tbf   | 19.01 ± 0.42  |
| 500 Mbps    | fifo  | 19.36 ± 0.70  |
| 500 Mbps    | none  | 19.04 ± 0.87  |

// See psched_ratecfg_precompute in net/sched/sch_generic.c.
func len2TimeNS(rate uint64, len uint32) uint64 {
	const nsecPerSec = 1000000000
	return uint64(len) * nsecPerSec / rate
}
@benldrmn (Contributor, Author):

I can get rid of this division by precalculating mult and shift parameters in New, at the cost of complicating the code a bit (the precalculation, plus storing the less intuitive mult and shift for later use in len2TimeNS).
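For reference, the division-free approach could look like the sketch below, mirroring the mult/shift loop in Linux's psched_ratecfg_precompute. Function and parameter names are illustrative, not the PR's actual code:

```go
package main

import "fmt"

const nsecPerSec = 1_000_000_000

// precompute picks mult and shift so that (len*mult)>>shift equals
// len*nsecPerSec/rate to within rounding, mirroring the loop in Linux's
// psched_ratecfg_precompute. This would run once in New.
func precompute(rateBytesPerSec uint64) (mult uint64, shift uint) {
	factor := uint64(nsecPerSec)
	for {
		mult = factor / rateBytesPerSec
		// Stop once mult is large enough to preserve precision,
		// or before the next doubling would overflow factor.
		if mult&(1<<31) != 0 || factor&(1<<63) != 0 {
			return mult, shift
		}
		factor <<= 1
		shift++
	}
}

// len2TimeNS converts a packet length to transmit time using a multiply
// and shift instead of a per-packet 64-bit division.
func len2TimeNS(mult uint64, shift uint, length uint32) uint64 {
	return (uint64(length) * mult) >> shift
}

func main() {
	// 125,000,000 bytes/s is 1 Gbit/s; a 1500-byte packet then
	// occupies the wire for 1500 * 1e9 / 125e6 = 12000 ns.
	mult, shift := precompute(125_000_000)
	fmt.Println(len2TimeNS(mult, shift, 1500)) // 12000
}
```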

@@ -0,0 +1,109 @@
// Copyright 2022 The gVisor Authors.
@benldrmn (Contributor, Author):

I just moved it from the fifo package so I can reuse the logic in tbf; I don't know why git recognizes it as "new" code. The diff is essentially exporting the relevant symbols and adding a Peek method.
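A minimal sketch of a circular packet list with the kind of Peek method mentioned above. This is not the actual pkg/tcpip/link/qdisc type; all names here are hypothetical, and it only shows why Peek matters for tbf: the qdisc can check the head packet against available tokens before committing to a dequeue.

```go
package main

import "fmt"

// circularList is an illustrative fixed-capacity ring buffer.
type circularList[T any] struct {
	buf        []T
	head, size int
}

func newCircularList[T any](capacity int) *circularList[T] {
	return &circularList[T]{buf: make([]T, capacity)}
}

// push appends v at the tail, reporting false when the ring is full.
func (c *circularList[T]) push(v T) bool {
	if c.size == len(c.buf) {
		return false
	}
	c.buf[(c.head+c.size)%len(c.buf)] = v
	c.size++
	return true
}

// peek returns the oldest element without removing it, so a shaper can
// inspect the head packet's size before deciding whether to dequeue.
func (c *circularList[T]) peek() (T, bool) {
	var zero T
	if c.size == 0 {
		return zero, false
	}
	return c.buf[c.head], true
}

// pop removes and returns the oldest element.
func (c *circularList[T]) pop() (T, bool) {
	v, ok := c.peek()
	if ok {
		c.head = (c.head + 1) % len(c.buf)
		c.size--
	}
	return v, ok
}

func main() {
	q := newCircularList[int](2)
	q.push(1)
	q.push(2)
	v, _ := q.peek()
	fmt.Println(v) // 1: peek does not consume the head
	q.pop()
	v, _ = q.peek()
	fmt.Println(v) // 2
}
```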

// +checklocksignore: we don't have to hold locks during initialization.
func New(lower stack.LinkEndpoint, clock tcpip.Clock, rate uint64, burst, queueLen uint32) (stack.QueueingDiscipline, error) {
if rate == 0 {
return nil, fmt.Errorf("qdisc=tbf requires setting qdisc-tbf-rate")
@benldrmn (Contributor, Author):

Should I use a tcpip error in this file instead of returning fmt.Errorf errors?



Development

Successfully merging this pull request may close these issues.

Support Egress Traffic Shaping

3 participants