refactor(header): replace IsEmptyBlock with fast IsZero short-circuit#2547
refactor(header): replace IsEmptyBlock with fast IsZero short-circuit#2547ValentaTomas merged 2 commits intomainfrom
Conversation
PR SummaryMedium Risk Overview Reviewed by Cursor Bugbot for commit 88dfb98. Bugbot is set up for automated code reviews on this repo. Configure here. |
There was a problem hiding this comment.
Small, well-tested refactor of a leaf hot-path utility — only finding is a doc-comment nit, safe to land.
Extended reasoning...
Overview
This PR refactors header.IsEmptyBlock (per-block hot path of the rootfs diff scan) in packages/shared/pkg/storage/header/diff.go plus adds a new diff_test.go. The substantive changes are: (1) a new IsZero helper that samples 3 bytes (first/last/middle) before falling back to a SIMD-accelerated self-equality compare, (2) deletion of the now-unused EmptyBlock global, (3) explicit length validation in IsEmptyBlock, (4) updated doc comments. About 40 lines net added, all in a self-contained leaf utility.
Security risks
None. This is pure CPU-side logic over already-loaded buffers — no auth, no I/O, no untrusted input parsing. The new length check actually tightens the API surface (callers can no longer silently pass a wrong-shape buffer). The deletion of EmptyBlock is verified safe: grep shows no remaining references after the patch.
Level of scrutiny
Low. Tight scope, behavior-preserving by intent, and the correctness argument for IsZero is straightforward to verify: for n>3, the 3-byte sample forces b[0]==0, and bytes.Equal(b[:n-1], b[1:]) then transitively requires all adjacent bytes to be equal, so all bytes must be zero. For n≤3, the sample alone covers every index. The new tests exercise nil, len≤3, full-block zero, and non-zero bytes hidden between sample positions. Length-mismatch and unsupported-block-size paths are also tested.
Other factors
The single bug finding is a doc-comment nit: the new comment for EmptyHugePage references Build.ReadAt, but the actual consumer is File.Slice at build.go:107 (verified via grep — no Build.ReadAt exists in this repo). Doc-only, no runtime impact, doesn't justify blocking. The PR description carries the same minor inaccuracy. Otherwise the change is mechanical and well-motivated.
a44d240 to
11a2351
Compare
15ccb7d to
eccf472
Compare
|
Trimmed |
b00cc57 to
42bd9c7
Compare
IsEmptyBlock is the per-block hot path of DiffMetadataBuilder.Process, called once per snapshot block during DirectProvider.exportToDiff. Two small wins: - Always paid a full SIMD memcmp (4 KiB or 2 MiB on hugepages) before it could short-circuit. Borrow QEMU's buffer_is_zero trick: sample three bytes (first, last, middle) so most non-zero blocks reject from a single cache line. - Drop the unused EmptyBlock global; only the header itself referenced it. EmptyHugePage stays — build.go uses it as a literal zero buffer for nil-build reads. The fallback uses bytes.Equal on b shifted by one (b[1:] == b[:n-1]) which dispatches to the runtime's SIMD-accelerated memequal on amd64/arm64, same speed as the old comparison against a static buffer. Also adds a length check to IsEmptyBlock so a too-small/too-large buffer returns an explicit error instead of silently reading zero.
42bd9c7 to
7743341
Compare
Strip verbose doc comments and inline narration across the NBD/cache diff-shrinking change. Drop redundant dispatch tests (kept the two that actually exercise WRITE_ZEROES routing) and simplify mockProvider to the surface the remaining tests use. Sync diff.go/diff_test.go/metadata.go with #2547 so the duplicated content matches once that PR rebases in. No behaviour changes.
"iff" is logic shorthand, not a typo, but it keeps tripping readers in a plain-English code comment. Replace with "exactly when".
IsEmptyBlockwas the per-block hot path ofDiffMetadataBuilder.Process— one call per block during the rootfs scan inDirectProvider.exportToDiff. It paid a full SIMDmemcmp(4 KiB or, on hugepages, 2 MiB) before it could short-circuit, and required two static zero buffers (EmptyBlock,EmptyHugePage) to compare against.Replace it with a small
IsZerohelper that uses QEMU'sbuffer_is_zerotrick: sample three bytes (first, last, middle) so most non-zero blocks reject from a single cache line. The fallback usesbytes.Equalonbshifted by one (b[1:] == b[:n-1]), which dispatches to the runtime's SIMDmemequalon amd64/arm64 — same speed as the old comparison against a static buffer when the short-circuit doesn't fire.DiffMetadataBuilder.ProcesscallsIsZerodirectly (the size validationIsEmptyBlockdid was unreachable in practice).EmptyBlockis removed;EmptyHugePagestays —build.gouses it as the literal zero buffer foruuid.Nilreads inBuild.ReadAt.Pre-factored from #2546 so it can land on its own.