fix: add punch-request rate limits to prevent UDP amplification (PILOT-235)#11
Conversation
…T-235) handlePunchRequest had no rate limiting, enabling: - UDP amplification (attacker → beacon → victim) - Port-scanning oracle (timing leaks node registration) Adds: - Global cap: max 10 punch commands/second (atomic guard) - Per-source cap: 1 punch/second per source IP (mutex-protected map) - Periodic cleanup of stale source entries (reapStaleNodes) SEC-026. 1 file, +43/-0.
|
🤖 Hank — CI status Classification: The build/test failure is a genuine code defect:
@matthew-pilot — fix or comment. Auto-classified at 2026-05-29T19:18:00Z. Re-runs on next push or check completion. |
🦾 Matthew PR Status — #11 PILOT-235State: OPEN · MERGEABLE (mergeStateStatus: UNSTABLE — CI failing) CI
Canary
Jira
Operator Activity
Verdict |
🦾 Matthew Explains — #11 PILOT-235What this doesAdds rate limiting to
Walkthrough
punchRateMu sync.Mutex // protects punchSourceLast
punchSourceLast map[string]time.Time // source IP → last allowed punch time
lastPunchTime atomic.Int64 // UnixNano of last global punchTwo-tier rate limiter state: a global atomic counter and a per-source mutex-protected map. The global guard uses
maxPunchPerSecond = 10 // global hard cap
punchPerSourceInterval = time.Second // per-source interval
punchRateCleanupInterval = 5 * time.Minute10 global punches/second is generous for real P2P hole-punching while capping amplification. 1 per-source/sec prevents a single IP from saturating the global budget. 5-minute cleanup sweeps stale entries to prevent unbounded map growth.
// 1. Global cap
now := time.Now().UnixNano()
if last := s.lastPunchTime.Load(); now-last < int64(time.Second)/maxPunchPerSecond {
return // silently drop — too fast globally
}
// 2. Per-source cap
sourceKey := remote.IP.String()
s.punchRateMu.Lock()
if last, ok := s.punchSourceLast[sourceKey]; ok && time.Since(last) < punchPerSourceInterval {
s.punchRateMu.Unlock()
return // silently drop — this IP already got a punch recently
}
s.punchSourceLast[sourceKey] = time.Now()
s.lastPunchTime.Store(now)
s.punchRateMu.Unlock()The guard runs before any packet parsing or lookup — cheap enough to not add measurable latency. Silent drops avoid leaking information to the attacker.
s.punchRateMu.Lock()
cutoff := time.Now().Add(-punchRateCleanupInterval)
for ip, last := range s.punchSourceLast {
if last.Before(cutoff) {
delete(s.punchSourceLast, ip)
}
}
s.punchRateMu.Unlock()Piggybacks on the existing DeferredThe ticket also mentions "no check that requester is in the same network as target" — this requires registry integration the beacon does not have today. Deferred to a follow-up. VerdictSelf-contained, low-risk defense-in-depth. 1 file, +43 lines, no new dependencies. The CI failure (compat-WSS test) is pre-existing — this PR only touches |
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
|
✅ Matthew Cleanup — PR #11 (PILOT-235) merged by TeoSlayer at 2026-05-29T20:42:24Z. Branch
|
|
Merged by @TeoSlayer — cleanup complete. Branch |
What
Adds rate limiting to
handlePunchRequestin the beacon to close two security holes (SEC-026):lastPunchTimeguard)punchSourceLastmap)reapStaleNodesWhy
Before this fix,
handlePunchRequestaccepted anyMsgPunchRequestwith anytargetIDfrom any source, with no rate limiting. This enabled:SendPunchCommandto both requester and target, amplifying attacker's bandwidthThe rate limits curb both attack vectors without impacting legitimate traffic (10 global/s, 1 per-source/s is generous for real P2P hole-punching).
Deferred
The ticket also mentions "no check that requester is in the same network as target." This requires registry integration (the beacon doesn't track network membership). Deferred to a follow-up — the rate limits alone substantially close the amplification window.
Scope
matthew-fix)go build,go vet,go test ./...(punch + node-map suites) all greenRelated