Waf challenge mode startup perf #4458
…rtup blocker
NewChallengeRuntime previously generated one obfuscated challenge bundle synchronously before returning, blocking service startup by ~1 minute on "high-obfuscation". This shifts that work to build time:
- New `go generate` step runs `cmd/initialbundle`, which substitutes the runtime path placeholders into fpscanner/bundle.js, runs the existing obfuscator WASM via wazero, gzips the result, and writes it to pkg/appsec/challenge/initial_bundle.js.gz.
- challenge.go embeds the gzipped bundle and seeds the cache from it on startup. The background variant generator continues to add and rotate fresh runtime-generated variants on the normal refresh interval.
- If the baked-in bundle is missing/corrupt (e.g. `go generate` not run), fall back to the previous synchronous generation path.
- The wazero module is now compiled once via CompileModule and instantiated per call instead of decoded per call (best practice).
Result: NewChallengeRuntime returns in ~600ms instead of ~60s. The first request is served from the baked-in variant; runtime-generated variants take over as they become available.
Also adds a build-tagged feasibility benchmark (`-tags=feasibility`) used to size the optimization, and a startup-budget regression test.
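The baked-bundle-with-fallback behaviour described in this commit can be sketched roughly as follows. This is an illustration only, not the code in challenge.go: `loadInitialBundle`, `gzipBytes`, and the `generate` callback are hypothetical names, and the real code embeds the gzip file rather than receiving it as a byte slice.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// loadInitialBundle (hypothetical) decompresses a baked-in gzip payload.
// If the payload is empty or corrupt it calls generate, which stands in
// for the slow synchronous obfuscation path.
func loadInitialBundle(baked []byte, generate func() (string, error)) (bundle string, fromBaked bool, err error) {
	if len(baked) > 0 {
		if zr, zerr := gzip.NewReader(bytes.NewReader(baked)); zerr == nil {
			raw, rerr := io.ReadAll(zr)
			if cerr := zr.Close(); rerr == nil && cerr == nil && len(raw) > 0 {
				return string(raw), true, nil
			}
		}
	}
	bundle, err = generate()
	return bundle, false, err
}

// gzipBytes is a helper for the demo; it mimics what a build step would write.
func gzipBytes(s string) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write([]byte(s))
	zw.Close()
	return buf.Bytes()
}

func main() {
	slow := func() (string, error) { return "/* generated at runtime */", nil }

	b, baked, _ := loadInitialBundle(gzipBytes("/* baked at build time */"), slow)
	fmt.Println(baked, b)

	// A corrupt payload takes the fallback path instead of failing startup.
	b, baked, _ = loadInitialBundle([]byte("not gzip"), slow)
	fmt.Println(baked, b)
}
```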
…ared secret
The challenge runtime hardcoded `const masterSecret = "SUPER_SECRET_KEY"` (a `// FIXME`), which both leaked a placeholder secret in source and prevented distributed deployments where multiple WAF instances must agree on signed tickets and sealed cookies.
Changes:
- New `WithMasterSecret([]byte)` functional option on NewChallengeRuntime; defaults to a freshly generated random 32-byte secret when omitted, with a startup warning that distributed setups MUST configure a shared value.
- `ParseConfiguredSecret(string)` accepts hex-encoded bytes (preferred) or a raw passphrase; minimum 32 bytes either way.
- The acquisition module config exposes `challenge_master_secret` and plumbs it into the runtime.
- `computeTicket`, `computePowMAC`, and `matchesChallenge` are now methods on ChallengeRuntime that use the per-instance secret instead of a package-level constant.
- `sealCookie` / `openCookie` / `deriveKey` accept `[]byte` for the secret, matching the new representation.
- Tests use a fixed test secret via a small newTestRuntime helper. New TestDistributedAgreement verifies that two runtimes with the same secret produce bit-identical tickets/MACs and that a challenge issued by one validates against the other; existing TestMatchesChallenge picks up a cross-secret rejection check.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
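For readers unfamiliar with the functional-option pattern, here is a minimal sketch of the shape this commit describes. All names are lowercase stand-ins (`runtimeSketch`, `withMasterSecret`, `parseSecretSketch`), not the PR's actual types. The rule for choosing between hex and passphrase when a string is valid hex but decodes to fewer than 32 bytes is an assumption made for the example.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
)

const minSecretLen = 32

// runtimeSketch is a hypothetical stand-in for the challenge runtime.
type runtimeSketch struct{ masterSecret []byte }

type option func(*runtimeSketch)

// withMasterSecret copies the caller's secret into the runtime.
func withMasterSecret(s []byte) option {
	return func(r *runtimeSketch) { r.masterSecret = append([]byte(nil), s...) }
}

// newRuntimeSketch applies options, then falls back to a random secret.
func newRuntimeSketch(opts ...option) (*runtimeSketch, error) {
	r := &runtimeSketch{}
	for _, o := range opts {
		o(r)
	}
	if r.masterSecret == nil {
		r.masterSecret = make([]byte, minSecretLen)
		if _, err := rand.Read(r.masterSecret); err != nil {
			return nil, err
		}
		fmt.Println("warning: random master secret; distributed setups must configure a shared value")
	}
	return r, nil
}

// parseSecretSketch prefers hex and otherwise treats the input as a raw
// passphrase. ASSUMPTION: hex that decodes to fewer than 32 bytes is
// re-evaluated as a passphrase rather than rejected outright.
func parseSecretSketch(s string) ([]byte, error) {
	if b, err := hex.DecodeString(s); err == nil && len(b) >= minSecretLen {
		return b, nil
	}
	if len(s) >= minSecretLen {
		return []byte(s), nil
	}
	return nil, errors.New("secret must be at least 32 bytes")
}

func main() {
	secret, err := parseSecretSketch("a passphrase that is comfortably over 32 bytes")
	if err != nil {
		panic(err)
	}
	a, _ := newRuntimeSketch(withMasterSecret(secret))
	b, _ := newRuntimeSketch(withMasterSecret(secret))
	fmt.Println("instances agree:", string(a.masterSecret) == string(b.masterSecret))

	c, _ := newRuntimeSketch() // no option: random secret plus a warning
	fmt.Println("random secret length:", len(c.masterSecret))
}
```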
@buixor: There are no 'kind' label on this PR. You need a 'kind' label to generate the release automatically.
I am a bot created to help the crowdsecurity developers manage community feedback and contributions. You can check out my manifest file to understand my behavior and what I can do. If you want to use this for your project, you can check out the BirthdayResearch/oss-governance-bot repository.
@buixor: There are no area labels on this PR. You can add as many areas as you see fit.
Adds time-based key rotation built on HKDF derivation from the shared
master_secret. Two instances configured with the same master_secret and
rotation_interval derive bit-identical per-epoch keys for the same
epoch, so the rotation is automatic and stateless across a load-balanced
fleet.
KeyRing
- New pkg/appsec/challenge/keyring.go.
- Epoch identifier = floor(now.Unix() / rotation_interval.Seconds()).
- Per-epoch sign and cookie keys derived via HKDF-SHA256 with stable
salt "crowdsec-challenge-keyring-v1" and per-context info strings
("epoch-sign", "epoch-cookie") so the same secret can produce two
cryptographically independent keys for the same epoch.
- Sliding live window: any epoch in
[current - maxLive + 1 ... current + clockSkew] is acceptable;
anything outside is rejected.
- Internal cache of derived keys, eviction of stale epochs on every
derivation.
Wiring
- ChallengeRuntime now holds *KeyRing instead of a flat masterSecret.
- computeTicket and computePowMAC sign with the epoch derived from the
ticket's timestamp, so verification needs no extra wire bits.
- matchesChallenge looks up the per-epoch sign key, returns false on
out-of-window epochs (also defends against forged stale timestamps).
- New options WithRotationInterval and WithMaxLiveEpochs; defaults are
5-minute rotation and a 3-epoch live window.
Cookie format v1
- New crypto.go format: version_byte || epoch_be8 || nonce ||
ciphertext. Epoch is also bound into the AEAD AAD so a sealed
cookie cannot be replayed under a different epoch tag.
- v0 (legacy) cookies without the version byte fall back to a
try-decrypt loop over every live epoch, so cookies issued just
before the upgrade keep working until they expire (default 2h
cookie TTL). After that window the fallback path is only taken on
adversarial input.
- ErrCookieEpoch is a typed sentinel for "out-of-window epoch".
Acquisition config
- challenge_key_rotation_interval (duration) and
challenge_max_live_epochs (int) fields added; both must agree across
instances in a distributed setup.
Tests
- KeyRing: determinism across instances, rotation at boundary, live-
window admission, stale-cache eviction, known-vector lockdown,
cross-context and cross-epoch separation.
- Cookie v1: round-trip, AAD epoch binding (tampering invalidates),
out-of-window rejection, UA mismatch rejection.
- Rotation end-to-end: ticket signed under epoch N validates after the
keyring rolls to N+1 (in-flight requests survive); ticket from an
evicted epoch is rejected; full ValidateChallengeResponse +
ValidCookie round-trip exercises both keyring and cookie-v1
together.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ide obfuscation
Before this change, the challenge JS payload was rendered together with a plain <script>var _t="…",_ts="…",…</script> tag that embedded the per-request ticket (HMAC of the timestamp under the master secret). An attacker scraping the HTML could capture _t and _ts, recompute the session key (sha256(ticket + nonce)) and submission HMAC, and forge a "legitimate browser" signature without ever executing the obfuscated bundle.
The fix is a split-bundle protocol:
- Static bundle (initial_bundle.js.gz, baked at `go generate`): the fingerprint scanner + crypto primitives + PoW driver. Exposes a registration on globalThis["__CSEC_CHALLENGE_HOOK_v1__"]. The hook name string is registered in obfuscate.js via the `reservedStrings` option so it survives the high-obfuscation preset's string-array transform identically in both the static bundle and the dynamic module; that's how they meet at runtime.
- Dynamic key module (dynamic_module.js.tmpl, obfuscated per epoch): ~30 lines that carry the per-epoch HMAC key as a hex literal and invoke the static bundle's hook. The obfuscator's string-array transform encodes the key bytes; the hex literal does NOT survive in plain form (split_bundle_test.go enforces this).
- Protocol shift: the server no longer issues a ticket via a plain <script> tag. The HTML template only carries non-secret per-request values (_powD/_powP/_powM/_ts). The client computes ticket = HMAC(ts, K_epoch) inside the obfuscated bundle, where K_epoch comes from the dynamic module. _powP/_powM stay server-issued because their integrity guarantee depends on the client not being able to pick favourable PoW salts.
Other notable bits:
- buildAndObfuscateDynamicModule caches obfuscated modules per epoch and prunes any cached entries whose epoch has fallen out of the keyring's live window. NewChallengeRuntime pre-warms the dynamic module for the current epoch so the very first GetChallengePage call doesn't pay the obfuscation cost on the request-serving path.
- Server-side wire format is unchanged: the client still POSTs the same fields (f, t, ts, h, n, p, m). matchesChallenge already verifies the ticket the same way it verifies a server-issued one (both derive HMAC(ts, K_epoch)), so the ValidateChallengeResponse path needed no behaviour change.
- The vendored javascript-obfuscator and the fpscanner bundle that's fed into it were regenerated via `go generate ./...` so the embedded artifacts match the current sources.
Tests:
- TestSplitBundle_HookSentinelInBakedBundle: the static bundle contains a literal occurrence of the hook sentinel (regression guard for accidental removal of the reservedStrings registration).
- TestSplitBundle_DynamicModuleObfuscatesKey: the dynamic module contains the hook sentinel but does NOT contain the per-epoch key in plain hex (security regression guard).
- TestSplitBundle_DynamicModuleCachedPerEpoch: repeated calls in the same epoch return the cached module byte-for-byte.
- TestSplitBundle_DynamicModuleRebuildsOnEpochAdvance: rotation produces a fresh module.
- TestSplitBundle_HTMLDoesNotContainSecret: the most important invariant: the per-epoch sign key MUST NOT appear in plain hex in the served HTML.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…neration
- Add `make generate-challenge-js`, an opt-in target that runs `go generate ./pkg/appsec/challenge/js/...`. Gated on `javy` being on PATH so contributors get a clear error message instead of an obscure exec-not-found mid-pipeline. Not part of the default `make build` flow because the generated artifacts (initial_bundle.js.gz, obfuscate/index.wasm.gz, fpscanner/bundle.js) are committed.
- Add pkg/appsec/challenge/js/README.md describing what each pipeline step does, what is committed vs generated, when to regenerate, what tools are required (only `javy`; build-time only, not a runtime dependency), and the sentinel-survival contract that ties the static bundle to the dynamic key module.
- Pin the keyring clock in TestSplitBundle_DynamicModuleCachedPerEpoch. testKeyRing uses a 1-minute rotation interval, but each buildAndObfuscateDynamicModule call takes ~12s; without pinning, two calls could straddle a rotation boundary and produce different cached entries, causing a flaky failure. Pinning makes the test deterministic.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
// Cookie wire format:
//
// v1 (current): cookieVersionV1 || epoch_be8 || nonce || ciphertext
// v0 (legacy): nonce || ciphertext
Kill v0 compat; it hasn't been released.