IOR trait #22
NEON: 65% reduction in twin_constraint_sumcheck achieves 30% reduction in overall protocol time.
Extracts WARP::prove and ::verify into per-phase modules under src/protocol/phases/ (pesat, twin_constraint, ood, batching, proximity), with a shared Oracle<F> type in src/protocol/oracle.rs that owns the codeword and a lazily-materialised multilinear extension. lib.rs is now an orchestrator that threads oracles and accumulator state through the phases.
Adds tracing spans at each phase boundary, gated behind a new `profile` cargo feature that optionally pulls in tracing-subscriber via src/profile.rs. Release builds do not pull tracing-subscriber as a direct dep of warp.
Sets up docs/paper-mods/ as a living spec of notation modifications to the Warp paper, paired 1:1 with Rust modules. mod1_oracle.tex authored in full; mod2/3/4 stubbed for downstream plans. Framing stays inside IOP/BCS rather than moving to AHP.
No behavior change: 14/14 tests pass (BLS12-381 + Goldilocks warp_test, query + relation tests), clippy clean under --all-features.
Restructures src/profile.rs into a module directory with four parts,
all gated behind the existing `profile` cargo feature:
- counters: thread-local Cell<u64> counters with a `count_ops!` macro.
Snapshots + deltas let a subscriber diff across a span's lifetime.
Tracks coarse call-site events (MerkleTreeBuilds, MerklePathsGenerated,
MleMaterializations, OracleLeafQueries, OraclePointQueries,
TwinConstraintRounds, BatchingRounds, OodPointQueries, EncodeCalls,
MerklePathsVerified). Field-level op counts are out of scope — they
need an F newtype or an arkworks fork; deferred.
- timing: `thread_cpu_ns()` via clock_gettime(CLOCK_THREAD_CPUTIME_ID)
on Linux and macOS. Distinguishes blocked-on-IO from blocked-on-compute
in a way wall-time cannot.
- rss: `peak_rss_bytes()` via getrusage(RUSAGE_SELF), normalising the
Linux-kB vs macOS-bytes discrepancy.
- layer: a tracing `Layer` that captures a span's counters + timing +
rss on enter, differences them on close, and emits one newline-
delimited JSON record per span. Schema tag `warp.profile.v1`;
per-record fields are {phase, wall_ns, cpu_ns, rss_delta_bytes,
counters, dimensions}. Dimensions come from span numeric fields
(log_l, log_m, log_n, etc.), captured via a tracing Visit.
Phase modules and the Oracle are instrumented at call sites so the JSON
records carry meaningful counter deltas. Without the feature every
count_ops! call and every timing/rss wrapper compiles to a no-op.
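A minimal sketch of the counters part, to make the snapshot-and-delta idea concrete. The two-variant `Counter` enum, the function names `add`, `snapshot` and `delta`, and the macro's argument shape are assumptions for illustration, not the PR's actual API; the `profile` feature gate is left out so the sketch runs standalone.

```rust
use std::cell::Cell;

/// Hypothetical subset of the counters named above; the real enum has more variants.
#[derive(Clone, Copy)]
pub enum Counter {
    MerkleTreeBuilds = 0,
    EncodeCalls = 1,
}

pub const NUM_COUNTERS: usize = 2;

thread_local! {
    // One slot per counter, private to each thread, so no atomics are needed.
    static COUNTERS: [Cell<u64>; NUM_COUNTERS] = [Cell::new(0), Cell::new(0)];
}

/// Adds `n` to one counter on the current thread.
pub fn add(counter: Counter, n: u64) {
    COUNTERS.with(|slots| {
        let slot = &slots[counter as usize];
        slot.set(slot.get().wrapping_add(n));
    });
}

/// Copies the current thread's counter values.
pub fn snapshot() -> [u64; NUM_COUNTERS] {
    COUNTERS.with(|slots| [slots[0].get(), slots[1].get()])
}

/// Per-counter difference between two snapshots (what a span would report on close).
pub fn delta(start: &[u64; NUM_COUNTERS], end: &[u64; NUM_COUNTERS]) -> [u64; NUM_COUNTERS] {
    [end[0].wrapping_sub(start[0]), end[1].wrapping_sub(start[1])]
}

// Assumed shape of the macro. In the PR it is gated behind the `profile`
// feature and compiles to a no-op otherwise; the gate is omitted here.
macro_rules! count_ops {
    ($counter:expr, $n:expr) => {
        add($counter, $n)
    };
}
```

A subscriber would call `snapshot()` when a span is entered and `delta(&start, &snapshot())` when it closes; call sites only ever write `count_ops!(Counter::EncodeCalls, 1)`.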
tests/profile_json.rs is a feature-gated integration test that installs
the JSON layer against an in-memory sink, runs a full hashchain prove,
and asserts every phase emits a well-formed record. It's the reference
shape Plan B's regression detector will consume.
Verification matrix (all green):
- cargo test (no features) : 14/14
- cargo test --features profile : 14/14 + profile_json 1/1
- cargo clippy --all-targets : clean
- cargo clippy --all-targets --all-features : clean
Adds a deterministic instruction-count bench for WARP::prove, intended as the regression signal for a future CI gate. Complements the existing criterion wall-time bench (which stays informational).
- benches/iai_phases.rs: one library_benchmark for `prove` at the unit-test parameter shape (l1=4, s=2, t=7, hashchain=10). v1 scope is intentionally narrow; the docstring explains why the parameterised `#[bench::case(setup = ...)]` form didn't compile under iai-callgrind 0.14 from this crate's bench root. Plumbing more sizes is deferred.
- Cargo.toml: iai-callgrind 0.14 as a dev-dep, second `[[bench]]` entry wired with `harness = false`.
- benches/docker/Dockerfile.iai: slim Debian image with valgrind + the iai-callgrind-runner binary pre-installed, for macOS hosts where valgrind has been unsupported since Big Sur.
- Makefile: `bench-wall` (criterion, any host), `bench-ci` (iai natively, Linux/valgrind required), `bench-ci-local` (builds and runs the Docker image with cargo registry/git/target caches mounted from target/iai-docker-cache/ so arkworks isn't redownloaded each run). Plus `test` and `clippy` convenience targets.
- benches/README.md: explains the split (noisy wall time vs deterministic instruction count), installation, Docker pathway, and v1 scope limits.
Deferred (tracked in the plan): multiple parameter points, baseline capture + commit, and the GitHub Actions workflow that gates PRs. Those come as small separate commits so the CI change can be reviewed on its own.
Verification:
- cargo check / --features profile : clean
- cargo build --bench iai_phases : clean (run needs valgrind)
- cargo build --bench warp_rs : clean
- cargo clippy --all-features : clean
- cargo test : 14/14 (unchanged)
Adds src/params/ — given (λ, |F|, code rate, list-decoding regime),
pick the minimum (s, t) that achieves λ bits of soundness on a
Reed–Solomon code.
- types: Regime::{Provable, Conjectured}, SecurityLevel, Params,
SoundnessBound, ParamError. SoundnessBound::meets(λ) answers the
"is this enough?" question directly.
- select(λ, field_bits, code_rate, regime) → Params. Uses the
Johnson-bound proximity-query formula (provable: t ≥ 2λ/log₂(1/ρ),
conjectured: t ≥ λ/log₂(1/ρ)), with a field-admissibility check
(log₂|F| ≥ λ + 40) to ensure polylog noise terms are negligible.
- validate(params, field_bits, code_rate, regime, target): the
inverse — computes the achieved soundness and reports per-term
admissibility so callers can see partial failures.
- presets::PRESETS: canonical (λ, rate, regime) → (s, t) rows for
80/128-bit targets at common rates (1/2, 1/8). A test enforces
that every row matches `select` output, so drift between the
table and the formulas is caught at build time.
- src/bin/warp-params.rs: dependency-free CLI with `select`,
`validate`, and `table` subcommands. Dumps PRESETS as TSV and
exits non-zero if a validate call doesn't meet the target.
- docs/paper-mods/mod4_parameter_selection.tex: replaces the stub
with the full derivation, citing STIR / WHIR for the proximity-
gap bounds and marking the deferred items (batching-sumcheck
calibration of s, non-RS codes, a reference table with matching
proofs).
The hard-coded `s=8, t=7` in warp_test is intentionally unchanged
for this pass — those are functional-test shapes, not security
values. A later pass can thread PRESETS through callers that care
about real targets.
Verification:
- 23 tests pass (14 original + 9 new in params::tests)
- cargo clippy, both feature configs: clean
- ./target/debug/warp-params table prints PRESETS; select / validate
round-trip; validate correctly rejects insufficient (s, t)
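The query-count formulas and the field-admissibility check quoted in this commit can be sketched as below. This covers only `t` and the `log₂|F| ≥ λ + 40` check; how `s` is chosen is not stated here, so it is left out. The function names `min_queries` and `field_admissible` and the `(num, den)` rate encoding are assumptions for illustration, not the signatures in `src/params/`.

```rust
/// List-decoding regime, mirroring the `Regime` variants named in the commit.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Regime {
    Provable,
    Conjectured,
}

/// Smallest t with t >= c * lambda / log2(1/rho), where c = 2 in the provable
/// regime and c = 1 in the conjectured one. The rate rho is `num / den`.
/// Returns None unless 0 < rho < 1. (Hypothetical helper, not the PR's API.)
pub fn min_queries(lambda: u32, num: u32, den: u32, regime: Regime) -> Option<u32> {
    if num == 0 || num >= den {
        return None;
    }
    let log_inv_rate = (den as f64 / num as f64).log2();
    let numerator = match regime {
        Regime::Provable => 2.0 * lambda as f64,
        Regime::Conjectured => lambda as f64,
    };
    Some((numerator / log_inv_rate).ceil() as u32)
}

/// Field-admissibility check from the commit text: log2|F| >= lambda + 40.
pub fn field_admissible(lambda: u32, field_bits: u32) -> bool {
    field_bits >= lambda + 40
}
```

For example, at λ = 128 and rate 1/8 the bound is 128/3 ≈ 42.7 in the conjectured regime and 256/3 ≈ 85.3 in the provable one, so the sketch rounds up to 43 and 86 queries respectively.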
Adds the two highest-value pieces of test hardening from the plan;
the rest (proptest, golden serialization, xtask ref lint, runtime
F-S harness) is deferred and tracked in the todo list.
tests/verifier_negative.rs
- Shared `make_fixture()` runs the full two-phase accumulation so l2 > 0
and every cleanly-reachable `VerifierError` variant is triggerable.
- One test per tamperable field, each confirming the verifier rejects
the proof with the *specific* expected error (not just "some error"):
* CodeEvaluationPoint (α tamper)
* CircuitEvaluationPoint (β.0 and β.1 tampers — two tests)
* NumShiftQueries (truncate shift_query_answers)
* ShiftQueryIndex (swap auth_0 paths)
* ShiftQuery (tamper shift_query_answer value)
* NumL2Instances (truncate auth_j)
* Target (tamper μ)
- A happy-path test keeps the fixture honest.
- Docstring enumerates the variants we did NOT reach through
single-field tampering (SpongeFish/ArkError wrap lower-level errors;
NumSumcheckRounds is transcript-derived; SumcheckRound is unraised
in current code).
docs/audits/fiat_shamir.md
- Ordered, line-for-line mapping of every prover-side transcript write
to its verifier-side read. 25 steps, each with file:line links on
both sides.
- A "what reviewers should spot-check" section calls out the specific
squeeze-before-absorb patterns F-S-soundness bugs tend to take.
- Scope: this is a **manual** audit (compensating control). The
runtime ordering harness that would make drift undetectable at CI
time is deferred because it requires instrumenting spongefish; noted
explicitly at the bottom of the doc.
Verification:
- 23/23 unit tests + 9/9 negative-path tests pass
- cargo clippy, both feature configs: clean
Five small fixups called out during the Plan T post-mortem. No
behaviour changes — all tests keep passing on both feature configs.
1. Move the big `warp_test` / `warp_test_goldilocks` suites from
`src/lib.rs` into `tests/integration_warp.rs`. They're end-to-end
prove / verify / decide runs; keeping them as inline unit tests
kept the file ~340 lines longer than necessary. `src/lib.rs` is
now 500 lines (down from 844); the test content is unchanged.
2. Normalise phase-fn visibility to `pub`. Before, `pesat::prove`
was `pub(crate)` while the other four phases' `prove` / `verify`
/ `verify_claim` entry points were `pub`. All output structs
(`Oracle`, `OodOutput`, `BatchingOutput`, etc.) are already `pub`,
so there was no reason for `pesat` to be the odd one out.
`PesatOutput` promoted from `pub(crate)` → `pub` for the same
reason.
3. Cross-reference lint audit — verified every module in `src/params`,
`src/protocol/phases`, plus `src/protocol/oracle.rs` and
`src/bin/warp-params.rs`, carries a doc-comment reference to its
paired `docs/paper-mods/modN_*.tex`. No drift.
4. `presets::lookup` now takes an exact `(num, den)` pair instead of
`f64` with an epsilon-compare. Preserves the caller's intent
(`1/2` is distinct from `0.5`); the CLI plumbs a small `Rate` enum
through parsing so ratios hit the preset table while bare decimals
fall through to the computed-value branch.
5. Remove the `let _ = Counter::ALL.len()` silencer in
`profile/layer.rs`. The import it was suppressing was an artifact
of an earlier iteration and isn't needed now; dropping it and the
`Counter` import lets clippy stay clean.
Verification:
- cargo test : 33/33 pass (22 unit + 2 integ
+ 9 negative)
- cargo test --features profile : 34/34 pass (adds profile_json)
- cargo clippy --all-features -Dwarnings : clean
- warp-params select verified with both `1/2` and `0.5` inputs
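Item 4 can be illustrated with a small sketch of the ratio-versus-decimal split. The variant names, `parse_rate`, `hits_preset`, and the two-entry rate table are assumptions for illustration; only the behaviour (exact `(num, den)` pairs hit the preset table, bare decimals fall through) comes from the commit message.

```rust
/// How the caller wrote the code rate on the command line (hypothetical variant names).
#[derive(Debug, PartialEq)]
pub enum Rate {
    /// Written as `num/den`; eligible for an exact preset lookup.
    Ratio(u32, u32),
    /// Written as a bare decimal; always goes to the computed-value branch.
    Decimal(f64),
}

/// Parses `1/2` into `Rate::Ratio(1, 2)` and `0.5` into `Rate::Decimal(0.5)`.
/// Returns None for malformed input or a rate outside (0, 1).
pub fn parse_rate(s: &str) -> Option<Rate> {
    if let Some((num, den)) = s.split_once('/') {
        let num: u32 = num.trim().parse().ok()?;
        let den: u32 = den.trim().parse().ok()?;
        if num == 0 || num >= den {
            return None;
        }
        Some(Rate::Ratio(num, den))
    } else {
        let x: f64 = s.trim().parse().ok()?;
        if x > 0.0 && x < 1.0 {
            Some(Rate::Decimal(x))
        } else {
            None
        }
    }
}

/// Stand-in for the preset table's rate column (rates 1/2 and 1/8).
const PRESET_RATES: [(u32, u32); 2] = [(1, 2), (1, 8)];

/// Exact-pair match, with no epsilon comparison on floats.
pub fn hits_preset(rate: &Rate) -> bool {
    match rate {
        Rate::Ratio(num, den) => PRESET_RATES.contains(&(*num, *den)),
        Rate::Decimal(_) => false,
    }
}
```

Matching on the exact pair means `1/2` and `0.5` take different paths even though they denote the same number, which is the caller-intent distinction the commit describes.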
Captures the changelog entries that were uncommitted in the working tree at the start of the Plan 0 / O / B / P / T refactor session. Describes the already-landed sumcheck work in the preceding five commits (a66f122 .. 312e220): inner-product sumcheck 2-coefficient round messages, RoundPolyEvaluator adoption, twin-constraint coefficient count reduction, and the ark-ff rev pin.
Covers every personal Claude Code artefact (custom agents, plans, skills, settings.json, settings.local.json) for this repo — not just the local-settings file that was already ignored. Nothing under .claude/ has ever been tracked here, so this is a purely belt-and-suspenders tightening: the next `git add .` can't pick up anything from the directory.
Resolutions:
- Cargo.toml: adopt main's clean 0.6 dep stack — drop the stale
[patch.crates-io] block (z-tech/smallfp-absorb branch no longer
exists; algebra rev 285dac2 was 0.5-era), bump dev-deps to 0.6.0
to match. Use main's spongefish 0.7.0 + ark-codes z-tech fork
pin. Keep cleanup's tracing/profile additions and `profile` feat.
- src/lib.rs, src/relations/{description.rs,r1cs/mod.rs,r1cs/hashchain/relation.rs},
src/serialize.rs, src/protocol/mod.rs, src/error.rs: take
cleanup's structural versions (phase modules, AccumulatorWitness
struct, gr1cs migration via into_inner()/get_predicate_num_constraints).
- src/crypto/merkle/blake3.rs, src/protocol/domainsep/mod.rs:
delete (cleanup intent — replaced by ark_crypto_primitives::merkle_tree::configs
and protocol/transcript/{prover,verifier}.rs respectively).
- src/utils/poly.rs: keep (still used by protocol/phases/batching.rs).
- src/lib.rs: rewrite two `sumcheck_verify` call sites to the
5-arg API used by effsc main (returns SumcheckResult{challenges,
final_claim} — caller does the oracle check externally).
- src/protocol/phases/twin_constraint.rs: handle effsc main's
`coefficient_lsb::final_value` calling convention — odd halves
arrive empty in the singleton case. Compute h(singleton) directly
(MLE of u at α_singleton + bundled R1CS at z, β_singleton, scaled
by τ_singleton) and emit `[h, -h]` so g(0)+g(1) == h.
cargo test: 33 passed (22 unit + 2 integration + 9 verifier-negative).
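The `[h, -h]` choice in the singleton case can be checked with a few lines of modular arithmetic. This sketch assumes the round message is a coefficient vector with the lowest degree first, so it encodes g(X) = h - h·X, and it uses the Goldilocks prime with plain `u128` arithmetic in place of the crate's field types; both are assumptions for illustration.

```rust
/// Goldilocks prime 2^64 - 2^32 + 1, held in a u128 so products do not overflow.
const P: u128 = 0xFFFF_FFFF_0000_0001;

/// Additive inverse modulo P.
fn neg(a: u128) -> u128 {
    (P - a % P) % P
}

/// Evaluates a polynomial given by coefficients (lowest degree first) at `x`,
/// using Horner's rule. Requires `x < P` and every coefficient `< P`.
fn eval(coeffs: &[u128], x: u128) -> u128 {
    coeffs.iter().rev().fold(0, |acc, &c| (acc * x + c) % P)
}

/// Round message for the singleton case as described above: g(X) = h - h*X.
fn singleton_round_message(h: u128) -> [u128; 2] {
    [h % P, neg(h)]
}
```

With these coefficients g(0) = h and g(1) = h - h = 0, so g(0) + g(1) = h, which is the identity the sumcheck verifier checks for the round.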
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
    verifier_state: &mut VerifierState<'a>,
    statement: &Self::Statement,
    inputs: Self::VerifierInputs,
) -> Result<(Self::ReducedStatement, Self::VerifierOutputs), VerifierError>;
I've found that the prover and verifier code have a lot in common (at least for arguments), particularly for computing the next statement given a transcript. Maybe the function (statement, transcript) -> reduced_statement is something we want to expose in the trait, so that implementations are less likely to diverge on that part between prover and verifier? I could see implementors start by copy-pasting the code from prover to verifier, then patching the code in the prover but forgetting to do so in the verifier, for example. Having a single place where this reduction happens would help prevent this failure mode.
Actually I think argus has a similar "issue", so anything we come up with here should be ported to argus too.
Okay, I'll write a draft. I think this is very useful, thanks!
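One possible shape for the shared reduction suggested in this thread, as a rough sketch only. The trait name `Reduction`, the `Transcript` associated type, and the toy implementation are hypothetical and not taken from the PR; the point is just that prover and verifier both call the same `reduce_statement`.

```rust
/// Hypothetical trait exposing the (statement, transcript) -> reduced_statement map
/// once, so prover and verifier cannot drift apart on it.
pub trait Reduction {
    type Statement;
    type Transcript;
    type ReducedStatement;

    /// Single source of truth for the next statement.
    fn reduce_statement(
        statement: &Self::Statement,
        transcript: &Self::Transcript,
    ) -> Self::ReducedStatement;
}

/// Toy instance: the statement is a number, the transcript a list of
/// challenges, and the reduction folds the challenges into the statement.
pub struct ToyReduction;

impl Reduction for ToyReduction {
    type Statement = u64;
    type Transcript = Vec<u64>;
    type ReducedStatement = u64;

    fn reduce_statement(statement: &u64, transcript: &Vec<u64>) -> u64 {
        transcript
            .iter()
            .fold(*statement, |acc, c| acc.wrapping_mul(31).wrapping_add(*c))
    }
}

/// Prover side: after writing the transcript, derive the next statement.
pub fn prover_next<R: Reduction>(s: &R::Statement, t: &R::Transcript) -> R::ReducedStatement {
    R::reduce_statement(s, t)
}

/// Verifier side: after reading the transcript, derive it the same way.
pub fn verifier_next<R: Reduction>(s: &R::Statement, t: &R::Transcript) -> R::ReducedStatement {
    R::reduce_statement(s, t)
}
```

Because both sides delegate to one associated function, a patch to the reduction lands in the prover and the verifier at once, which addresses the copy-paste failure mode described above.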
type ReducedStatement;
/// Oracles emitted by this IOR (prover view: full data, plus any private
/// reduced witness state).
type ProverOutputs;
I would split this into ProofString and ReducedWitness, since I expect that implementers will always have to split those two (the proof string is always sent to the verifier, the reduced witness is either fed into the next IOR or sent to the verifier).
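A rough sketch of what the suggested split might look like at the trait level. The trait name `IorProver`, the `prove` signature, and the toy folding step are hypothetical, and the toy is not a sound protocol; it only shows the two outputs travelling separately.

```rust
/// Hypothetical prover-side trait with the outputs split in two.
pub trait IorProver {
    type Statement;
    type Witness;
    /// Always sent to the verifier.
    type ProofString;
    /// Fed into the next IOR, or sent to the verifier at the end.
    type ReducedWitness;

    fn prove(
        &self,
        statement: &Self::Statement,
        witness: &Self::Witness,
    ) -> (Self::ProofString, Self::ReducedWitness);
}

/// Toy instance: the statement is a folding challenge r, the witness a vector
/// of even length. Illustrative only.
pub struct ToyFold;

impl IorProver for ToyFold {
    type Statement = u64;
    type Witness = Vec<u64>;
    type ProofString = [u64; 2];
    type ReducedWitness = Vec<u64>;

    fn prove(&self, r: &u64, witness: &Vec<u64>) -> ([u64; 2], Vec<u64>) {
        let mut sums = [0u64; 2];
        let mut folded = Vec::with_capacity(witness.len() / 2);
        for pair in witness.chunks_exact(2) {
            // Proof string: sums over even and odd positions.
            sums[0] = sums[0].wrapping_add(pair[0]);
            sums[1] = sums[1].wrapping_add(pair[1]);
            // Reduced witness: each pair folded with the challenge.
            folded.push(pair[0].wrapping_add(r.wrapping_mul(pair[1])));
        }
        (sums, folded)
    }
}
```

With two associated types, a composing IOR can forward `ReducedWitness` to the next reduction without ever touching the bytes that go to the verifier.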
    &self,
    verifier_state: &mut VerifierState<'a>,
    statement: &Self::Statement,
    inputs: Self::VerifierInputs,
I'm a bit confused by what inputs is supposed to be. It should express oracle access to previous proof strings, plus the one for the current reduction, right?
I'm wondering if we can express this oracle access as a tuple of partial functions fn: usize -> Option<Alphabet> that the verifier can query without reading (or seeing!) the whole proof string. This might also help for iBCS (where, essentially, the ARG wrapper around an IOP could implement such a partial function but additionally enforce that VC proofs must pass for the result to be Some(Alphabet)).
Also yes, I'll make a draft at this and we can see how it looks.
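The partial-function idea from this thread could look roughly like the following. Everything here is a hypothetical sketch: the trait `OracleAccess`, the two wrapper structs, and the `opening_ok` closure (a stand-in for verifying a vector-commitment opening) are not part of the PR.

```rust
/// Oracle access as a partial function `usize -> Option<Alphabet>`:
/// the verifier can query positions without holding the whole proof string.
pub trait OracleAccess {
    type Alphabet;
    fn query(&self, index: usize) -> Option<Self::Alphabet>;
}

/// IOP view: queries are answered straight from the proof string.
pub struct PlainOracle<'a, A> {
    pub symbols: &'a [A],
}

impl<'a, A: Clone> OracleAccess for PlainOracle<'a, A> {
    type Alphabet = A;
    fn query(&self, index: usize) -> Option<A> {
        self.symbols.get(index).cloned()
    }
}

/// Argument view: a symbol is only returned if its opening check passes.
/// `opening_ok` stands in for a real vector-commitment proof check.
pub struct CheckedOracle<'a, A, C> {
    pub symbols: &'a [A],
    pub opening_ok: C,
}

impl<'a, A: Clone, C: Fn(usize, &A) -> bool> OracleAccess for CheckedOracle<'a, A, C> {
    type Alphabet = A;
    fn query(&self, index: usize) -> Option<A> {
        let symbol = self.symbols.get(index)?;
        if (self.opening_ok)(index, symbol) {
            Some(symbol.clone())
        } else {
            None
        }
    }
}
```

A verifier written against `OracleAccess` stays the same whether it runs inside the plain IOP or behind the commitment-checking wrapper; only the type plugged in changes.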
Keeping this open for comments for a few days, but I want to move to this branch, where ark-vc + ark-mt are integrated: #25
Let's try to move over to #25, where ark-vc + ark-mt are integrated along with the suggestions above.
What does this PR do?
efficient-sumcheck #96 includes vectorization for Goldilocks (and its degree-2 and degree-3 extensions) for NEON and AVX-512.
Profiled breakdown on Goldilocks field (hash chain R1CS, n=4096):