feat(verify): validator pattern — full i32 + i64 op surface (#76)#133
Merged
Conversation
Promotes the validator-based verification strategy from PR #113's single-op (`I32Add`) prototype to production coverage of the i32 + i64 op surface. This is the CompCert untrusted-solver + trusted-validator pattern: the instruction selector (untrusted) produces a lowering, and `Z3ArmValidator` (trusted, simple, auditable) confirms each lowering is semantics-preserving via Z3 SMT — instead of writing thousands of Rocq lemmas by hand. i64 is the highest-value target: it's exactly where hand-written Rocq proofs are most painful, and the validator models the 64-bit semantics directly, then decomposes into the lo/hi 32-bit register pair the ARM lowering uses and asserts equivalence. Coverage now certified by `Z3ArmValidator`: - i32: arithmetic, logic, shifts/rotates, comparisons, unary, div/rem. - i64: the same arithmetic/logic/shift/comparison surface, modeled as 64-bit bitvectors decomposed to the (lo, hi) register pair. The validator is trusted code — the Z3 query construction is kept readable so a reviewer can confirm the query actually models each op's semantics; the non-obvious i64 lo/hi decomposition is commented. Tests: per-op Z3-validation tests plus negative tests (a deliberately wrong lowering must NOT certify — a validator that certifies everything is useless). The 3 prototype tests from #113 still pass. synth-verify: 133 tests passing. `docs/validator-pattern.md` updated — coverage section now reflects the i32+i64 surface instead of the I32Add prototype. Refs: #76, PR #113 (prototype), #73 (the Rocq-divergence problem this strategy sidesteps). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes #76. Promotes synth's validator-based verification from PR #113's single-op (
I32Add) prototype to production coverage of the full i32 + i64 op surface.This is CompCert's untrusted-solver + trusted-validator pattern: the instruction selector (untrusted) produces a lowering, and
Z3ArmValidator(trusted, simple, auditable) confirms each lowering is semantics-preserving via Z3 SMT — instead of writing thousands of Rocq lemmas by hand. It's the strategy that sidesteps the Rocq-divergence problem (#73).Coverage
Z3ArmValidatornow certifies:i64 is the highest-value target: it's exactly where hand-written Rocq proofs are most painful, so Z3 validation buys the most there.
Trusted-code discipline
The validator is trusted code. The Z3 query construction is kept readable so a reviewer can convince themselves each query genuinely models the op's semantics; the non-obvious i64 lo/hi decomposition is commented (the
i64_shl/i64_shr_s/i64_shr_ucase-split helpers).Tests
Per-op Z3-validation tests (the op certifies — the negated equivalence query is
unsat), plus negative tests: a deliberately wrong lowering must NOT certify — a validator that certifies everything is useless. The 3 prototype tests from #113 still pass. synth-verify: 133 tests passing.docs/validator-pattern.mdupdated — coverage section reflects the i32+i64 surface.Validation
cargo test --package synth-verify— 133 pass, 0 fail.cargo clippy --package synth-verify --all-targets -- -D warnings— clean.cargo fmt --check— clean.Test plan
Refs: #76 (closes), PR #113 (prototype), #73 (Rocq divergence this sidesteps).
🤖 Generated with Claude Code