Extract map_theoryt base class from arrayst#8992
Open
tautschnig wants to merge 8 commits into
Open
Conversation
There was a problem hiding this comment.
Pull request overview
This PR introduces a new map_theoryt base class to factor out map-/array-theoretic reasoning infrastructure from arrayst, adjusting the inheritance chain to arrayst -> map_theoryt -> equalityt -> prop_conv_solvert.
Changes:
- Added
map_theoryt(header + constructor TU) containing shared state and helpers for index/equality tracking, Ackermann constraints, and constraint counting. - Updated
arraystto inherit frommap_theorytand removed now-shared members fromarrays.h. - Re-scoped several previously-
arraysthelper methods tomap_theoryt.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
src/solvers/flattening/map_theory.h |
Introduces the new base class API and shared state for map/array reasoning. |
src/solvers/flattening/map_theory.cpp |
Adds map_theoryt constructor implementation. |
src/solvers/flattening/arrays.h |
Switches arrayst to inherit from map_theoryt and removes moved members. |
src/solvers/flattening/arrays.cpp |
Updates implementations to use map_theoryt for extracted functionality. |
Comments suppressed due to low confidence (1)
src/solvers/flattening/arrays.cpp:47
- Several
map_theorytmethod implementations (e.g.,record_array_index,collect_indices,collect_arrays,add_array_constraint, etc.) are now defined inarrays.cpp. This undermines the stated goal of separating map-theoretic reasoning from array-specific encoding and makes linkage/dependencies less clear. Consider movingmap_theorytmethod definitions intomap_theory.cppand leavingarrays.cppto only implementarrayst-specific behavior.
void map_theoryt::record_array_index(const index_exprt &index)
{
// we are not allowed to put the index directly in the
// entry for the root of the equivalence class
// because this map is accessed during building the error trace
std::size_t number=arrays.number(index.array());
if(index_map[number].insert(index.index()).second)
update_indices.insert(number);
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #8992 +/- ##
===========================================
+ Coverage 80.56% 80.58% +0.01%
===========================================
Files 1707 1709 +2
Lines 189134 189103 -31
Branches 73 73
===========================================
- Hits 152385 152380 -5
+ Misses 36749 36723 -26 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
352d242 to
b1fbf98
Compare
…ions test_json.desc was added as KNOWNBUG in 2020 (commit ecb0a5d) asserting an aspirational JSON shape that cbmc never produced. Two flaws: 1. The exit code assertion '^EXIT=0$' did not match cbmc's actual exit. main.c contains 'assert(a[index] == 1)' with a non-deterministic 'index', which fails verification — cbmc exits with 10. (The two sibling tests, test.desc and test_xml.desc, exercise '--show-array- constraints' without '--json-ui' / with '--xml-ui'; cbmc rejects those combinations and exits 1, which they correctly assert.) 2. The regex asserted list-shaped JSON values, e.g. '"arrayWith": [.*]', but the implementation has always emitted scalar counts: '"arrayWith": 2'. The test was therefore unmatchable on day one. Fix both: assert '^EXIT=10$' and rewrite the regex to match scalar counts ('\\d+'). Drop the KNOWNBUG tag so the JSON contract for '--show-array-constraints' is now pinned by an actually-running test. The test only applies to the default SAT backend: under '--cprover-smt2' or '--incremental-smt2-solver z3' cbmc bypasses the array theory entirely and emits no 'arrayConstraints' JSON block (and '--cprover-smt2' returns a different exit code). Tag the test 'broken-cprover-smt-backend no-new-smt' so it is skipped under those profiles, matching the convention used by other arrayst-dependent tests such as Array_operations2 / Bitfields1 / Computed-Goto1. Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Separate map-theoretic reasoning (index tracking, equality tracking, Ackermann constraints, read-over-weakeq, extensionality, CDCL(T) propagation, constraint counting) from array-specific encoding (with/if/of/comprehension constraints, lazy selects, 2D inlining). Inheritance chain: arrayst -> map_theoryt -> equalityt -> prop_conv_solvert Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
…array'
In a map theory the tracked expressions are 'keys' and their collection
per equivalence class is the 'domain'. Rename identifiers, type
aliases, function names, and comments accordingly:
index_sett → key_sett
index_mapt → domain_mapt
index_map → domain_map
update_indices → update_keys
collect_indices() → collect_keys()
update_index_map() → update_domain_map()
record_array_index()→ record_array_key()
Rename all map_theoryt members, methods, types, enum values, and
comments to use map-theory terminology instead of array-specific terms:
Types:
array_equalityt → map_equalityt
array_equalitiest → map_equalitiest
array_constraint_countt → map_constraint_countt
Members:
array_equalities → map_equalities
arrays (union-find) → maps
lazy_array_constraints → lazy_constraints
lazy_arrays → lazy_dispatch
get_array_constraints → get_constraints
array_constraint_count → constraint_count
Methods:
record_array_equality → record_equality
record_array_let_binding→ record_let_binding
record_array_key → record_key
collect_arrays → collect_maps
is_unbounded_array → is_unbounded_map
add_array_constraint → add_map_constraint
add_array_Ackermann_constraints → add_Ackermann_constraints
add_array_constraints_equality → add_map_equality_constraints
display_array_constraint_count → display_constraint_count
finish_eager_conversion_arrays → finish_eager_conversion_maps
Enum values: ARRAY_* → MAP_*
C-level IR names (index_exprt, ID_index, etc.) are unchanged as they
refer to the array-indexing operation, not the map-theory concept.
Array-specific IR names (array_typet, index_exprt, ID_array,
array_comprehension_args, etc.) and arrayst-specific method names
(add_array_constraints, etc.) are unchanged.
Note that the JSON output of '--show-array-constraints --json-ui'
is intentionally NOT renamed: the user-visible strings stay
'arrayConstraints', 'arrayEquality', 'arrayWith', etc. The constraints
this option reports are recorded by 'arrayst' (not by 'map_theoryt'),
the CLI option name says 'array', and any consumer of the JSON output
shouldn't be forced to track an internal terminology change. The
mapping from internal MAP_* enum tags to 'arrayX' strings lives in
'enum_to_string'; a comment there explains the asymmetry.
Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Document all public and protected methods, key data members, enums, and structs in map_theory.h with Doxygen comments following the CBMC coding standard. Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
A map is unbounded by definition; the unboundedness check that the
prior commits called 'is_unbounded_map' actually distinguishes between
arrays the array theory should manage and bounded arrays that the
flattening layer bit-blasts directly. That distinction is array-
specific and has no place on a map_theoryt base.
Rename the method back to 'is_unbounded_array' and move both its
declaration and its only override down to arrayst / boolbvt.
The same applies to 'collect_keys': it walks an expression looking
specifically for 'index_exprt' over array_typet operands, tracks
'array_comprehension_args' for ID_array_comprehension, and uses
'is_unbounded_array' to decide whether to record the index. None of
that is map-theoretic. Move both overloads of 'collect_keys' (and
'array_comprehension_args') from map_theoryt to arrayst.
The remaining map_theoryt primitives ('record_key',
'update_domain_map', 'add_Ackermann_constraints',
'add_map_equality_constraints') stay where they are: they are
genuinely about maps (key sets, equivalence classes, Ackermann
implications) and do not depend on array shape.
Side-effect: 'finish_eager_conversion_maps' becomes pure virtual on
map_theoryt, since its previous default body called the now-moved
'collect_keys'. Both existing overrides ('arrayst::add_array_constraints'
and 'bv_refinementt::finish_eager_conversion_maps') already provided
their own implementation, so no caller is affected.
Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical, behaviour-preserving tidying of the new map_theoryt class
and its arrayst subclass:
* Drop dead arrayst::message_handler member. The base equalityt
already takes a message_handler via the messaget log, so storing
it again on arrayst was redundant; nothing read the field.
* Drop dead incremental_cache flag and the expr_map cache it
guarded. No code path ever set incremental_cache to true, so the
cache was unused.
* Fix typo 'contraint_type_string' -> 'constraint_type_string' in
display_constraint_count.
* Align map_theory.cpp's author tag with map_theory.h (Michael
Tautschnig). The .cpp file was extracted in this PR; the
leftover Daniel Kroening tag came from the arrays.cpp ancestry.
* Add explicit '#include <map>' to map_theory.h. domain_mapt and
map_constraint_countt are std::map aliases; the include only
came in transitively via util/union_find.h.
* Drop spurious 'virtual' on record_key and collect_maps. Neither
is overridden in any subclass (arrayst doesn't override either),
so the keyword is misleading and prevents devirtualisation.
* Rename 'update_keys' -> 'dirty_classes'. The set holds
equivalence-class numbers (returned by union_find::number()),
not keys; the original 'update_indices' had the same flaw and
the rename in this PR was the chance to fix it.
* Rename 'lazy_dispatch' -> 'defer_constraints'. The boolean
controls whether refine=true constraints passed to
add_map_constraint are deferred for the refinement loop;
'defer_constraints' makes that role obvious at the call sites.
* Rename '_get_constraints' / 'get_constraints' constructor
parameter and member to 'collect_constraint_stats'. The old
name read like a getter and clashed visually with the
neighbouring 'constraint_count'. Propagated through
arrayst, boolbvt, bv_pointerst, bv_pointers_widet, and the
one solver_factoryt call site that still used the old
'get_array_constraints' shape.
Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
The two parallel enums on map_theoryt — lazy_typet (used to classify
deferred constraints in the refinement loop) and constraint_typet
(used for per-kind statistics in --show-array-constraints) — were
near-duplicates, differing only by the additional MAP_EQUALITY entry
in constraint_typet. Maintaining them in lock-step invited
forgotten-tag bugs, and every constraint-adding call site had to
mention both names:
lazy_constraintt lazy(lazy_typet::MAP_X, body);
add_map_constraint(lazy, refine);
++constraint_count[constraint_typet::MAP_X];
Replace both with a single map_constraint_kindt enum that carries
all nine kinds (MAP_EQUALITY is never enqueued lazily, but having it
in the same enum is harmless because lazy enqueueing is decided per
add_map_constraint call, not per enum value).
Add an inline helper
void add_constraint(
map_constraint_kindt kind,
const exprt &body,
bool refine = false);
that does the lazy_constraintt construction, the add_map_constraint
call, and the constraint_count bump in one go. Rewrite the eleven
call sites (ten in arrays.cpp's add_array_constraints_* helpers,
plus the Ackermann site in map_theory.cpp::add_Ackermann_constraints)
to use it. The MAP_EQUALITY counter in add_map_equality_constraints
stays as a direct constraint_count[]++ because that constraint is
asserted via prop.lcnf, not via add_map_constraint.
Net result: one enum to keep in sync, one call per constraint
site, and ~30 lines saved across the refactored files.
The user-visible JSON output of '--show-array-constraints' is
unaffected: enum_to_string still maps MAP_X to 'arrayX' strings.
Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
convert_let computes 'lowered_value' (the let's bound expression with
byte_update / byte_extract operators rewritten away by
lower_byte_operators) but then dropped it on the floor and passed
the *un*-lowered 'pair.second' to record_let_binding. If the
let-bound array value happens to contain a byte_update, the array
theory's collect_maps will see it and trip
DATA_INVARIANT(false,
"byte_update should be removed before collect_maps")
Fix: pass 'lowered_value' so the value handed to the array theory
matches the lowering already done at the call site.
The bug was introduced by commit 14105d5 ("Connect let-bound
arrays to originals in array theory"), which added the byte-operator
lowering computation but forgot to thread the result through to
record_let_binding.
Note on regression coverage: I attempted to construct a test case
that triggers the DATA_INVARIANT, but could not. The set of CBMC
input paths that produce an array-typed let_exprt (which is the
gate at line 78 above) is currently limited to:
* SMT2 input parsed by smt2_parser.cpp — but SMT2 has no
byte_update construct, so the let value cannot contain one.
* value_set_dereference.cpp — produces let-bound *pointer*-typed
values, not array-typed; the gate above filters them out.
So the bug is currently unreachable through standard CBMC entry
points, and the fix is defensive against a future code path that
might funnel byte-operator-bearing array expressions into a let.
The existing regression/smt2_solver/let-array/let-array.smt2 test
continues to exercise the convert_let array-typed branch and passes
unchanged with the fix.
Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Separate map-theoretic reasoning (index tracking, equality tracking, Ackermann constraints, read-over-weakeq, extensionality, CDCL(T) propagation, constraint counting) from array-specific encoding (with/if/of/comprehension constraints, lazy selects, 2D inlining).
Inheritance chain: arrayst -> map_theoryt -> equalityt -> prop_conv_solvert
Likely best reviewed commit-by-commit: the second commit introduces a lot of renaming, which impacts call sites. No (intended) functional changes in the second commit.