Skip to content

Pass irep_idt by value (C++ Core Guidelines F.16) + CI checker#9014

Open
tautschnig wants to merge 26 commits into
diffblue:developfrom
tautschnig:f16-checker
Open

Pass irep_idt by value (C++ Core Guidelines F.16) + CI checker#9014
tautschnig wants to merge 26 commits into
diffblue:developfrom
tautschnig:f16-checker

Conversation

@tautschnig
Copy link
Copy Markdown
Collaborator

irep_idt is a dstringt, which holds a single unsigned table index. Passing it by const irep_idt & was historically defensible because irep_idt could be aliased to std::string via USE_STD_STRING -- but that compatibility was removed in 4d935bf14f ("Remove support for irep_idt-as-std::string"). With dstringt as the only target, by-value is strictly cheaper than by const reference and aligns with C++ Core Guidelines F.16.

What's in this PR

  • A CI check (scripts/check_pass_by_value.cpp + runner + .github/workflows/syntax-checks.yaml, modelled on the existing check-irep-copies job) that flags any new parmVarDecl of const T & where T is in a configurable cheap-types list (dstringt for now). Matches both regular method/function parameters and function-type-template-arg parameters; --fix rewrites only the former in place.
  • The 2,049 existing violations cleaned up, applied module-by-module via the same tool with --fix and formatted with git-clang-format-15 so unrelated lines stay untouched.

Stats

527 files, +2,996/−2,694. Findings: 2,049 → 0.

Testing

unit (551 cases), java-unit (110 cases), and ctest -L CORE (96 tests) all pass. git-clang-format-15 clean.

Notes for reviewers

  • The first six commits (checker + tool refinements) are independently useful; happy to split if preferred.
  • Twelve manual fixes were folded into the matching per-module commits: eleven const irep_idt & member declarations that would have become dangling references after their initialising parameter switched to by-value, and one return-by-reference helper in jbmc/unit/.
  • std::function<…(const irep_idt &…)…> typedefs were updated to std::function<…(irep_idt …)…> in lockstep with the lambdas/callables bound to them.

Limitations / follow-ups

  • The matcher does not catch anonymous parameters in non-instantiated typedefs of std::function (Clang doesn't synthesise a ParmVarDecl for them). None exist on develop; a FunctionProtoTypeLoc walker would close the gap if any reappear.
  • Extending CHEAP_TYPES (e.g. to goto_programt::const_targett) is a one-line change and the obvious next cleanup.

ABI

Source-compatible for callers; out-of-tree consumers of libcbmc need a recompile (the mangled names of f(irep_idt) and f(const irep_idt &) differ).

  • Each commit message has a non-empty body, explaining why the change was made.
  • n/a Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • n/a The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • n/a White-space or formatting changes outside the feature-related changed lines are in commits of their own.

tautschnig and others added 26 commits May 26, 2026 12:38
Adds scripts/check_pass_by_value.cpp, a Clang LibTooling AST-matcher
program that flags function parameters of cheap-to-copy types declared
as 'const T &'. This is a violation of C++ Core Guidelines F.16
("for in parameters, pass cheaply-copied types by value and others by
reference to const").

The set of cheap types is configured in CHEAP_TYPES at the top of the
file. It currently lists dstringt (the canonical name for irep_idt,
which holds a single unsigned table index, so passing it by value is
strictly cheaper than passing by const reference).

Skips:
  * parameters of copy/move constructors and copy/move assignment
    operators (the language requires their signature)
  * parameters of function-template instantiations
  * member functions of class-template specialisations
  * parameters whose source-as-written type is a (substituted)
    template type parameter or otherwise dependent

Build with the same clang-18 flags used by check_irep_moves.cpp; the
resulting scripts/check-pass-by-value binary is .gitignored.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Adds scripts/run_pass_by_value_check.sh, which:

* (re)builds scripts/check-pass-by-value if needed (mirrors the build
  recipe used for scripts/check-irep-moves on origin/irept-copies),
* runs the tool over src/ and jbmc/src/ in parallel via xargs, with
  each worker writing to its own per-pid temp file to avoid stdout
  interleaving across workers,
* normalises absolute paths in the output to repo-relative paths so
  that the baseline is portable,
* diffs the result against scripts/pass_by_value_baseline.txt:
    * new entries cause an exit code of 1 with a clear error message,
    * removed entries are reported as a notice (no failure) with the
      regenerate command,
* supports --regenerate-baseline as the second positional argument.

Wall-clock cost on a 36-core box is ~55 s.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Records the 2049 currently-known sites where parameters of cheap-to-copy
types (dstringt / irep_idt) are passed by const reference.  These are
all real F.16 violations to be fixed incrementally; the baseline
mechanism allows CI to gate against introducing new ones without
requiring the wholesale fix in a single change.

Regenerate after a clean-up commit with:

    scripts/run_pass_by_value_check.sh build --regenerate-baseline

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Adds a check-pass-by-value job to .github/workflows/syntax-checks.yaml
that builds scripts/check-pass-by-value and runs the runner script on
every pull request to develop.  PRs introducing new F.16 violations
(parameters of cheap-to-copy types passed by const reference) will
fail this job; PRs that reduce the number of violations will pass and
emit a notice inviting the contributor to regenerate the baseline.

Mirrors the existing check-irep-copies job in shape and runtime budget.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
With --fix, the tool runs as a clang::tooling::RefactoringTool and
emits a Replacement for each finding that rewrites the parameter's
type from `const T &` to `T `, preserving the author's spelling
(`irep_idt` vs. `dstringt`).  The replacement range spans from the
parameter declaration's BeginLoc (start of `const`) to the parameter
name's Location, which both covers the `const` keyword and absorbs
any whitespace between `&` and the name.

Limitations: only ParmVarDecls are touched.  Function-pointer
typedefs, std::function template arguments, and friend declarations
that mention `const irep_idt &` are not rewritten and remain in the
baseline as a follow-up.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Three fixes uncovered while applying the rewriter to src/util:

* Add --path-filter so we can run on the full TU set (needed to ensure
  every header gets visited) but only modify files whose path contains
  the given substring.  Headers are sometimes only included from .cpps
  outside their owning module.

* Apply replacements manually rather than via runAndSave: the latter
  bails out without saving when any TU has a parse error, which we hit
  for optional solver back-ends with missing third-party headers.

* Skip ParmVarDecls that are not in their enclosing FunctionDecl's
  getParamDecl() list.  These show up for parameter names inside
  function-type template arguments such as
  `std::function<void(const irep_idt &k)>`, and rewriting them would
  desynchronise the in-class declaration from its out-of-class
  definition.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
Mechanical conversion of `const irep_idt &` parameters to pass-by-value
via scripts/check_pass_by_value.cpp --fix.  Aligns with C++ Core
Guidelines F.16 (cheap-to-copy types should be passed by value).
The author's chosen spelling (irep_idt vs. dstringt) is preserved.
Formatting was applied via git-clang-format-15 so unrelated lines
remain untouched.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
The earlier --fix mode was deliberately strict about which parameters
to rewrite, returning early on parameters that are not directly owned
by their enclosing FunctionDecl (e.g. parameter names inside a
function-type template argument like
\`std::function<void(const irep_idt &k)>\`).  That early-return also
suppressed the warning, which means CI would not catch regressions
where someone added a new such site.

Split the logic so:

* The "is this fixable in place" decision (parameter is in FD's own
  getParamDecl() list) only suppresses the Replacement, not the
  warning.
* All other guards (system header, copy/move ctor or assignment,
  function-template instantiation, member of class-template
  specialisation) continue to suppress the warning entirely.
* The dependent-type filter now walks the type-sugar chain rather
  than only the top-level type, so e.g. \`const value_type &\` --
  where value_type is a typedef inside a class template instantiated
  with a cheap key type -- is still recognised as template-derived
  and skipped.

Also documents one remaining limitation: anonymous parameters in
non-instantiated typedefs of std::function (e.g.
\`using cb = std::function<void(const irep_idt &)>\` at namespace
scope) don't produce a ParmVarDecl in Clang's AST and so are not
caught.  A future TypeLoc-based walker could close this gap.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
After the cleanup commits, scripts/check-pass-by-value reports zero
violations across src/, jbmc/src/, unit/ and jbmc/unit/.  The baseline
is regenerated to an empty file, so the CI check effectively becomes
"fail on any new pass-by-value violation of a cheap-to-copy type".

Also: the runner no longer treats an empty tool output as an error.
That state was previously unreachable (the baseline always had
~2,000 entries) and now is the steady state.

The baseline mechanism itself is retained for forward compatibility:
when CHEAP_TYPES is extended (for example to add
goto_programt::const_targett), the new findings can be temporarily
baselined while the corresponding cleanup PR is staged.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
@tautschnig tautschnig self-assigned this May 26, 2026
Copilot AI review requested due to automatic review settings May 26, 2026 20:06
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot wasn't able to review this pull request because it exceeds the maximum number of files (300). Try reducing the number of changed files and requesting a review from Copilot again.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 26, 2026

Codecov Report

❌ Patch coverage is 92.18750% with 20 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.55%. Comparing base (808dae2) to head (97d76d8).

Files with missing lines Patch % Lines
src/goto-instrument/contracts/cfg_info.h 62.50% 3 Missing ⚠️
jbmc/src/java_bytecode/remove_exceptions.cpp 71.42% 2 Missing ⚠️
src/goto-checker/properties.cpp 33.33% 2 Missing ⚠️
...strument/contracts/dynamic-frames/dfcc_library.cpp 60.00% 2 Missing ⚠️
...c/java_bytecode/java_single_path_symex_checker.cpp 0.00% 1 Missing ⚠️
jbmc/src/java_bytecode/lazy_goto_functions_map.h 80.00% 1 Missing ⚠️
src/analyses/flow_insensitive_analysis.cpp 0.00% 1 Missing ⚠️
src/analyses/goto_rw.cpp 0.00% 1 Missing ⚠️
src/analyses/goto_rw.h 0.00% 1 Missing ⚠️
src/cpp/cpp_scope.cpp 66.66% 1 Missing ⚠️
... and 5 more
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #9014      +/-   ##
===========================================
- Coverage    80.55%   80.55%   -0.01%     
===========================================
  Files         1707     1707              
  Lines       189051   189126      +75     
  Branches        73       73              
===========================================
+ Hits        152299   152359      +60     
- Misses       36752    36767      +15     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@kroening
Copy link
Copy Markdown
Collaborator

Does this actually yield a speed-up?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants