Shared/Java: Introduce a shared control flow reachability library and replace the Java Nullness implementation. #20367

aschackmull · 2025-09-04T13:07:21Z

This introduces a new shared library for computing control flow reachability. The computation follows the value of a single variable through value-preserving SSA definitions (i.e. phi nodes, plus optionally uncertain writes). Splitting on finally blocks and other variables is performed to verify the validity of the computed path.
Two output predicates are exposed: flow, which indicates a path from a source to a sink, and escapeFlow, which indicates that a source may escape in the sense that it circumvents all sinks.
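To illustrate why splitting on finally blocks matters for path validity, here is a minimal Java sketch (not taken from the PR's tests; the class and method names are hypothetical). A finally block is entered on both normal and exceptional completion, so an analysis that does not remember how the block was entered could combine "exception thrown while `s` is still null" with "finally exits normally" and report the dereference. The sketch ignores the separate possibility that the supplier itself returns null.

```java
import java.util.function.Supplier;

public class FinallySplitExample {
    static int cleanups = 0;

    // Hypothetical example: `s` is still null inside the finally block
    // only when supplier.get() threw.
    static int lengthAfterCleanup(Supplier<String> supplier) {
        String s = null;
        try {
            s = supplier.get(); // may throw, leaving s == null
        } finally {
            cleanups++; // runs on both normal and exceptional completion
        }
        // Reached only when the try block completed normally, so the
        // initial null assignment cannot reach this dereference.
        return s.length();
    }

    public static void main(String[] args) {
        System.out.println(lengthAfterCleanup(() -> "abc"));
        try {
            lengthAfterCleanup(() -> { throw new IllegalStateException("boom"); });
        } catch (IllegalStateException e) {
            // The exception leaves the method through the finally block;
            // the dereference after it is never executed.
            System.out.println("exception propagated");
        }
    }
}
```

The path from `s = null` to `s.length()` exists in the control flow graph, but only by entering the finally block abruptly and leaving it normally, which is the kind of mismatch the finally splitting is described as ruling out.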

This new library is then instantiated for Java and used to replace the Java Nullness analysis. This generally improves precision, although a few feature gaps may cause regressions compared to the old implementation. These gaps are:

- ~~Properly observing certain assertion calls related to nullness.~~
- Splitting based on correlated conditions of the form x == y.
- Splitting based on correlated conditions of the form x instanceof Foo.
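As a rough illustration of the two remaining gaps, the following Java sketch shows the shape of code in which an assignment and a later dereference are guarded by the same condition. The class and method names are hypothetical, and the PR does not state whether these exact snippets are reported; they only show the pattern that "correlated conditions" refers to.

```java
public class CorrelatedConditions {
    // Correlated condition of the form `x == y`: the assignment and the
    // dereference are guarded by the same equality test.
    static int viaEquality(Object x, Object y) {
        String s = null;
        if (x == y) {
            s = "same";
        }
        if (x == y) {
            return s.length(); // safe at run time: s was assigned above
        }
        return 0;
    }

    // Correlated condition of the form `x instanceof Foo`.
    static int viaInstanceof(Object o) {
        String s = null;
        if (o instanceof String) {
            s = (String) o;
        }
        if (o instanceof String) {
            return s.length(); // safe at run time: s was assigned above
        }
        return 0;
    }
}
```

Both dereferences are safe, but proving that requires noticing that the second test of each condition has the same outcome as the first.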

Commit-by-commit review is encouraged.

java/ql/lib/semmle/code/java/controlflow/ControlFlow.qll

@@ -0,0 +1,54 @@
+import java
+private import codeql.controlflow.ControlFlow
+private import semmle.code.java.dataflow.SSA as SSA


java/ql/lib/semmle/code/java/dataflow/Nullness.qll

shared/controlflow/codeql/controlflow/ControlFlow.qll

Copilot

Pull Request Overview

This PR introduces a new shared control flow reachability library and replaces the Java Nullness implementation with it. The shared library tracks a single variable through value-preserving SSA definitions and uses splitting on finally blocks and guard values to improve precision. The Java Nullness analysis is then updated to use this new approach, which provides more accurate tracking of null values through control flow paths.

Key changes:

- Added shared control flow reachability computation with finally block tracking
- Implemented integer range guard values and improved guard value intersection logic
- Simplified Java Nullness implementation to use the new shared library

Reviewed Changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.

File	Description
shared/controlflow/codeql/controlflow/Guards.qll	Added integer range guard values and intersection logic for improved guards analysis
shared/controlflow/codeql/controlflow/ControlFlow.qll	New shared control flow reachability library with SSA tracking and finally block handling
java/ql/lib/semmle/code/java/controlflow/ControlFlow.qll	Java instantiation of the shared control flow library
java/ql/lib/semmle/code/java/controlflow/BasicBlocks.qll	Updated to use new successor type system from shared library
java/ql/lib/semmle/code/java/ControlFlowGraph.qll	Added successor type mapping for control flow edges
java/ql/lib/semmle/code/java/dataflow/Nullness.qll	Simplified nullness analysis to use new shared library
java/ql/test/query-tests/Nullness/*.java	Updated test cases reflecting precision improvements
java/ql/test/query-tests/Nullness/NullMaybe.expected	Expected test results with improved precision

Copilot · 2025-09-04T13:15:32Z

shared/controlflow/codeql/controlflow/Guards.qll

+    TIntRange(int bound, Boolean upper) {
+      exists(ConstantExpr c | c.asIntegerValue() + [-1, 0, 1] = bound) and
+      bound != 2147483647 and
+      bound != -2147483648
+    } or


[nitpick] The integer bounds 2147483647 and -2147483648 appear to be hardcoded limits for 32-bit integers. Consider using named constants or a more explicit representation to make the intent clearer and improve maintainability.

shared/controlflow/codeql/controlflow/ControlFlow.qll

aschackmull · 2025-09-05T09:55:39Z

DCA shows a bit too much precision regression, so I'm taking this back into draft while I investigate.

shared/controlflow/codeql/controlflow/ControlFlow.qll

michaelnebel · 2025-09-15T07:54:53Z

shared/controlflow/codeql/controlflow/Guards.qll

    TValue(TAbstractSingleValue val, Boolean isVal) or
+    TIntRange(int bound, Boolean upper) {
+      exists(ConstantExpr c | c.asIntegerValue() + [-1, 0, 1] = bound) and
+      bound != 2147483647 and


Total corner-case / nit question: if c holds the maximum integer value 2147483647, shouldn't we then have TIntRange(2147483646, _) and TIntRange(2147483647, _) (the latter is excluded)?

For edge cases like TIntRange(2147483647, upper) we can easily run into trouble: If upper = false then this is likely a singleton value set, and quite unlikely as a useful bound. And with upper = true, the bound becomes trivial assuming that it applies to e.g. a Java int, but worse is that the default computation of its dual lower bound overflows in QL. Simply excluding the edge cases as valid bounds helps avoid that.

I'll add a comment in the code.
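The overflow concern can be reproduced with 32-bit arithmetic in Java. This is only a sketch under an assumption: it models the dual of an upper bound `x <= bound` as the lower bound `x >= bound + 1` (and symmetrically for lower bounds), which is a guess at the shape of the computation rather than the library's actual definition. The class and method names are hypothetical.

```java
public class IntRangeDual {
    // Hypothetical model of the dual bound: the complement of `x <= bound`
    // is `x >= bound + 1`, and the complement of `x >= bound` is
    // `x <= bound - 1`. The exact-arithmetic helpers throw
    // ArithmeticException on 32-bit overflow instead of wrapping around.
    static int dualBound(int bound, boolean upper) {
        return upper ? Math.addExact(bound, 1) : Math.subtractExact(bound, 1);
    }

    // True if computing the dual bound leaves the 32-bit int range.
    static boolean dualOverflows(int bound, boolean upper) {
        try {
            dualBound(bound, upper);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    // Mirrors the filter in the diff: the two extreme values are not bounds.
    static boolean isAllowedBound(int bound) {
        return bound != Integer.MAX_VALUE && bound != Integer.MIN_VALUE;
    }
}
```

Under this model, every bound that passes `isAllowedBound` has a representable dual, which matches the stated motivation for excluding the two extremes.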

michaelnebel · 2025-09-15T09:19:57Z

shared/controlflow/codeql/controlflow/ControlFlow.qll

+      not irrelevantFinally(finally) and
+      bb1.getASuccessor(t) = bb2 and
+      n1 = getEnclosingAstNode(bb1.getLastNode()) and
+      n2 = getEnclosingAstNode(bb2.getNode(0)) and
+      inFinally(n1, finally) and
+      not inFinally(n2, finally) and
+      if t instanceof AbruptSuccessor then abrupt = true else abrupt = false


Could this be re-factored into helper predicate(s) and re-used?
The logic tries to capture whether we are on the "boundary" of a finally block (the same goes for entersFinally).
Would it be possible to do something like

private predicate auxFinally(BasicBlock bb1, BasicBlock bb2, FinallyBlock finally) {
  exists(AstNode n1, AstNode n2 |
    not irrelevantFinally(finally) and
    bb1.getASuccessor() = bb2 and
    n1 = getEnclosingAstNode(bb1.getLastNode()) and
    n2 = getEnclosingAstNode(bb2.getNode(0)) and
    not inFinally(n1, finally) and
    inFinally(n2, finally)
  )
}

private boolean isAbrupt(BasicBlock bb1, BasicBlock bb2) {
  exists(SuccessorType t |
    bb1.getASuccessor(t) = bb2 and
    if t instanceof AbruptSuccessor then result = true else result = false
  )
}

Or maybe this is just overcomplicating things 😄

It's true that there's a certain overlap in the logic between entersFinally and leavesFinally, but I'm not sure that I see a nice way to share that overlap. We can't really project n1 and n2 away before we get to the part that makes the predicates different, so your auxFinally above only applies to one of the cases, and then we're stuck with a "helper" predicate with a lot of columns. Also, the most restrictive part of these predicates is likely the inFinally part, which is what's different, so I expect a pipeline to either start there or join it early. A "helper" predicate refactor would then possibly need to be inlined, and that obscures reading with join orders in mind.

Oh yes, my bad, we can't have the getASuccessor in the helper predicate - then we definitely would need to inline to avoid a combinatorial explosion.
Also, thank you for thinking about it.

michaelnebel · 2025-09-15T09:27:50Z

shared/controlflow/codeql/controlflow/ControlFlow.qll

+   * A stack of split values to track whether entered finally blocks have
+   * waiting completions.
+   */
+  private class FinallyStack extends TFinallyStack {


Would it make sense to try and make a parameterized module (not in this PR) for constructing finite size lists/stacks?

Possibly yes. Do we know of any other use-cases?

Hmm, the other list-like implementations I could find have more complicated constraints on the Cons-like branch (typically involving the list itself), so maybe there are no cases where it can be re-used.

michaelnebel · 2025-09-15T09:29:43Z

shared/controlflow/codeql/controlflow/ControlFlow.qll

+  class GuardValue {
+    string toString();
+
+    GuardValue getDualValue();


A QlDoc for this predicate would be nice.

michaelnebel · 2025-09-15T09:36:40Z

shared/controlflow/codeql/controlflow/ControlFlow.qll

+  }
+
+  /** An input configuration for control flow reachability. */
+  signature module ConfigSig {


Very nice that the predicates are named similarly to those of dataflow configurations.

michaelnebel

Looks plausible to me!
As always, very beautiful!

michaelnebel

Looks plausible to me. 🎉

hvitved · 2025-09-16T09:03:38Z

shared/controlflow/codeql/controlflow/ControlFlow.qll

+  LocationSig Location, BB::CfgSig<Location> Cfg,
+  InputSig<Location, Cfg::ControlFlowNode, Cfg::BasicBlock> Input>
+{
+  private module Cfg_ = Cfg;


Why is this alias needed?

Because imports resolve to file modules instead of local names when there's an ambiguity - and this file sits next to Cfg.qll.

hvitved · 2025-09-16T09:12:36Z

shared/controlflow/codeql/controlflow/ControlFlow.qll

+     * Holds if the value of `def` at `node` is a source for the reachability
+     * computation.
+     */
+    predicate source(ControlFlowNode node, SsaDefinition def);


Should this be named isSource instead like in data flow configs (and same for the other predicates)?

In a vacuum I think I prefer the shorter names, but I have no strong opinion here.

hvitved · 2025-09-16T09:15:02Z

shared/controlflow/codeql/controlflow/ControlFlow.qll

+  /**
+   * Constructs a control flow reachability computation.
+   */
+  module Flow<ConfigSig Config> {


Should this be called Local or perhaps LocalFlow instead?

I don't think we'll add a Global counterpart - so I prefer the simple Flow name here.

aschackmull requested a review from a team as a code owner September 4, 2025 13:07

Copilot AI review requested due to automatic review settings September 4, 2025 13:07

github-actions bot added the Java label Sep 4, 2025

github-advanced-security bot found potential problems Sep 4, 2025

Copilot AI reviewed Sep 4, 2025

aschackmull force-pushed the shared/controlflow branch from 1c40266 to 209628f Compare September 4, 2025 13:20

aschackmull marked this pull request as draft September 5, 2025 09:55

aschackmull force-pushed the shared/controlflow branch from e534cb0 to 0b5bb6a Compare September 8, 2025 08:46

github-advanced-security bot found potential problems Sep 8, 2025

aschackmull force-pushed the shared/controlflow branch 3 times, most recently from 5ab759a to b9f5f55 Compare September 8, 2025 09:44

aschackmull mentioned this pull request Sep 12, 2025

Java: Consolidate Assertions.qll and Preconditions.qll. #20377

Merged

aschackmull added 4 commits September 12, 2025 13:38

Java: Preparatory Nullness refactor.

db1f399

Guards: Support integer ranges.

1ebdcdf

Java: Improve precision of SuccessorType labels in CFG.

924a8ea

Java: Add some more nullness tests.

452bbf7

aschackmull force-pushed the shared/controlflow branch from b9f5f55 to 2e075c8 Compare September 12, 2025 11:38

aschackmull added 7 commits September 12, 2025 15:41

Shared: Add control flow reachability lib.

4a8ffea

Java: Replace nullness implementation.

03321ff

Java: Clean up IntegerGuards.qll

60d07cf

Java: Accept guards test results.

e8f1ec6

Guards: Include ConditionalExpr in exprHasValue.

2743fc0

Java: Minor nullness cleanup.

f9ffee0

Java: Accept qltest change.

e302616

aschackmull force-pushed the shared/controlflow branch from f14ab6b to e302616 Compare September 12, 2025 13:42

aschackmull marked this pull request as ready for review September 12, 2025 13:45

michaelnebel reviewed Sep 15, 2025

Java: Add a change note, and a minor ql comment.

b308c54

github-actions bot added the documentation label Sep 15, 2025

Shared: Minor precision improvement.

be39c4c

michaelnebel reviewed Sep 15, 2025

Shared: Copy some qldoc from Guards.qll

acb4d9f

michaelnebel reviewed Sep 15, 2025

michaelnebel approved these changes Sep 16, 2025

aschackmull merged commit 57e15b9 into github:main Sep 16, 2025
42 of 43 checks passed

aschackmull deleted the shared/controlflow branch September 16, 2025 08:44

hvitved reviewed Sep 16, 2025

MathiasVP mentioned this pull request Sep 18, 2025

C++: Switch to the shared Guards library #20485

Merged
