Extend flow system to consider simple predicates to avoid FP on initialized variables (#3253)#3253
Extend flow system to consider simple predicates to avoid FP on initialized variables (#3253)#3253migeed-z wants to merge 1 commit intofacebook:mainfrom
Conversation
|
@migeed-z has exported this pull request. If you are a Meta employee, you can view the originating Diff in D101709050. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
…alized variables (facebook#3253) Summary: Pyrefly reports a false positive `unbound-name` error when a variable is defined inside `if a:` and used inside a subsequent `if a:` with the same condition. Since `a` is never modified, the variable is guaranteed to be initialized whenever `a` is truthy. This diff adds tracking to the flow system so we can keep track of this data. To start, we only handle predicates of the form `if a: ..`. However, I added some comments on where I think we can extend the infrastructure to more general predicates. Looking at the FP, it appears that adding support for even the basic predicates, decreases the FP by ~300. Differential Revision: D101709050
3c91f6d to
1ca6cbb
Compare
This comment has been minimized.
This comment has been minimized.
…alized variables (facebook#3253) Summary: Pyrefly reports a false positive `unbound-name` error when a variable is defined inside `if a:` and used inside a subsequent `if a:` with the same condition. Since `a` is never modified, the variable is guaranteed to be initialized whenever `a` is truthy. This diff adds tracking to the flow system so we can keep track of this data. To start, we only handle predicates of the form `if a: ..`. However, I added some comments on where I think we can extend the infrastructure to more general predicates. Looking at the FP, it appears that adding support for even the basic predicates, decreases the FP by ~300. Differential Revision: D101709050
1ca6cbb to
260fc86
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
…alized variables (facebook#3253) Summary: Pyrefly reports a false positive `unbound-name` error when a variable is defined inside `if a:` and used inside a subsequent `if a:` with the same condition. Since `a` is never modified, the variable is guaranteed to be initialized whenever `a` is truthy. This diff adds tracking to the flow system so we can keep track of this data. To start, we only handle predicates of the form `if a: ..`. However, I added some comments on where I think we can extend the infrastructure to more general predicates. Looking at the FP, it appears that adding support for even the basic predicates, decreases the FP by ~300. Differential Revision: D101709050
260fc86 to
18191ca
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
18191ca to
d9a1b17
Compare
…alized variables (facebook#3253) Summary: Pyrefly reports a false positive `unbound-name` error when a variable is defined inside `if a:` and used inside a subsequent `if a:` with the same condition. Since `a` is never modified, the variable is guaranteed to be initialized whenever `a` is truthy. This diff adds tracking to the flow system so we can keep track of this data. To start, we only handle predicates of the form `if a: ..`. However, I added some comments on where I think we can extend the infrastructure to more general predicates. Looking at the FP, it appears that adding support for even the basic predicates, decreases the FP by ~300. Differential Revision: D101709050
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
…alized variables (facebook#3253) Summary: Pyrefly reports a false positive `unbound-name` error when a variable is defined inside `if a:` and used inside a subsequent `if a:` with the same condition. Since `a` is never modified, the variable is guaranteed to be initialized whenever `a` is truthy. This diff adds tracking to the flow system so we can keep track of this data. To start, we only handle predicates of the form `if a: ..`. However, I added some comments on where I think we can extend the infrastructure to more general predicates. Looking at the FP, it appears that adding support for even the basic predicates, decreases the FP by ~300. Differential Revision: D101709050
d9a1b17 to
6371d33
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
…alized variables (facebook#3253) Summary: Pyrefly reports a false positive `unbound-name` error when a variable is defined inside `if a:` and used inside a subsequent `if a:` with the same condition. Since `a` is never modified, the variable is guaranteed to be initialized whenever `a` is truthy. This diff adds tracking to the flow system so we can keep track of this data. To start, we only handle predicates of the form `if a: ..`. However, I added some comments on where I think we can extend the infrastructure to more general predicates. Looking at the FP, it appears that adding support for even the basic predicates, decreases the FP by ~300. Differential Revision: D101709050
6371d33 to
58ca9d1
Compare
|
Diff from mypy_primer, showing the effect of this PR on open source code: openlibrary (https://github.com/internetarchive/openlibrary)
- ERROR openlibrary/plugins/upstream/borrow.py:180:28-39: `ia_itemname` may be uninitialized [unbound-name]
- ERROR openlibrary/plugins/upstream/borrow.py:180:47-54: `s3_keys` may be uninitialized [unbound-name]
cloud-init (https://github.com/canonical/cloud-init)
- ERROR tests/unittests/test_util.py:614:27-35: `confd_fn` may be uninitialized [unbound-name]
egglog-python (https://github.com/egraphs-good/egglog-python)
- ERROR python/egglog/egraph.py:1237:21-28: `egraphs` may be uninitialized [unbound-name]
- ERROR python/egglog/egraph.py:1244:17-24: `egraphs` may be uninitialized [unbound-name]
- ERROR python/egglog/egraph.py:1250:42-49: `egraphs` may be uninitialized [unbound-name]
rotki (https://github.com/rotki/rotki)
- ERROR rotkehlchen/db/drivers/gevent.py:211:117-127: `identifier` may be uninitialized [unbound-name]
- ERROR rotkehlchen/db/drivers/gevent.py:482:98-108: `identifier` may be uninitialized [unbound-name]
- ERROR rotkehlchen/tests/api/test_database.py:55:29-45: `backup2_contents` may be uninitialized [unbound-name]
- ERROR rotkehlchen/tests/api/test_database.py:57:29-45: `backup1_contents` may be uninitialized [unbound-name]
pytest (https://github.com/pytest-dev/pytest)
- ERROR src/_pytest/terminal.py:1266:17-26: `fullwidth` may be uninitialized [unbound-name]
- ERROR src/_pytest/terminal.py:1274:13-22: `fullwidth` may be uninitialized [unbound-name]
- ERROR src/_pytest/terminal.py:1281:13-22: `fullwidth` may be uninitialized [unbound-name]
- ERROR src/_pytest/terminal.py:1285:48-57: `fullwidth` may be uninitialized [unbound-name]
cryptography (https://github.com/pyca/cryptography)
- ERROR src/cryptography/hazmat/primitives/serialization/ssh.py:1091:25-34: `cert_body` may be uninitialized [unbound-name]
- ERROR src/cryptography/hazmat/primitives/serialization/ssh.py:1105:13-18: `nonce` may be uninitialized [unbound-name]
scikit-learn (https://github.com/scikit-learn/scikit-learn)
- ERROR sklearn/cluster/_agglomerative.py:388:13-22: `distances` may be uninitialized [unbound-name]
- ERROR sklearn/cluster/_agglomerative.py:422:35-44: `distances` may be uninitialized [unbound-name]
- ERROR sklearn/ensemble/_gb.py:922:28-40: `y_oob_masked` may be uninitialized [unbound-name]
- ERROR sklearn/ensemble/_gb.py:924:35-59: `sample_weight_oob_masked` may be uninitialized [unbound-name]
- ERROR sklearn/gaussian_process/_gpr.py:654:36-59: `log_likelihood_gradient` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/_coordinate_descent.py:182:34-40: `X_mean` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/_coordinate_descent.py:184:34-40: `X_mean` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/_huber.py:66:24-33: `intercept` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/_least_angle.py:670:17-22: `coefs` may be uninitialized [unbound-name]
+ ERROR sklearn/linear_model/_least_angle.py:670:33-37: `coef` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/_least_angle.py:835:26-31: `coefs` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/_least_angle.py:836:46-55: `prev_coef` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/_least_angle.py:841:36-42: `alphas` may be uninitialized [unbound-name]
+ ERROR sklearn/linear_model/_least_angle.py:907:17-22: `coefs` may be uninitialized [unbound-name]
+ ERROR sklearn/linear_model/_least_angle.py:915:35-39: `coef` may be uninitialized [unbound-name]
+ ERROR sklearn/linear_model/_least_angle.py:917:35-39: `coef` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/_omp.py:139:13-18: `coefs` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/_omp.py:147:43-48: `coefs` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/_omp.py:272:13-18: `coefs` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/_omp.py:285:43-48: `coefs` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/_ridge.py:277:26-28: `sw` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/_ridge.py:294:27-29: `sw` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/tests/test_base.py:834:41-50: `intercept` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/tests/test_base.py:866:41-52: `intercept_0` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/tests/test_coordinate_descent.py:1294:41-50: `intercept` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/tests/test_coordinate_descent.py:1326:41-52: `intercept_0` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/tests/test_coordinate_descent.py:1521:41-50: `intercept` may be uninitialized [unbound-name]
- ERROR sklearn/linear_model/tests/test_ridge.py:2415:41-50: `intercept` may be uninitialized [unbound-name]
- ERROR sklearn/model_selection/_validation.py:421:24-41: `train_scores_dict` may be uninitialized [unbound-name]
- ERROR sklearn/neighbors/_base.py:382:16-26: `neigh_dist` may be uninitialized [unbound-name]
- ERROR sklearn/preprocessing/_data.py:266:19-24: `mean_` may be uninitialized [unbound-name]
- ERROR sklearn/preprocessing/_data.py:283:45-51: `scale_` may be uninitialized [unbound-name]
pwndbg (https://github.com/pwndbg/pwndbg)
- ERROR pwndbg/aglib/nearpc.py:610:17-25: `pair_map` may be uninitialized [unbound-name]
- ERROR pwndbg/aglib/nearpc.py:610:27-34: `pair_id` may be uninitialized [unbound-name]
- ERROR pwndbg/aglib/nearpc.py:610:36-51: `maximum_pair_id` may be uninitialized [unbound-name]
setuptools (https://github.com/pypa/setuptools)
- ERROR setuptools/_vendor/autocommand/autoasync.py:140:43-50: `new_sig` may be uninitialized [unbound-name]
manticore (https://github.com/trailofbits/manticore)
- ERROR manticore/core/smtlib/constraints.py:206:59-76: `constant_bindings` may be uninitialized [unbound-name]
+ ERROR manticore/core/smtlib/constraints.py:213:25-42: `constant_bindings` may be uninitialized [unbound-name]
mongo-python-driver (https://github.com/mongodb/mongo-python-driver)
- ERROR pymongo/asynchronous/pool.py:292:51-56: `start` may be uninitialized [unbound-name]
- ERROR pymongo/message.py:1142:39-53: `ns_doc_encoded` may be uninitialized [unbound-name]
- ERROR pymongo/pyopenssl_context.py:131:56-61: `start` may be uninitialized [unbound-name]
- ERROR pymongo/synchronous/pool.py:292:51-56: `start` may be uninitialized [unbound-name]
stone (https://github.com/dropbox/stone)
- ERROR stone/backends/python_rsrc/stone_base.py:19:9-15: `typing` may be uninitialized [unbound-name]
websockets (https://github.com/aaugustin/websockets)
- ERROR src/websockets/asyncio/connection.py:1219:17-27: `exceptions` may be uninitialized [unbound-name]
- ERROR src/websockets/asyncio/connection.py:1235:17-27: `exceptions` may be uninitialized [unbound-name]
- ERROR src/websockets/frames.py:261:37-47: `mask_bytes` may be uninitialized [unbound-name]
- ERROR src/websockets/frames.py:327:42-52: `mask_bytes` may be uninitialized [unbound-name]
- ERROR src/websockets/legacy/framing.py:102:37-46: `mask_bits` may be uninitialized [unbound-name]
- ERROR src/websockets/legacy/protocol.py:1610:17-27: `exceptions` may be uninitialized [unbound-name]
- ERROR src/websockets/legacy/protocol.py:1623:17-27: `exceptions` may be uninitialized [unbound-name]
zulip (https://github.com/zulip/zulip)
- ERROR zerver/lib/import_realm.py:1219:19-25: `bucket` may be uninitialized [unbound-name]
- ERROR zerver/lib/import_realm.py:1280:17-41: `filename_to_has_original` may be uninitialized [unbound-name]
- ERROR zerver/lib/streams.py:1421:51-69: `default_stream_ids` may be uninitialized [unbound-name]
static-frame (https://github.com/static-frame/static-frame)
- ERROR static_frame/core/db_util.py:576:32-44: `query_create` may be uninitialized [unbound-name]
- ERROR static_frame/core/frame.py:9574:37-48: `index_names` may be uninitialized [unbound-name]
- ERROR static_frame/core/frame.py:9578:46-57: `index_depth` may be uninitialized [unbound-name]
- ERROR static_frame/core/frame.py:9580:36-49: `columns_names` may be uninitialized [unbound-name]
- ERROR static_frame/core/frame.py:9583:55-66: `index_depth` may be uninitialized [unbound-name]
- ERROR static_frame/core/frame.py:9586:35-46: `filter_func` may be uninitialized [unbound-name]
- ERROR static_frame/core/frame.py:9600:24-35: `index_depth` may be uninitialized [unbound-name]
- ERROR static_frame/core/frame.py:9601:39-51: `index_values` may be uninitialized [unbound-name]
- ERROR static_frame/core/frame.py:9603:43-54: `filter_func` may be uninitialized [unbound-name]
- ERROR static_frame/core/frame.py:9607:44-56: `index_values` may be uninitialized [unbound-name]
- ERROR static_frame/core/frame.py:9609:47-58: `filter_func` may be uninitialized [unbound-name]
- ERROR static_frame/core/type_blocks.py:175:48-57: `drop_mask` may be uninitialized [unbound-name]
- ERROR static_frame/core/type_blocks.py:180:45-54: `drop_mask` may be uninitialized [unbound-name]
- ERROR static_frame/core/type_blocks.py:271:48-57: `drop_mask` may be uninitialized [unbound-name]
- ERROR static_frame/core/type_blocks.py:276:45-54: `drop_mask` may be uninitialized [unbound-name]
- ERROR static_frame/core/util.py:1398:20-24: `rows` may be uninitialized [unbound-name]
- ERROR static_frame/core/util.py:1403:21-28: `columns` may be uninitialized [unbound-name]
- ERROR static_frame/core/util.py:1405:21-28: `columns` may be uninitialized [unbound-name]
- ERROR static_frame/core/util.py:1415:22-26: `rows` may be uninitialized [unbound-name]
- ERROR static_frame/core/util.py:1415:28-35: `columns` may be uninitialized [unbound-name]
- ERROR static_frame/core/util.py:1552:27-38: `is_not_none` may be uninitialized [unbound-name]
- ERROR static_frame/core/util.py:1561:20-31: `is_not_none` may be uninitialized [unbound-name]
- ERROR static_frame/core/util.py:2031:13-24: `values_post` may be uninitialized [unbound-name]
- ERROR static_frame/core/util.py:2065:9-20: `values_post` may be uninitialized [unbound-name]
- ERROR static_frame/core/util.py:3140:40-44: `cols` may be uninitialized [unbound-name]
- ERROR static_frame/core/util.py:3147:40-44: `cols` may be uninitialized [unbound-name]
- ERROR static_frame/core/util.py:3175:48-52: `cols` may be uninitialized [unbound-name]
- ERROR static_frame/core/util.py:3208:61-65: `cols` may be uninitialized [unbound-name]
pandas (https://github.com/pandas-dev/pandas)
- ERROR pandas/core/common.py:603:32-41: `old_value` may be uninitialized [unbound-name]
mypy (https://github.com/python/mypy)
- ERROR mypy/test/testpep561.py:159:23-30: `program` may be uninitialized [unbound-name]
- ERROR mypyc/irbuild/ll_builder.py:2787:37-48: `false_block` may be uninitialized [unbound-name]
meson (https://github.com/mesonbuild/meson)
- ERROR mesonbuild/backend/ninjabackend.py:717:20-52: `captured_compile_args_per_target` may be uninitialized [unbound-name]
- ERROR mesonbuild/dependencies/dub.py:319:38-45: `winlibs` may be uninitialized [unbound-name]
- ERROR mesonbuild/modules/java.py:75:35-52: `sanitized_package` may be uninitialized [unbound-name]
- ERROR unittests/allplatformstests.py:5465:27-29: `cc` may be uninitialized [unbound-name]
core (https://github.com/home-assistant/core)
- ERROR homeassistant/components/bayesian/config_flow.py:557:25-34: `sub_entry` may be uninitialized [unbound-name]
- ERROR homeassistant/components/recorder/util.py:126:49-60: `timer_start` may be uninitialized [unbound-name]
- ERROR homeassistant/components/tplink/config_flow.py:459:27-29: `un` may be uninitialized [unbound-name]
- ERROR homeassistant/components/tplink/config_flow.py:459:34-36: `pw` may be uninitialized [unbound-name]
- ERROR homeassistant/components/tplink/entity.py:678:45-64: `has_parent_entities` may be uninitialized [unbound-name]
- ERROR homeassistant/helpers/entity.py:1375:17-28: `update_warn` may be uninitialized [unbound-name]
paasta (https://github.com/yelp/paasta)
- ERROR paasta_tools/utils.py:2880:29-36: `service` may be uninitialized [unbound-name]
- ERROR paasta_tools/utils.py:2882:31-40: `component` may be uninitialized [unbound-name]
- ERROR paasta_tools/utils.py:2883:27-35: `loglevel` may be uninitialized [unbound-name]
- ERROR paasta_tools/utils.py:2884:29-36: `cluster` may be uninitialized [unbound-name]
- ERROR paasta_tools/utils.py:2885:30-38: `instance` may be uninitialized [unbound-name]
- ERROR paasta_tools/utils.py:2893:25-32: `service` may be uninitialized [unbound-name]
- ERROR paasta_tools/utils.py:2895:27-36: `component` may be uninitialized [unbound-name]
- ERROR paasta_tools/utils.py:2896:23-31: `loglevel` may be uninitialized [unbound-name]
- ERROR paasta_tools/utils.py:2897:25-32: `cluster` may be uninitialized [unbound-name]
- ERROR paasta_tools/utils.py:2898:26-34: `instance` may be uninitialized [unbound-name]
- ERROR paasta_tools/utils.py:2910:13-22: `proctimer` may be uninitialized [unbound-name]
xarray (https://github.com/pydata/xarray)
+ ERROR xarray/plot/facetgrid.py:261:13-17: `nrow` may be uninitialized [unbound-name]
+ ERROR xarray/plot/facetgrid.py:262:13-17: `ncol` may be uninitialized [unbound-name]
spack (https://github.com/spack/spack)
- ERROR lib/spack/spack/config.py:728:34-42: `comments` may be uninitialized [unbound-name]
kornia (https://github.com/kornia/kornia)
- ERROR kornia/feature/disk/disk.py:117:39-40: `h` may be uninitialized [unbound-name]
- ERROR kornia/feature/disk/disk.py:117:43-44: `w` may be uninitialized [unbound-name]
spark (https://github.com/apache/spark)
- ERROR python/pyspark/pandas/indexes/base.py:1597:57-69: `sequence_col` may be uninitialized [unbound-name]
- ERROR python/pyspark/pandas/tests/indexes/test_indexing_loc.py:263:32-38: `psser1` may be uninitialized [unbound-name]
- ERROR python/pyspark/pandas/tests/indexes/test_indexing_loc.py:263:40-45: `pser1` may be uninitialized [unbound-name]
- ERROR python/pyspark/pandas/tests/indexes/test_indexing_loc.py:264:32-38: `psser2` may be uninitialized [unbound-name]
- ERROR python/pyspark/pandas/tests/indexes/test_indexing_loc.py:264:40-45: `pser2` may be uninitialized [unbound-name]
- ERROR python/pyspark/pandas/tests/indexes/test_indexing_loc.py:332:32-38: `psser1` may be uninitialized [unbound-name]
- ERROR python/pyspark/pandas/tests/indexes/test_indexing_loc.py:332:40-45: `pser1` may be uninitialized [unbound-name]
- ERROR python/pyspark/pandas/tests/indexes/test_indexing_loc.py:333:32-38: `psser2` may be uninitialized [unbound-name]
- ERROR python/pyspark/pandas/tests/indexes/test_indexing_loc.py:333:40-45: `pser2` may be uninitialized [unbound-name]
+ ERROR python/pyspark/tests/upstream/pyarrow/test_pyarrow_array_cast.py:659:21-23: `np` may be uninitialized [unbound-name]
prefect (https://github.com/PrefectHQ/prefect)
- ERROR src/prefect/cli/deployment.py:750:48-70: `parsed_interval_anchor` may be uninitialized [unbound-name]
|
Primer Diff Classification✅ 24 improvement(s) | ➖ 1 neutral | 25 project(s) total | +8, -129 errors 24 improvement(s) across openlibrary, cloud-init, egglog-python, rotki, pytest, cryptography, scikit-learn, pwndbg, setuptools, mongo-python-driver, stone, websockets, zulip, static-frame, pandas, mypy, meson, core, paasta, xarray, spack, kornia, spark, prefect.
Detailed analysis✅ Improvement (24)openlibrary (-2)
cloud-init (-1)
egglog-python (-3)
rotki (-4)
pytest (-4)
cryptography (-2)
scikit-learn (+4, -28)
Looking at the code structure: at lines 592-603, The complication is the The 4 new errors are false positives because the code correctly ensures each variable is defined before use on every execution path. The net effect is strongly positive: 28 FPs removed vs 4 FPs added, for a net reduction of 24 false positives. Pyright also flags these 4 cases (4/4 per cross-check data), confirming this is a known hard pattern for static flow analysis.
pwndbg (-3)
setuptools (-1)
mongo-python-driver (-4)
stone (-1)
websockets (-7)
zulip (-3)
These changes directly eliminate the false positive static-frame (-28)
pandas (-1)
mypy (-2)
meson (-4)
Specifically:
This is a clear improvement in analysis precision, correctly handling the common Python pattern of guarding variable usage with the same condition under which the variable was defined.
core (-6)
This directly addresses the false positive pattern where paasta (-11)
xarray (+2)
spack (-1)
kornia (-2)
These are indeed false positives. The
spark (+1, -9)
Per-category reasoning:
prefect (-1)
➖ Neutral (1)manticore (+1, -1)
Looking at the code structure in
The old error was at line 206 (inside the for-loop's The PR's guard-tracking feature successfully resolves the false positive at line 206 by recognizing that the The net effect is one false positive removed (line 206) and one false positive remaining at a different location (line 213). The error count stays at 1. Both errors are the same class of false positive (unbound-name for
Was this helpful? React with 👍 or 👎 Classification by primer-classifier (25 LLM) |
Co-authored-by: Zeina Migeed <migeedz@meta.com>
Summary:
Pyrefly reports a false positive
unbound-nameerror when a variable isdefined inside
if a:and used inside a subsequentif a:with the samecondition. Since
ais never modified, the variable is guaranteed to beinitialized whenever
ais truthy.This diff adds tracking to the flow system so we can keep track of this data. To start, we only handle predicates of the form
if a: ... However, I added some comments on where I think we can extend the infrastructure to more general predicates.Looking at the FP, it appears that adding support for even the basic predicates, decreases the FP by ~300.
Differential Revision: D101709050