Avoid pinning solver variables too early when RHS is a union (#2839)
Open
migeed-z wants to merge 1 commit into facebook:main
migeed-z added a commit to migeed-z/pyrefly that referenced this pull request Mar 20, 2026
…k#2839)
Summary: During constraint resolution, when solving a constraint of the form `Quantified(AnyStr) <: x | None`, we expand the union and end up pinning `str` to `x`, which causes a false positive since it pins the type var. This diff works around the issue by skipping the subset check that pins the type var. Instead, we directly check `Quantified(AnyStr) <: x`.
RFC: I suspect this is related to typevar pinning and that fixing that is the right solution, so I am not sure if we should be adding a workaround.
For issue facebook#2644
Differential Revision: D97522732
4438d5e to d3919c8
d3919c8 to fa556dd
fa556dd to d723ceb
d723ceb to ff4306d
…k#2839)
Summary: Pull Request resolved: facebook#2839
During constraint resolution, when solving a constraint of the form `Quantified(AnyStr) <: x | None`, we expand the union and end up pinning `str` to `x`, which causes a false positive since it pins the type var. This diff works around the issue by skipping the subset check that pins the type var. Instead, we directly check `Quantified(AnyStr) <: x`.
RFC: I suspect this is related to typevar pinning and that fixing that is the right solution, so I am not sure if we should be adding a workaround.
For issue facebook#2644
Differential Revision: D97522732
ff4306d to 8839f09
According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅
migeed-z added a commit to migeed-z/pyrefly that referenced this pull request Mar 21, 2026
…nd cross-project consistency
Summary: The primer classifier has been producing inconsistent results across runs: the same primer diff can be classified as 'improvement' in one run and 'regression' in another. This was observed on real PRs like facebook#2839 (altair TypeVar iterability) and facebook#2764 (overload resolution, 60+ projects).
Three changes to improve reliability:
1. **Self-critique pass (Pass 1.5)**: After Pass 1 produces reasoning, a new pass checks it for factual errors, e.g., claiming dicts are not iterable, incorrect inheritance claims, wrong TypeVar constraint analysis. This catches hallucinations before they reach the verdict pass. Tested on PR facebook#2839, where it correctly identified that both constraints of `_C` (list and TypedDict) are iterable.
2. **Majority voting on verdict (Pass 2)**: Instead of a single verdict call, makes 5 independent calls and takes the majority. This reduces non-determinism where the same reasoning could be classified either way. Vote distribution is logged for transparency.
3. **Cross-project consistency enforcement**: After classifying all projects independently, groups them by error kind and enforces the majority verdict within each group. This prevents the classifier from saying 'overload resolution improved' for one project and 'overload resolution regressed' for another with the same pattern.
Also upgrades the default Anthropic model from claude-opus-4-20250514 to claude-opus-4-6 for better Pass 1 reasoning quality.
Differential Revision: D97571454
meta-codesync bot pushed a commit that referenced this pull request Mar 22, 2026
…nd cross-project consistency (#2841)
Summary: Pull Request resolved: #2841
The primer classifier has been producing inconsistent results across runs: the same primer diff can be classified as 'improvement' in one run and 'regression' in another. This was observed on real PRs like #2839 (altair TypeVar iterability) and #2764 (overload resolution, 60+ projects).
Three changes to improve reliability:
1. **Self-critique pass (Pass 1.5)**: After Pass 1 produces reasoning, a new pass checks it for factual errors, e.g., claiming dicts are not iterable, incorrect inheritance claims, wrong TypeVar constraint analysis. This catches hallucinations before they reach the verdict pass. Tested on PR #2839, where it correctly identified that both constraints of `_C` (list and TypedDict) are iterable.
2. **Majority voting on verdict (Pass 2)**: Instead of a single verdict call, makes 5 independent calls and takes the majority. This reduces non-determinism where the same reasoning could be classified either way. Vote distribution is logged for transparency.
3. **Cross-project consistency enforcement**: After classifying all projects independently, groups them by error kind and enforces the majority verdict within each group. This prevents the classifier from saying 'overload resolution improved' for one project and 'overload resolution regressed' for another with the same pattern.
Also upgrades the default Anthropic model from claude-opus-4-20250514 to claude-opus-4-6 for better Pass 1 reasoning quality. According to Gemini, this is a big upgrade :) so I am hoping to see an improvement in quality.
Reviewed By: yangdanny97
Differential Revision: D97571454
fbshipit-source-id: 356f4b150e0c4886c2743abc17699e004da997f1
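The majority-voting and cross-project consistency steps described in this commit message can be sketched roughly as follows. This is only an illustration of the idea, not the classifier's actual code: the function names `majority_verdict` and `enforce_consistency`, the verdict labels, and the shape of the input dictionary are all hypothetical, and ties are broken by first-seen order here, which the commit message does not specify.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def majority_verdict(votes: List[str]) -> Tuple[str, Dict[str, int]]:
    """Return the most common verdict plus the full vote distribution.

    Hypothetical helper: ties fall back to first-seen order.
    """
    counts = Counter(votes)
    verdict, _ = counts.most_common(1)[0]
    return verdict, dict(counts)


def enforce_consistency(
    project_verdicts: Dict[str, Tuple[str, str]],
) -> Dict[str, Tuple[str, str]]:
    """Group projects by error kind and apply the group's majority verdict.

    `project_verdicts` maps project name -> (error_kind, verdict).
    """
    by_kind: Dict[str, List[str]] = defaultdict(list)
    for kind, verdict in project_verdicts.values():
        by_kind[kind].append(verdict)

    # One majority verdict per error kind.
    group_majority = {
        kind: majority_verdict(verdicts)[0] for kind, verdicts in by_kind.items()
    }
    return {
        project: (kind, group_majority[kind])
        for project, (kind, _) in project_verdicts.items()
    }


# Five independent verdict calls for one project (made-up votes).
verdict, distribution = majority_verdict(
    ["improvement", "regression", "improvement", "improvement", "regression"]
)

# Project "c" disagrees with the other "overload" projects and is overridden.
consistent = enforce_consistency(
    {
        "a": ("overload", "improvement"),
        "b": ("overload", "improvement"),
        "c": ("overload", "regression"),
        "d": ("typevar", "regression"),
    }
)
```

Returning the distribution alongside the verdict mirrors the note that the vote distribution is logged; a real implementation would presumably also need an explicit policy for ties.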
Summary:
During constraint resolution, when solving a constraint of the form `Quantified(AnyStr) <: x | None`, we expand the union and end up pinning `str` to `x`, which causes a false positive since it pins the type var.
This diff works around the issue by skipping the subset check that pins the type var. Instead, we directly check `Quantified(AnyStr) <: x`.
RFC: I suspect this is related to typevar pinning and that fixing that is the right solution, so I am not sure if we should be adding a workaround.
For issue #2644
Differential Revision: D97522732
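To make the pinning problem concrete, here is a toy Python model of the two strategies. It is not pyrefly's solver (which is written in Rust), and every name in it (`Quantified`, `Var`, `is_subtype`, `check_eager`, `check_direct`) is hypothetical. It also assumes one particular reading of the summary: that the eager path ends up checking the constraints of `AnyStr` (`str`, `bytes`) one at a time against the union members, so the first check pins `x` to `str` and the second then fails.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union


@dataclass(frozen=True)
class Quantified:
    """A constrained type variable such as AnyStr (toy stand-in)."""
    name: str
    constraints: Tuple[str, ...]


@dataclass
class Var:
    """A solver variable; `pinned` stays None until a type is committed."""
    name: str
    pinned: Optional[Union[str, Quantified]] = None


Member = Union[str, Var]  # one member of the right-hand-side union


def is_subtype(lhs: Union[str, Quantified], rhs: Member) -> bool:
    """Toy subset check: an unpinned Var accepts anything and pins itself."""
    if isinstance(rhs, Var):
        if rhs.pinned is None:
            rhs.pinned = lhs  # the pin happens as a side effect
            return True
        return lhs == rhs.pinned
    return lhs == rhs


def check_eager(lhs: Quantified, rhs_members: List[Member]) -> bool:
    """Check each constraint of `lhs` separately against the union members."""
    return all(
        any(is_subtype(c, m) for m in rhs_members) for c in lhs.constraints
    )


def check_direct(lhs: Quantified, rhs_members: List[Member]) -> bool:
    """Check the quantified type itself against each union member."""
    return any(is_subtype(lhs, m) for m in rhs_members)


ANYSTR = Quantified("AnyStr", ("str", "bytes"))

x = Var("x")
eager_ok = check_eager(ANYSTR, [x, "None"])    # x gets pinned to "str" too early

y = Var("y")
direct_ok = check_direct(ANYSTR, [y, "None"])  # y gets pinned to AnyStr itself
```

In this model the eager check rejects the constraint because `bytes` no longer fits once `x` is `str`, while the direct check accepts it. Whether that matches the real failure mode depends on how the actual solver expands the union, which is exactly what the RFC question above is asking about.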