Skip to content

close #923 support pkgutil-style namespace packages#2778

Open
mdbernard wants to merge 1 commit intofacebook:mainfrom
mdbernard:pkgutil-namespace-packages
Open

close #923 support pkgutil-style namespace packages#2778
mdbernard wants to merge 1 commit intofacebook:mainfrom
mdbernard:pkgutil-namespace-packages

Conversation

@mdbernard
Copy link

@mdbernard mdbernard commented Mar 11, 2026

Summary

This change adds support for legacy pkgutil-style namespace packages, including those using pip's compat editable install mode. This is already supported by popular language servers like Jedi and Pyright, and lack of support in Pyrefly makes developers of those repos reluctant to switch despite Pyrefly's much faster speed and lower resource utilization.

Notably, Mypy does not support pkgutil-style packages (see python/mypy#9393), requiring developers of these types of repos to configure massive MYPYPATH environment variables to include all the dependency folders, so this change makes Pyrefly not only a great language server for these typs of repos, but also makes it work out-of-the-box as a type-checker.

Two similar, but distinct, cases are resolved in this change. Each is demonstrated below.

Case 1: RegularPackage defined across multiple roots

  1. Clone the azure-sdk-for-python repo.
  2. Set up a new virtual environment (I used CPython 3.14).
  3. Run the following commands to set up an editable install for an arbitrary project in that repo which imports from azure-core, and also editably install azure-core:
cd sdk/keyvault/azure-keyvault-secrets
pip install -r dev_requirements.txt -e . --config-settings editable-mode=compat
cd ../../..
cd sdk/core/azure-core
pip install -r dev_requirements.txt -e . --config-settings editable-mode=compat

In your virtual environment's site-packages folder, you should see the following:

__editable__.azure_keyvault_secrets-4.10.1.pth
__editable__.azure_core-1.38.4.pth
  1. Configure your editor to use the latest Pyrefly release as its language server.
  2. Open sdk/keyvault/azure-keyvault-secrets/azure/keyvault/secrets/_client.py.

In this file, imports from azure.core are Unknown, despite azure.core being installed:

image

The problem here is that when searching for the azure package across the search roots, the find_one_part function short-circuits after finding a RegularPackage named azure. It only returns the first root found--in this case the one containing keyvault, discarding azure.core since it is defined in a different root :

Some(result) => return Some((result, roots.cloned().collect::<Vec<_>>())),
.

The solution here is to update the FindResult::RegularPackage variant to be able to contain multiple root directory paths. We update find_one_part to no longer short-circuit, instead accumulating roots for RegularPackages, similar to what we do for NamespacePackages. Finally we update the callers of find_one_part to search over all roots until a best match is found.

Once compiled, we can verify in our editor that our LSP can find the file:

image

Case 2: RegularPackage and NamespacePackage share a namespace

The setup for this case is similar to Case 1, with the exception that we do not editably install azure-core.

  1. Clone the azure-sdk-for-python repo.
  2. Set up a new virtual environment (I used CPython 3.14).
  3. Run the following commands to set up an editable install for an arbitrary project in that repo which imports from azure-core:
cd sdk/keyvault/azure-keyvault-secrets
pip install -r dev_requirements.txt -e . --config-settings editable-mode=compat

In your virtual environment, you should see a structure like the following:

__editable__.azure_keyvault_secrets-4.10.1.pth
azure/
    core/
        ...  # installed files from this package
    keyvault/
        __init__.py  # nothing else, editably installed
  1. Configure your editor to use the latest Pyrefly release as its language server.
  2. Open sdk/keyvault/azure-keyvault-secrets/azure/keyvault/secrets/_client.py.

Now we have the same result as Case 1: imports from azure.core are Unknown, but this time for a different reason. In this situation, azure.core is considered to be a NamespacePackage by find_one_part, not a RegularPackage as it was in Case 1. As a result, even with our change in Case 1, azure.core isn't accumulated, so it is never searched. The solution here is to append any NamespacePackage accumulations to the end of the RegularPackage accumulations before returning from find_one_part.

Once compiled, we see that the imports can now be found:

image

Test Plan

  • finder.rs unit tests and functional evidence screenshots from IDE

@meta-cla
Copy link

meta-cla bot commented Mar 11, 2026

Hi @mdbernard!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!

@mdbernard mdbernard force-pushed the pkgutil-namespace-packages branch from ea7c502 to f2d98bc Compare March 11, 2026 05:11
@mdbernard mdbernard changed the title resolve symbols defined across pkgutil-style namespace packages close #923 support pkgutil-style namespace packages Mar 11, 2026
@meta-cla
Copy link

meta-cla bot commented Mar 11, 2026

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@meta-cla meta-cla bot added the cla signed label Mar 11, 2026
@mdbernard mdbernard force-pushed the pkgutil-namespace-packages branch from f2d98bc to e8326ed Compare March 14, 2026 03:45
@mdbernard mdbernard changed the title close #923 support pkgutil-style namespace packages close facebook#923 support pkgutil-style namespace packages Mar 14, 2026
@mdbernard mdbernard changed the title close facebook#923 support pkgutil-style namespace packages close #923 support pkgutil-style namespace packages Mar 14, 2026
@mdbernard mdbernard force-pushed the pkgutil-namespace-packages branch from e8326ed to f3b37ca Compare March 16, 2026 07:54
@mdbernard mdbernard force-pushed the pkgutil-namespace-packages branch from f3b37ca to 16a628e Compare March 16, 2026 08:32
@mdbernard mdbernard marked this pull request as ready for review March 16, 2026 09:09
Copy link
Contributor

@connernilsen connernilsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @mdbernard, thanks for doing this. I think we may still be discussing if/how we want to implement this, but I have a few questions I'm curious about. I haven't looked into this in a while, so my memory on how parts of this works are a little hazy. Let me know if any of my questions are worded poorly or don't represent what you're trying to achieve.

Comment on lines +44 to +47
/// root of the first field. To support legacy pkgutil.extend_path-style
/// namespace packages, the second field also contains the remaining roots
/// to search for matches if the first does not contain a match.
RegularPackage(PathBuf, Vec1<PathBuf>),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're adding support for namespace package extend path functionality, why is that happening in RegularPackage? It seems like that would be more appropriately added to NamespacePackage.

The reason I ask is that by design, the first result for a RegularPackage needs to take priority over all other results. Anything we can find afterward breaks the design of how a RegularPackage works in Python, and can cause Pyrefly to fail to error in some cases.

We can make exceptions to this rule, but I think I need to better understand this approach before we can decide on that. It might be that another approach we might want to take would be to make a new LegacyNamespacePackage(PathBuf, Vec1<PathBuf>), which has some of the information of a RegularPackage, but is treated as a NamespacePackage instead.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review!

My understanding here is that NamespacePackage is a shorthand for "implicit namespace package" defined in PEP 420. In the case of pkgutil-style packages, the namespaces aren't implicit, since they do use __init__.py files. Additionally, once pkgutil-style packages are installed, they look identical to regular packages in the site-packages folder.

For example, this pkgutil structure:

foo/
    foobar/
        __init__.py  # contains pkgutil.extend_path
        foo.py
bar/
    foobar/
        __init__.py  # contains pkgutil.extend_path
        bar.py

Ends up looking like this in site-packages when it's installed:

foobar/
    __init__.py
    foo.py
    bar.py

For both of these reasons, I thought it made more sense to treat these as RegularPackages instead of NamespacePackages, but I'm open to your suggestion about LegacyNamespacePackage being a distinct enum variant. Two questions though:

[...] the first result for a RegularPackage needs to take priority over all other results. Anything we can find afterward breaks the design of how a RegularPackage works in Python, and can cause Pyrefly to fail to error in some cases.

Do you happen to have an example of how this implementation fails? It's a bit unclear to me what scenario you're concerned about. I'd be happy to add more unit tests. In this implementation, we search for matches in all the roots, and then return the match with the highest priority (e.g. .pyi > .py > .pyc) (see the new test_namespace_package_submodule_prefers_pyi_over_py unit test for an example). This can be pretty easily changed to just return the match from the first root, regardless of that priority order though.

[...] we might want to take would be to make a new LegacyNamespacePackage(PathBuf, Vec1<PathBuf>), which has some of the information of a RegularPackage, but is treated as a NamespacePackage instead.

Currently, pkgutil-style packages are treated as RegularPackages by Pyrefly, since they look the same based on what the finder is looking for (dir name and __init__.py file presence). Do you have an idea of how you'd prefer this be done differently?

Some(FindResult::RegularPackage(_, existing_roots)) => {
// These __init__ files are lower-priority, so just accumulate the
// new roots after the existing roots in case we need to look there.
existing_roots.push(init_roots.into_vec().remove(0));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to remove the first entry in init_roots?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first one is the folder containing the init_path, so we don't want to search that one again.

@mdbernard
Copy link
Author

Thanks for the review @connernilsen, I've replied to your feedback. I'd also be happy to jump on Discord to chat through it if that's easier!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants