Skip to content

Conversation

@tennlee
Copy link
Collaborator

@tennlee tennlee commented Mar 27, 2025

Fix resultant circular import issues

I ran isort and black
Instead of importing from the API as defined in init.py, I imported things directly out of the fully-qualified namespace for the path they were actually declared in
I'm not sure if this was the best approach but it worked fine

@tennlee tennlee requested review from jennan and nikeethr March 27, 2025 08:40
@coveralls
Copy link

coveralls commented Mar 27, 2025

Pull Request Test Coverage Report for Build 14108149083

Details

  • 494 of 503 (98.21%) changed or added relevant lines in 174 files are covered.
  • 39 unchanged lines in 4 files lost coverage.
  • Overall coverage decreased (-0.2%) to 54.82%

Changes Missing Coverage Covered Lines Changed/Added Lines %
packages/data/src/pyearthtools/data/catalog.py 7 8 87.5%
packages/pipeline/src/pyearthtools/pipeline/controller.py 9 10 90.0%
packages/data/src/pyearthtools/data/download/ecmwf_opendata/opendata.py 6 8 75.0%
packages/pipeline/src/pyearthtools/pipeline/modifications/idx_modification.py 7 9 77.78%
packages/pipeline/src/pyearthtools/pipeline/parallel.py 4 7 57.14%
Files with Coverage Reduction New Missed Lines %
packages/data/src/pyearthtools/data/patterns/zarr.py 6 21.43%
packages/pipeline/src/pyearthtools/pipeline/operations/xarray/remapping/base.py 10 21.05%
packages/utils/src/pyearthtools/utils/config.py 10 62.12%
packages/pipeline/src/pyearthtools/pipeline/operations/xarray/remapping/healpix.py 13 9.45%
Totals Coverage Status
Change from base Build 14099993677: -0.2%
Covered Lines: 7311
Relevant Lines: 12881

💛 - Coveralls

@nikeethr
Copy link
Collaborator

not to do with this PR (directly), but what this auto_import() is doing is quite flakey in my view. I don't know if there is sufficient checks to enforce a certain structure on what's being imported. I'm just raising it because it is a candidate for conflict since anything dynamically imported this way is a black box with its own import tree that may refer to packages in this repo...

@nikeethr
Copy link
Collaborator

Is there anything in particular I should be looking at from a reviewer stand point? There are a lot of things changed so I'm having trouble deciphering which ones are solving circular import issues and which ones are just isort doing its formatting.

@tennlee
Copy link
Collaborator Author

tennlee commented Mar 27, 2025

Let me reflect a little, and I'll add some specific questions here and you can respond to those.

@tennlee
Copy link
Collaborator Author

tennlee commented Mar 27, 2025

I might be able to do some fancy footwork to separate out the automated formatting from the fixing...

@tennlee tennlee force-pushed the fix_data_circulars branch 2 times, most recently from c742b1d to 6dae2ff Compare March 27, 2025 10:22
@tennlee
Copy link
Collaborator Author

tennlee commented Mar 27, 2025

Okay, this commit isolates the actual fixing --> here . It may still be hard to follow and tackle more than one concern, but it at least separates out the reformatting commit.

@nikeethr
Copy link
Collaborator

@tennlee thanks for that - what you have done looks fine to me and makes sense.

import pyearthtools.data.logger # Initialise data logger

try:
DEFAULT_EXTENSION = pyearthtools.utils.config.get("data.patterns.default_extension")
Copy link
Collaborator

@nikeethr nikeethr Mar 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this was the existing code, but this can probably directly be called where its used (or using a helper function so that it is bound to the function during runtime rather than executed on import)

e.g.

# global default
DEFAULT_EXTENSION = ".nc"

def some_fn(..., ext=None):
    if ext is None:
        # if ext is not explicitly specified, try get from config, otherwise default to global
        xxx.config.get("xxx.default_extension", DEFAULT_EXTENSION)
    ...  # usual execution flow

Copy link
Collaborator

@nikeethr nikeethr Mar 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto everywhere else this happens in this PR (ideally the entire codebase eventually - but out of scope for this PR)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's a good idea. I think I will leave it be for this PR, but I'll convert this comment into a new issue. Thanks!

@nikeethr
Copy link
Collaborator

nikeethr commented Mar 27, 2025

I just had a few minor comments. Otherwise I think its fine as long as no functionality is broken. The following are just general observations and follow ups - not necessarily to do with evaluating this PR in particular.

As an aside probably from a place of ignorance since I'm not familiar with "exports" (i.e. publicly "visible" imports) on __init__.py. Going through https://peps.python.org/pep-0008/#public-and-internal-interfaces

It is quite apparent that the separating factor in terms of what should exist in __init__ is what goes in __all__ and conversely everything that doesn't go in __all__ should ideally adhere to the following recommendation.

Even with __all__ set appropriately, internal interfaces (packages, modules, classes, functions, attributes or other names) should still be prefixed with a single leading underscore.

As a final thought, circular imports are likely due to a sub-optimal structural design of the codebase - and while the work in this PR does a good step in fixing it, the monolithic nature of the import statements in the various modules isn't necessarily helpful in preventing this scenario from occurring again. If a public API needs to be re-used internally, the first question that comes to my mind is why isn't there an internal representation of this functionality instead? (after all a public API is just a candy wrapper. The goods - actual functional logic - can still be internal to the wrapper.)

The answer actually is that there IS internal representation - just that the nesting and repetition in naming the module imports make it less obvious. Trivial as it may seem, a leading underscore actually is a striking visible separation that allows someone looking at an internal module to decipher whether it is using something internal or (incorrectly) using the public representation.

As a follow up I think it would be prudent to restrict any internal only functions to recommended naming conventions, mentioned above as well as taking into consideration the following FAQ: https://docs.python.org/3/faq/programming.html#what-are-the-best-practices-for-using-import-in-a-module

@tennlee
Copy link
Collaborator Author

tennlee commented Apr 1, 2025

I'm going to re-do this as some smaller more modular pull requests, but duplicating the approach.

@tennlee tennlee closed this Apr 1, 2025
@tennlee tennlee deleted the fix_data_circulars branch April 1, 2025 10:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants