Skip to content

Conversation

@PlethoraChutney
Copy link
Contributor

Hello! Thank you for MDAnalysis! This PR changes how MRC files are checked before being loaded as a 3D grid. Rather than use the is_volume() method from mrcfile we instead use a simpler shape check.

The reason for this is that mrcfile returns False if the ISPG header field of a 3D MRC file is zero. While this is correct according to the spec, most cryoEM software packages save a 0 in this field for a variety of reasons. Because of this unwritten deviation from spec, most packages also read in files with a 0 space group as 3D Volumes anyway, but GridDataFormats does not.

Despite the fact that this is formally correct, I'm proposing that MDAnalysis should follow the behavior that users likely expect from their experience with other packages.

This PR fails three tests, but on my machine at least the current master branch also fails these same tests.

  • gridData/tests/test_grid.py::TestGrid::test_load_wrong_fileformat[ccp4]
  • gridData/tests/test_grid.py::TestGrid::test_load_wrong_fileformat[plt]
  • gridData/tests/test_grid.py::TestGrid::test_load_wrong_fileformat[dx]

@codecov
Copy link

codecov bot commented Jan 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.78%. Comparing base (abd0feb) to head (3bfd503).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #149      +/-   ##
==========================================
+ Coverage   87.70%   87.78%   +0.08%     
==========================================
  Files           5        5              
  Lines         748      753       +5     
  Branches       97       99       +2     
==========================================
+ Hits          656      661       +5     
  Misses         56       56              
  Partials       36       36              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@orbeckst
Copy link
Member

@PlethoraChutney thank you for your contribution and the insights. Could you please raise an issue that describes the issue — basically your justification for the fix? Separating solution from issues is good practice when we're discussing changes that may go against specs. Issues are our permanent record for decisions.

We can then link the PR to the issue.

Once you have the issue up, we can look closer at your proposed solution. Thank you!

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocked until we have discussed the approach.

@BradyAJohnston
Copy link
Member

It's related to this issue (BradyAJohnston/MolecularNodes#993) raised in MN. Downloading directly from the EMDB it would load fine, but if we round-trip through ChimeraX then it would fail to load with Grid() because of the points specified above.

Worth some proper discussion though!

@orbeckst
Copy link
Member

Thanks for the context — ideally we collect all of this in an issue and then we have one place that documents the decision and the reasoning, as opposed to potentially mixing it with implementation details etc.

@PlethoraChutney
Copy link
Contributor Author

Apologies! I'm relatively new to actually contributing to others' repos. Created #150 for discussion

@orbeckst
Copy link
Member

Many thanks @PlethoraChutney , #150 is perfect.

Different projects have different "cultures" on how they do development and when coming in you just don't know.

@PlethoraChutney
Copy link
Contributor Author

Tried to capture the discussion in #150

  • Grid() now takes a **kwargs parameter which is passed to all load functions. They all ignore it except MRC
  • MRC() takes load_mrc: None | bool. By default (None), mrcfile.is_volume() is used. If a bool is provided, it forces the file to be a volume (True) or a 2D image stack (False)
  • Added tests to cover this behavior

gridData/core.py Outdated
def __init__(self, grid=None, edges=None, origin=None, delta=None,
metadata=None, interpolation_spline_order=3,
file_format=None):
file_format=None, **kwargs):
Copy link
Member

@BradyAJohnston BradyAJohnston Jan 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would prefer an actual argument rather than **kwargs as the latter can easily (and silently) absorb spelling mistakes fiel_format='MRC would be silently no longer passed on instead of raising an error otherwise. I defer to @orbeckst on this but I think just mentioning in the docs that it is only used for mrc files (and maybe a warning when used for non-compatible format?).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right that **kwargs can silently swallow typos. On the other hand, tracking individual modifier keywords for different parsers can get annoying, at least in principle. In MDAnalysis we've typically been using the approach of named keywords for arguments that universally apply and kwargs for modifiers.

GDF really does not see a flurry of development activity so the maintainability argument that I am raising is not particularly strong and I do see the value of being explicit and more robust for users. So then let's follow your idea and make is_volume a top-level kwarg of Grid, document it there and in MRC (with a proper versionadded), and propagate it explicitly.

Copy link
Member

@BradyAJohnston BradyAJohnston Jan 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel strongly about it mostly because I am endlessly spending hours bug fixing when it was just me with simple spelling mistakes.

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for making the changes. Please see comments.

Please also

  • add an entry to CHANGELOG under Enhancements for 1.1.0
  • add yourself to AUTHORS
  • check that the docs render correctly (follow the link under the docs deployment check)

EDIT: see also discussion below to use assume_volumetric instead of is_volume.

@PlethoraChutney
Copy link
Contributor Author

I'll try to put this all together later today. I was thinking though, it feels odd to have Grid(filename, is_volume=False) be a more verbose raise ValueError. How would we feel about changing to coerce_volumetric: bool default False?

@BradyAJohnston
Copy link
Member

Maybe not coerce but assume_volumetric ? We aren't changing the data at all just changing our assumptions about it.

@orbeckst
Copy link
Member

We can use assume_volumetric=False as default (as before) and True to do the same as is_volume=True. (None will just do the same as False, which is fine)

@PlethoraChutney
Copy link
Contributor Author

OK, latest commit changed to an explicit assume_volumetric parameter, and docs look alright to me.

Copy link
Member

@orbeckst orbeckst left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very good — I just want to make changes to how we pass arguments from load() to _load_xxx() private methods.

@orbeckst orbeckst merged commit fe90a65 into MDAnalysis:master Jan 15, 2026
10 checks passed
@orbeckst
Copy link
Member

Thank you for your contribution @PlethoraChutney — nicely done.

Thank you for your great comments @BradyAJohnston !

orbeckst added a commit that referenced this pull request Jan 16, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

GridDataFormats refuses to load 3D MRC files with space group 0

3 participants