wwinp files: Fix MemoryError in WeightWindowsList.export_to_hdf5 and speed up from_wwinp. Alternative Approach #3951

Open
yrrepy wants to merge 5 commits into openmc-dev:develop from yrrepy:wwinp_2GB_faster

@yrrepy yrrepy commented May 30, 2026

Closed #3942 in favor of this alternative approach, which switches to the per-mesh Mesh.to_hdf5 design (each mesh subclass serializes itself, mirroring the existing to_xml_element super-call pattern) instead of a central dispatch helper.

This PR:

  1. Fixes the failure of openmc.WeightWindowsList.from_wwinp('wwinp') when processing wwinp files larger than ~2 GB.
  2. Speeds up WeightWindowsList processing, which becomes slow with multi-GB wwinp files.

Together, these changes enable support for multi-GB wwinp files and faster processing of them.

Checklist

  • I have performed a self-review of my own code
  • I have run clang-format (version 18) on any C++ source files (if applicable)
  • I have followed the style guidelines for Python source files (if applicable)
  • I have made corresponding changes to the documentation (if applicable)
  • I have added tests that prove my fix is effective or that my feature works (if applicable)

yrrepy added 5 commits May 22, 2026 14:02
Float/complex ndarrays are dtype-validated, so the per-element
isinstance() scan is redundant. Also construct upper_ww_bounds in
WeightWindows.__init__ as an ndarray multiplication (not a list
comprehension) so the upper-bounds setter benefits too. ~11x speedup
on 172M-element wwinp inputs (397 s -> 35 s).
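
The two changes described in this commit message can be sketched as follows. This is a minimal illustration, not the PR's code: the function names `scan_elements` and `trust_dtype`, the array shape, and the ratio value are all hypothetical.

```python
import numpy as np
from numbers import Real


def scan_elements(values):
    # Per-element validation: one isinstance() call per entry.
    return all(isinstance(v, Real) for v in np.asarray(values).flat)


def trust_dtype(values):
    # dtype-based validation: a single check covers the whole array.
    return isinstance(values, np.ndarray) and values.dtype.kind in 'fc'


lower = np.full((4, 3, 2, 5), 0.1)  # hypothetical lower-bounds array
ratio = 5.0                         # hypothetical upper/lower ratio

upper_list = [ratio * lb for lb in lower.flat]  # list-comprehension style
upper_array = ratio * lower                     # single ndarray multiplication
```

For a float array both checks agree, but the dtype check does not grow with the number of elements, and the ndarray product keeps the result as an ndarray so later validation can take the same shortcut.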

The XML serialization raised MemoryError on bound arrays >~200M
elements -- lxml's intermediate ASCII allocation fails before the
text node can be built. Write HDF5 directly via h5py, mirroring
the C++ WeightWindows::to_hdf5 writer.
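
A rough back-of-the-envelope estimate shows why a text encoding is so much larger than the binary array. The figure of 24 characters per value is an assumption for illustration (about 17 significant digits plus sign, exponent, and separator); the actual intermediate allocation inside lxml is not described in the source.

```python
n_elements = 200_000_000  # approximate threshold quoted in the commit message
bytes_per_double = 8      # binary size of one float64
chars_per_value = 24      # ASSUMPTION: typical length of one value as ASCII text

binary_gib = n_elements * bytes_per_double / 2**30
text_gib = n_elements * chars_per_value / 2**30

print(f"binary: {binary_gib:.2f} GiB, text: {text_gib:.2f} GiB")
```

Under these assumptions the text form is about three times the size of the binary data, and it has to exist as one contiguous string before it can become an XML text node.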

Critical details for C++ compatibility:
- Bounds are 2D (ne, n_voxels) on disk (4D would segfault the
  C++ tensor::Tensor<double> reader).
- max_lower_bound_ratio is written unconditionally (default 1.0).
- Root attrs filetype and version are required by
  openmc_weight_windows_import.
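
The layout constraints listed above might look like this in h5py. This is a hedged sketch only: the group name, the attribute values, and the array sizes are assumptions for illustration and are not taken from the PR; only the dataset idea (2D bounds, an always-present max_lower_bound_ratio, root filetype and version attributes) comes from the commit message.

```python
import h5py
import numpy as np

ne, n_voxels = 3, 8                   # hypothetical sizes
lower = np.full((ne, n_voxels), 0.5)  # bounds already 2D: (ne, n_voxels)
upper = 5.0 * lower

# In-memory file so the sketch leaves nothing on disk.
with h5py.File("ww_sketch.h5", "w", driver="core", backing_store=False) as f:
    f.attrs["filetype"] = np.bytes_("weight_windows")  # ASSUMED value
    f.attrs["version"] = [1, 0]                        # ASSUMED value
    grp = f.create_group("weight_windows 1")           # hypothetical group name
    grp.create_dataset("lower_ww_bounds", data=lower)
    grp.create_dataset("upper_ww_bounds", data=upper)
    # Written unconditionally, using the default of 1.0.
    grp.create_dataset("max_lower_bound_ratio", data=1.0)
    stored_shape = grp["lower_ww_bounds"].shape
    has_ratio = "max_lower_bound_ratio" in grp
```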

Adds Mesh.to_hdf5 on each structured mesh subclass, mirroring the
existing Mesh.to_xml_element pattern. UnstructuredMesh raises
NotImplementedError (wwinp cannot produce one).

The dtype-trust fast path returned for any float/complex ndarray of
matching depth, even when expected_type was int or another class --
the docstring promised element-type validation but the fast path
skipped it. Gate the fast path on expected_type in (Real, float,
complex) so it only fires when dtype.kind in 'fc' actually satisfies
the contract.
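
The bug and its fix can be reproduced in a few lines. The helper names below are hypothetical, and the depth check from the real condition is omitted to keep the sketch short.

```python
import numpy as np
from numbers import Integral, Real


def fast_path_ungated(value, expected_type):
    # Original behaviour: trusts the dtype regardless of expected_type.
    return isinstance(value, np.ndarray) and value.dtype.kind in 'fc'


def fast_path_gated(value, expected_type):
    # Fixed behaviour: only fires when a float/complex dtype can satisfy
    # the requested type.
    return (isinstance(value, np.ndarray) and value.dtype.kind in 'fc'
            and expected_type in (Real, float, complex))


arr = np.array([1.5, 2.5])
```

With the ungated version, a float array passes even when the caller asked for integers; the gated version falls through to the regular element-type validation in that case.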

The direct-h5py writer cannot serialize an UnstructuredMesh from pure
Python: vertex and connectivity data live in the external .exo/.h5m
file and only exist in memory after LibMesh/MOAB loads them via
openmc.lib.init. Dispatch on mesh type up front: structured meshes
take the new fast path; UnstructuredMesh falls back to the previous
TemporarySession + openmc.lib.export_weight_windows route, which also
restores honoring of init_kwargs on that path.

Removes the dead NotImplementedError branch from _write_mesh_group.

Comment thread openmc/checkvalue.py
Comment on lines +93 to +96
    if (isinstance(value, np.ndarray) and value.dtype.kind in 'fc'
            and min_depth <= value.ndim <= max_depth
            and expected_type in (Real, float, complex)):
        return

Contributor

I think you should separate the complex and float cases, because if someone expects a Real iterable and gets a NumPy array with a complex dtype, it should raise an error.

Or maybe we should get rid of the complex case altogether. I do not know of a place in OpenMC where we use complex numbers. Do you know of a use case?
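
One possible reading of this suggestion is to map each expected type to the dtype kinds that actually satisfy it. The mapping and the helper name below are hypothetical, and whether a float array should also be accepted for `complex` is a design decision this sketch does not settle.

```python
import numpy as np
from numbers import Real

# Hypothetical mapping from expected_type to acceptable dtype kinds.
_ACCEPTED_KINDS = {Real: 'f', float: 'f', complex: 'c'}


def dtype_satisfies(value, expected_type):
    kinds = _ACCEPTED_KINDS.get(expected_type)
    return (kinds is not None and isinstance(value, np.ndarray)
            and value.dtype.kind in kinds)


real_arr = np.array([1.0, 2.0])
complex_arr = np.array([1.0 + 2.0j])
```

Here a complex array no longer passes when a Real iterable is expected, which is the failure mode the comment describes.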

Comment thread openmc/mesh.py
        else:
            raise ValueError('Unrecognized mesh type: "' + mesh_type + '"')

    def to_hdf5(self, group: h5py.Group) -> h5py.Group:

Contributor

This should be declared as an abstractmethod. That way it is clear that each mesh type should implement this method.
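
A minimal sketch of what declaring the method abstract buys, using stand-in classes rather than the real OpenMC mesh hierarchy (all class names here are hypothetical, and a plain dict stands in for an h5py group):

```python
from abc import ABC, abstractmethod


class BaseMesh(ABC):  # hypothetical stand-in for the mesh base class
    @abstractmethod
    def to_hdf5(self, group):
        """Write this mesh into ``group`` and return it."""


class CompleteMesh(BaseMesh):
    def to_hdf5(self, group):
        group["type"] = "regular"
        return group


class IncompleteMesh(BaseMesh):
    pass  # forgot to implement to_hdf5


try:
    IncompleteMesh()
    error = None
except TypeError as exc:
    # Instantiating a subclass that lacks to_hdf5 fails immediately.
    error = exc
```

The missing implementation is reported when the object is created, rather than later when an export is attempted.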

Comment thread openmc/mesh.py
        return mesh

    def to_hdf5(self, group: h5py.Group):
        # Raise before super() so no half-built 'mesh <id>' group is left on disk.

Contributor

There is no super() call afterwards, so I think this comment is not needed.

Comment thread openmc/weight_windows.py
Comment on lines +1084 to +1095
        # Any unstructured mesh forces the whole list onto the lib fallback.
        if any(isinstance(ww.mesh, UnstructuredMesh) for ww in self):
            import openmc.lib
            model = openmc.Model()
            sph = openmc.Sphere(boundary_type='vacuum')
            model.geometry = openmc.Geometry([openmc.Cell(region=-sph)])
            model.settings.weight_windows = self
            model.settings.particles = 100
            model.settings.batches = 1
            with openmc.lib.TemporarySession(model, **init_kwargs):
                openmc.lib.export_weight_windows(path)
            return

Contributor

I think a cleaner solution would be to implement UnstructuredMesh.to_hdf5 using openmc.lib.export_weight_windows under the hood.
That should simplify the logic here.

What do you think?
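
The structural effect of this proposal can be sketched with stand-in classes. Everything below is hypothetical: plain dicts stand in for HDF5 groups, and the unstructured stub only marks where a library-backed fallback would go; how openmc.lib.export_weight_windows would actually be wired in is not shown in the source.

```python
class StructuredMeshStub:
    def to_hdf5(self, group):
        group["written_by"] = "direct writer"


class UnstructuredMeshStub:
    def to_hdf5(self, group):
        # Placeholder for a fallback that goes through the compiled library.
        group["written_by"] = "lib fallback"


def export_all(meshes):
    # No isinstance() dispatch: every mesh knows how to write itself.
    out = {}
    for i, mesh in enumerate(meshes):
        mesh.to_hdf5(out.setdefault(f"mesh {i}", {}))
    return out


result = export_all([StructuredMeshStub(), UnstructuredMeshStub()])
```

With the fallback hidden inside the mesh class, the exporting loop stays the same for every mesh type.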


2 participants