wwinp files: Fix MemoryError in WeightWindowsList.export_to_hdf5 and speed up from_wwinp. Alternative Approach #3951
yrrepy wants to merge 5 commits into
Float/complex ndarrays are dtype-validated, so the per-element isinstance() scan is redundant. Also construct upper_ww_bounds in WeightWindows.__init__ as an ndarray multiplication (not a list comprehension) so the upper-bounds setter benefits too. ~11x speedup on 172M-element wwinp inputs (397 s -> 35 s).
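As a minimal sketch of the upper-bounds change described above (the values and variable names here are illustrative, not taken from the PR), multiplying the ndarray directly gives the same numbers as the list comprehension while keeping the result a float ndarray:

```python
import numpy as np

# Hypothetical values for illustration only.
lower_ww_bounds = np.array([0.5, 0.25, -1.0, 0.125])
upper_bound_ratio = 5.0

# List comprehension: one Python-level multiplication per element,
# producing a plain list that later has to be scanned element by element.
upper_slow = [lb * upper_bound_ratio for lb in lower_ww_bounds]

# ndarray multiplication: a single vectorized operation whose result is
# already a float ndarray, so a dtype-based check can accept it directly.
upper_fast = lower_ww_bounds * upper_bound_ratio

assert isinstance(upper_fast, np.ndarray)
assert upper_fast.dtype.kind == 'f'
assert np.array_equal(upper_fast, np.array(upper_slow))
```

The speedup quoted in the commit message comes from the PR's own measurements; this snippet only shows that the two forms are numerically equivalent.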
The XML serialization raised MemoryError on bound arrays >~200M elements -- lxml's intermediate ASCII allocation fails before the text node can be built. Write HDF5 directly via h5py, mirroring the C++ WeightWindows::to_hdf5 writer.

Critical details for C++ compatibility:
- Bounds are 2D (ne, n_voxels) on disk (4D would segfault the C++ tensor::Tensor<double> reader).
- max_lower_bound_ratio is written unconditionally (default 1.0).
- Root attrs filetype and version are required by openmc_weight_windows_import.

Adds Mesh.to_hdf5 on each structured mesh subclass, mirroring the existing Mesh.to_xml_element pattern. UnstructuredMesh raises NotImplementedError (wwinp cannot produce one).
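To illustrate the 2D (ne, n_voxels) layout mentioned above, here is a rough sketch of collapsing a 4D bounds array into two dimensions. The in-memory layout (nx, ny, nz, ne) and the resulting voxel ordering are assumptions made for this example; the ordering the C++ reader actually expects is not established here.

```python
import numpy as np

# Assumed in-memory layout for this sketch: (nx, ny, nz, ne).
nx, ny, nz, ne = 2, 3, 4, 5
bounds_4d = np.arange(nx * ny * nz * ne, dtype=float).reshape(nx, ny, nz, ne)

# Move the energy axis to the front, then collapse the three spatial
# axes into a single voxel axis, giving shape (ne, n_voxels).
bounds_2d = np.moveaxis(bounds_4d, -1, 0).reshape(ne, -1)

assert bounds_2d.shape == (ne, nx * ny * nz)
# Each row holds all voxels for one energy group.
assert bounds_2d[1, 0] == bounds_4d[0, 0, 0, 1]
```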
The dtype-trust fast path returned for any float/complex ndarray of matching depth, even when expected_type was int or another class -- the docstring promised element-type validation but the fast path skipped it. Gate the fast path on expected_type in (Real, float, complex) so it only fires when dtype.kind in 'fc' actually satisfies the contract.
The direct-h5py writer cannot serialize an UnstructuredMesh from pure Python: vertex and connectivity data live in the external .exo/.h5m file and only exist in memory after LibMesh/MOAB loads them via openmc.lib.init. Dispatch on mesh type up front: structured meshes take the new fast path; UnstructuredMesh falls back to the previous TemporarySession + openmc.lib.export_weight_windows route, which also restores honoring of init_kwargs on that path. Removes the dead NotImplementedError branch from _write_mesh_group.
    if (isinstance(value, np.ndarray) and value.dtype.kind in 'fc'
            and min_depth <= value.ndim <= max_depth
            and expected_type in (Real, float, complex)):
        return
I think you should separate the complex and float cases.
If someone expects an iterable of Real and gets a NumPy array with a complex dtype, it should raise an error.
Or maybe we should drop the complex case altogether. I do not know of any place in OpenMC where we use complex numbers. Do you know of a use case?
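One way the suggested split could look, as a rough sketch: the helper name `dtype_satisfies` is hypothetical and not part of the PR. A float array only short-circuits when Real or float is expected, and a complex array only when complex is expected.

```python
from numbers import Real

import numpy as np


def dtype_satisfies(value, expected_type, min_depth, max_depth):
    """Return True when the array dtype alone proves the element type.

    Hypothetical helper for illustration; not the PR's implementation.
    """
    if not isinstance(value, np.ndarray):
        return False
    if not (min_depth <= value.ndim <= max_depth):
        return False
    kind = value.dtype.kind
    if kind == 'f':
        # Float data satisfies a Real or float contract.
        return expected_type in (Real, float)
    if kind == 'c':
        # Complex data only satisfies an explicit complex contract.
        return expected_type is complex
    return False


# A complex array no longer passes when Real is expected.
assert dtype_satisfies(np.zeros(3), Real, 1, 1)
assert not dtype_satisfies(np.zeros(3, dtype=complex), Real, 1, 1)
```

Whether non-float64 arrays (for example float32) should count when expected_type is float is left open in this sketch.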
        else:
            raise ValueError('Unrecognized mesh type: "' + mesh_type + '"')

    def to_hdf5(self, group: h5py.Group) -> h5py.Group:
This should be declared as an abstractmethod. That way it is clear that each mesh type should implement this method.
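A minimal sketch of what declaring the method abstract buys, using stand-in classes (all class names here are hypothetical, and a plain dict stands in for an h5py group so the example has no external dependencies):

```python
from abc import ABC, abstractmethod


class MeshSketch(ABC):
    """Hypothetical stand-in for a mesh base class."""

    def __init__(self, mesh_id):
        self.id = mesh_id

    @abstractmethod
    def to_hdf5(self, group):
        """Write this mesh into `group` and return the sub-group."""


class RegularMeshSketch(MeshSketch):
    def __init__(self, mesh_id, dimension):
        super().__init__(mesh_id)
        self.dimension = dimension

    def to_hdf5(self, group):
        # A dict stands in for the 'mesh <id>' HDF5 group.
        sub = group.setdefault(f'mesh {self.id}', {})
        sub['dimension'] = tuple(self.dimension)
        return sub


class ForgetfulMeshSketch(MeshSketch):
    """Subclass that forgot to implement to_hdf5."""


root = {}
RegularMeshSketch(1, (2, 3, 4)).to_hdf5(root)

# Instantiating a subclass without to_hdf5 fails immediately.
instantiation_error = None
try:
    ForgetfulMeshSketch(2)
except TypeError as err:
    instantiation_error = err
```

With this pattern a missing implementation surfaces when the mesh is constructed rather than later, at export time.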
        return mesh

    def to_hdf5(self, group: h5py.Group):
        # Raise before super() so no half-built 'mesh <id>' group is left on disk.
There is no super() call afterwards, so I think this comment is not needed.
    # Any unstructured mesh forces the whole list onto the lib fallback.
    if any(isinstance(ww.mesh, UnstructuredMesh) for ww in self):
        import openmc.lib
        model = openmc.Model()
        sph = openmc.Sphere(boundary_type='vacuum')
        model.geometry = openmc.Geometry([openmc.Cell(region=-sph)])
        model.settings.weight_windows = self
        model.settings.particles = 100
        model.settings.batches = 1
        with openmc.lib.TemporarySession(model, **init_kwargs):
            openmc.lib.export_weight_windows(path)
        return
I think a cleaner solution would be to implement UnstructuredMesh.to_hdf5 using openmc.lib.export_weight_windows under the hood.
That should simplify the logic here.
What do you think?
Closed #3942 in favor of the alternative approach, switching to the per-mesh Mesh.to_hdf5 design (each mesh subclass serializes itself, mirroring the existing to_xml_element super-call pattern) instead of a central dispatch helper.
This PR:
This enables support for many-GB wwinp files and faster processing of them.
Checklist