Refactor RNGHierarchy to enable tying the random number generators to detectors#523
Refactor RNGHierarchy to enable tying the random number generators to detectors#523anand-avinash wants to merge 16 commits into
RNGHierarchy to enable tying the random number generators to detectors#523Conversation
Coverage reportClick to see where and how coverage changed
This report was generated by python-coverage-comment-action |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Hi Avinash, thanks ! So, if I understand correctly, for the non linearity case, where we want the state to be the same for all time divisions in a given detector, the communicator to be used should be comm_time_block, right? But, in that case, shouldn't we have |
|
Hi @mj-gomes! Yes you are right. Since we use |
…an MPI communicator as an argument
…ply suitable comm argument
|
We should probably also add a test in |
…t rng hierarchy, and add test for this feature in test_mpi.py
|
Hi @anand-avinash, I had to remove the rng_hierarchy handling in sim.apply_quadratic_nonlin, as it was always creating the rng hierarchy as previously defined, not using your changes. Now we have to pass user_seed to this method, to ensure it always works. I did a test in We have 2 detectors, 4 time samples, 2 MPI blocks of time, 1 MPI block of detectors. This means the following division: The seeding is now correct when I run the test locally in my pc (using mpirun -n 2). However we are getting |
|
Hi @mj-gomes!
Good catch!
If you look at the github action workflow for the tests, after installing |
|
Hi @anand-avinash,
Oh, right! Thanks, I did that. I also finished what was missing (updated the docs, merged master etc.). I think this is good to go. @anand-avinash @ziotom78 What do you think? |
The current
RNGHierarchyclass guarantees reproducible RNG hierarchy on a given MPI rank provided that the user provided seed, size of the MPI communicator, and number of detectors on that rank remain the same. This can be modified to pass the MPI communicator toRNGHierarchyinstead of the size of the communicator. The size of the communicator will be inferred from the communicator itself within the class during its instantiation, keeping the overall behavior of the class unchanged, when the user provides the global MPI communicator to this class.However, this updated interface can be used to provided same RNG for a given detector across different rank (or time blocks) if a suitable MPI communicator is supplied. Consider an example where 3 detector and their TODs are distributed across 2 detector blocks and 3 time blocks with a total 6 MPI processes:
The data distributions would look like the following:
With this distribution, the global communicator will be partitioned into three time block communicators accessible with
sim.observations[0].comm_time_block. The first one will contain ranks 0 and 3, second will contain ranks 1 and 4, and the third will contain ranks 2 and 5. As a result, since for ranks 0, 1, and 2 the size ofsim.observations[0].comm_time_blockand the number of detectors per rank are same,RNGHierarchywill produce exactly the same hierarchy with a common seed on these ranks. The same goes for ranks 3, 4, and 5. So, the detector level RNGs for detector A will be same across rank 0, 1, and 2; detector level RNGs for detector B will be same across ranks 0, 1, and 2; and detector level RNGs for detector C will be same across ranks 3, 4,and 5 -- guaranteeing that random numbers produced for given detector are same across different MPI ranks. This completely solves the issue raised with #510.Here I am showing the result of the print statements from the script I added above for RNG state verification:
Please take a look at this, if it seems suitable and sufficient, I would update the docstrings before merging this to master branch.