Fix issue #364 #372
base: master
Conversation
|
@anand-avinash, I put together some fixes that might solve the problems you described in #364. Before merging this PR, could you please check that this is indeed the case?
|
Hi @ziotom78! Thank you for applying the fixes. The problem is not completely resolved yet. I am now getting two new (but related) errors: the assignment of … I am using the same script as this for testing.
|
I found that the issue is limited primarily to the following section: litebird_sim/litebird_sim/observations.py, lines 701 to 708 in dd4d052.
This section is applicable only to the MPI processes that do not contain any detector. On such MPI processes, in this section, we first acquire the default value of the attribute (… Also, since …

If I modify this section as follows, it resolves the two errors completely: …

After this, I get two errors, which I will report one by one.

1st error: this is coming from rank 1, which in fact contains a detector(!): … In the …

2nd error: this error is again due to the section I referred to before. Since the default value of …
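For context, a minimal sketch of the kind of fallback being discussed for ranks without detectors might look like this; the helper name `get_detector_attribute` and the attribute handling are illustrative and are not the actual observations.py code:

```python
import numpy as np

def get_detector_attribute(obs, name, default=0.0):
    # Hypothetical helper: on MPI ranks that own zero detectors, return an
    # empty array with a well-defined dtype instead of the scalar default,
    # so that downstream code always receives an array of the expected kind.
    n_local_dets = len(getattr(obs, "detectors", []) or [])
    if n_local_dets == 0:
        return np.empty(0, dtype=type(default))
    values = getattr(obs, name, None)
    if values is None:
        # Attribute missing on this rank: fall back to the default value
        # for every local detector.
        return np.full(n_local_dets, default)
    return np.asarray(values)
```

With such a fallback, a rank holding no detectors always ends up with a zero-length array rather than a bare scalar, which keeps the attribute type consistent across ranks.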
|
Are you sure you still use the script mentioned in that comment? The issue triggers line 52 in pointings_in_obs.py (the last line of the stack trace), which is only called when … Could you please modify the test …? Many thanks!
|
Ah, you are right! I was running the script after uncommenting the set_hwp part. With that part commented out, there is no error from the script. However, it throws some errors if I call the destriper.
```diff
  base_path=base_path,
  start_time=0.0,
- duration_s=mission_time_days * 24 * 60.0,
+ duration_s=mission_time_days * 24 * 60 * 60.0,
```
Is this needed? It makes the test far slower…
There was no need, I agree with you. I just wanted to make the conversion to seconds correct.
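For example, with mission_time_days = 1 the old expression evaluates to 1 * 24 * 60.0 = 1440.0 (i.e. minutes rather than seconds), while the corrected one gives 1 * 24 * 60 * 60.0 = 86400.0 seconds.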
This should work even if some MPI processes have no detector
As stated at the beginning of the file, we do not use PyTest for MPI processes, as we want to be sure that all processes share the same temporary directory, and this is critical for I/O tests.
The reason is that if some MPI process crashes, the barrier fails if not every MPI process is in the grid.
We use a StringIO stream to write things in memory, so we can test print() statements without polluting the terminal
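A generic way to capture print() output in memory along those lines (a sketch, not the actual test code) is:

```python
import io
from contextlib import redirect_stdout

def run_and_capture(func, *args, **kwargs):
    # Redirect print() output into an in-memory buffer so a test can inspect
    # it without writing anything to the terminal.
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        result = func(*args, **kwargs)
    return result, buffer.getvalue()

# Example:
# _, text = run_and_capture(print, "hello from rank 0")
# assert "hello from rank 0" in text
```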
|
Tests now fail because of the function …

```python
if mueller_hwp is not None:
    assert tod.shape[0] == mueller_hwp.shape[0]
```

where … I am unsure why this error is happening or how we can fix it. I see a few possibilities: …
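For illustration, one kind of guard that would make this assertion tolerant of ranks holding no detectors could look like the sketch below (assuming tod has shape (n_detectors, n_samples) and mueller_hwp stores one Mueller matrix per detector; this is not necessarily one of the options being considered here):

```python
import numpy as np

def check_mueller_hwp(tod, mueller_hwp):
    # Skip the consistency check on ranks that hold no detectors at all;
    # otherwise require one Mueller HWP matrix per detector row in the TOD.
    if mueller_hwp is None or tod.shape[0] == 0:
        return
    n_matrices = np.asarray(mueller_hwp).shape[0]
    assert tod.shape[0] == n_matrices, (
        f"{tod.shape[0]} detectors in the TOD but {n_matrices} Mueller HWP matrices"
    )
```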
|
|
I am thinking about a different resolution for #364. The problem arises simply because we try to use more MPI processes than the …
I like this idea! The assumption that every MPI process has at least one observation is so pervasive that fixing it is taking too much time. A script that helps the user find the right number of processes is easier to understand, does not complicate the codebase, and makes better use of computer resources.
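As an illustration of the kind of helper being proposed, a minimal sketch could be as simple as the following (the function name and behaviour are made up here and are not part of litebird_sim):

```python
def suggest_num_processes(num_detectors: int, requested_procs: int) -> int:
    # Illustrative helper: cap the number of MPI processes so that every
    # rank is guaranteed to receive at least one detector.
    if requested_procs <= num_detectors:
        return requested_procs
    print(
        f"Warning: {requested_procs} MPI processes requested, but only "
        f"{num_detectors} detectors are available; {num_detectors} processes "
        f"are enough to give every rank some data."
    )
    return num_detectors

# Example: suggest_num_processes(num_detectors=4, requested_procs=16) returns 4
```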
This commit attempts to fix issue #364 by revisiting the way MPI processes are used in several places in the framework. It adds a new test that checks that a simple E2E pipeline can run even if one of the MPI processes has no TOD samples in it.
- `_GridCommClass.is_this_process_in_grid()`
- `Observation.no_mueller_hwp()`
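Judging only by their names, these helpers might look roughly like the following standalone sketch; the comm_obs_grid attribute and the internal logic are guesses, not the actual implementation:

```python
from typing import Optional, Sequence
import numpy as np

def is_this_process_in_grid(grid_comm) -> bool:
    # Guess at the intent of _GridCommClass.is_this_process_in_grid(): a rank
    # belongs to the observation grid only if it was assigned a communicator
    # when the detectors were distributed (comm_obs_grid is hypothetical).
    return getattr(grid_comm, "comm_obs_grid", None) is not None

def no_mueller_hwp(mueller_hwp: Optional[Sequence[Optional[np.ndarray]]]) -> bool:
    # Guess at the intent of Observation.no_mueller_hwp(): True when no
    # detector in the observation carries a Mueller HWP matrix, so that
    # HWP-dependent code paths can be skipped on this rank.
    return mueller_hwp is None or all(m is None for m in mueller_hwp)
```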