Conversation
Pull request overview
Adds a build-time option to run libmpdata++ with MPI_THREAD_SERIALIZED instead of MPI_THREAD_MULTIPLE, and introduces additional synchronization in halo-exchange paths intended to avoid unsafe concurrent MPI calls.
Changes:
- Adds a `LIBMPDATAXX_MPI_THREAD_MULTIPLE` CMake option (default ON) and a corresponding compile definition.
- Initializes MPI with `MPI_THREAD_MULTIPLE` or `MPI_THREAD_SERIALIZED` based on that option, updating the related error handling.
- Adds per-boundary-condition synchronization and adjusts boundary-condition ordering for distributed/cyclic cases.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| libmpdata++/solvers/detail/solver_common.hpp | Adjusts boundary-condition call ordering (rank-parity swap) and updates inline comments. |
| libmpdata++/solvers/detail/solver_3d.hpp | Adds per-bcond barriers in multiple halo-exchange routines and minor indexing change. |
| libmpdata++/solvers/detail/solver_2d.hpp | Adds per-bcond barriers in multiple halo-exchange routines. |
| libmpdata++/solvers/detail/solver_1d.hpp | Adds per-bcond barriers in halo-exchange routines. |
| libmpdata++/concurr/detail/distmem.hpp | Switches MPI init thread level between MULTIPLE/SERIALIZED and updates the thread-level check + error messages. |
| libmpdata++-config.cmake | Introduces CMake option and sets compile definitions/messages for MPI threading mode selection. |
```cpp
{
  bc->single_threaded ? bc->fill_halos_sclr(arr, range_ijk[1]^ext, range_ijk[2]^ext, deriv)
                      : bc->fill_halos_sclr(arr, range_ijk[1]^ext, range_ijk[2]^ext, deriv);
#if !defined(LIBMPDATAXX_MPI_THREAD_MULTIPLE)
  if(this->mem->distmem.size() > 0) this->mem->barrier(); // to avoid simultaneous MPI calls (not needed if MPI_THREADS_MULTIPLE is supported and used, but many MPI implementations do not support it), as each bcond is handled by different thread
```
The barrier used to serialize per-boundary-condition execution is currently compiled only when LIBMPDATAXX_MPI_THREAD_MULTIPLE is defined, but the need to serialize MPI calls is exactly when MPI_THREAD_SERIALIZED is used (i.e., when LIBMPDATAXX_MPI_THREAD_MULTIPLE is not defined). As written, serialized builds can still issue concurrent MPI calls from different threads (violating MPI_THREAD_SERIALIZED), while MULTIPLE builds pay for extra barriers.
Suggested fix: invert the preprocessor condition (use #ifndef LIBMPDATAXX_MPI_THREAD_MULTIPLE or an explicit SERIALIZED macro) and consider checking distmem.size() > 1 to avoid unnecessary barriers in single-rank runs.
Suggested change:

```diff
-if(this->mem->distmem.size() > 0) this->mem->barrier(); // to avoid simultaneous MPI calls (not needed if MPI_THREADS_MULTIPLE is supported and used, but many MPI implementations do not support it), as each bcond is handled by different thread
+if(this->mem->distmem.size() > 1) this->mem->barrier(); // to avoid simultaneous MPI calls (not needed if MPI_THREADS_MULTIPLE is supported and used, but many MPI implementations do not support it), as each bcond is handled by different thread
```
```cpp
bc->fill_halos_sclr(arr, range_ijk[1]^ext, deriv);
#if !defined(LIBMPDATAXX_MPI_THREAD_MULTIPLE)
if(this->mem->distmem.size() > 0) this->mem->barrier();
#endif
}
```
These per-bcond barriers intended to avoid simultaneous MPI calls are enabled only under LIBMPDATAXX_MPI_THREAD_MULTIPLE, but simultaneous MPI calls are allowed in MPI_THREAD_MULTIPLE mode and are not allowed in MPI_THREAD_SERIALIZED mode. This appears inverted and can lead to undefined behavior/deadlocks in serialized builds.
Suggested fix: invert the #ifdef so the barriers are present when building for MPI_THREAD_SERIALIZED (and optionally guard with distmem.size() > 1).
```cpp
for (auto &bc : this->bcs[0]) {
  bc->fill_halos_sclr(arr, deriv);
#if !defined(LIBMPDATAXX_MPI_THREAD_MULTIPLE)
  if(this->mem->distmem.size() > 0) this->mem->barrier();
```
Same issue as in other solvers: the extra barriers meant to prevent concurrent MPI calls are compiled only when LIBMPDATAXX_MPI_THREAD_MULTIPLE is defined, but they are needed when requesting MPI_THREAD_SERIALIZED (macro not defined). In serialized mode this can still allow multiple threads to enter MPI concurrently.
Suggested fix: invert the preprocessor condition and consider using distmem.size() > 1 to avoid overhead for single-rank runs.
Suggested change:

```diff
-if(this->mem->distmem.size() > 0) this->mem->barrier();
+if (this->mem->distmem.size() > 1) this->mem->barrier();
```
```cpp
if (boost::mpi::environment::thread_level() !=
#ifdef LIBMPDATAXX_MPI_THREAD_MULTIPLE
  boost::mpi::threading::multiple
#else
  boost::mpi::threading::serialized
#endif
)
{
#ifdef LIBMPDATAXX_MPI_THREAD_MULTIPLE
  throw std::runtime_error("libmpdata++: failed to initialise MPI environment with MPI_THREAD_MULTIPLE");
#else
  throw std::runtime_error("libmpdata++: failed to initialise MPI environment with MPI_THREAD_SERIALIZED");
#endif
```
When building in serialized mode, this requires boost::mpi::environment::thread_level() to be exactly serialized. However MPI is allowed to provide a higher thread level than requested (e.g., multiple), which should still be acceptable for a build targeting MPI_THREAD_SERIALIZED. With the current != check, users/tests that initialize MPI with MPI_THREAD_MULTIPLE (many existing tests do) will incorrectly throw.
Suggested fix: compare against the requested level using an order check (provided < requested) rather than equality, or explicitly accept multiple when serialized is requested.
```cmake
if(LIBMPDATAXX_MPI_THREAD_MULTIPLE)
  set(libmpdataxx_CXX_FLAGS_DEBUG "${libmpdataxx_CXX_FLAGS_DEBUG} -DLIBMPDATAXX_MPI_THREAD_MULTIPLE")
  set(libmpdataxx_CXX_FLAGS_RELEASE "${libmpdataxx_CXX_FLAGS_RELEASE} -DLIBMPDATAXX_MPI_THREAD_MULTIPLE")
  message(STATUS "libmpdata++ will be built with multithreaded MPI calls enabled (MPI_THREAD_MULTIPLE). Set -DLIBMPDATAXX_MPI_THREAD_MULTIPLE=OFF to use MPI_THREAD_SERIALIZED instead.")
else()
```
The new LIBMPDATAXX_MPI_THREAD_MULTIPLE option introduces a second supported MPI threading mode (SERIALIZED), but the test configuration doesn’t appear to exercise this path. Given many tests explicitly call MPI::Init_thread(MPI_THREAD_MULTIPLE), it would be easy for serialized builds to silently regress.
Suggested fix: add a CI/test matrix entry that configures with -DLIBMPDATAXX_MPI_THREAD_MULTIPLE=OFF and runs at least one MPI-enabled test (ideally covering 3D remote bconds / halo exchange ordering).
Suggested change:

```diff
 if(LIBMPDATAXX_MPI_THREAD_MULTIPLE)
+  # Explicitly record the selected MPI threading mode for external tooling / CI
+  set(LIBMPDATAXX_MPI_THREAD_MODE "MULTIPLE" CACHE STRING "Selected MPI threading mode for libmpdata++ (MULTIPLE or SERIALIZED)" FORCE)
   set(libmpdataxx_CXX_FLAGS_DEBUG "${libmpdataxx_CXX_FLAGS_DEBUG} -DLIBMPDATAXX_MPI_THREAD_MULTIPLE")
   set(libmpdataxx_CXX_FLAGS_RELEASE "${libmpdataxx_CXX_FLAGS_RELEASE} -DLIBMPDATAXX_MPI_THREAD_MULTIPLE")
   message(STATUS "libmpdata++ will be built with multithreaded MPI calls enabled (MPI_THREAD_MULTIPLE). Set -DLIBMPDATAXX_MPI_THREAD_MULTIPLE=OFF to use MPI_THREAD_SERIALIZED instead.")
 else()
+  # Explicitly record the serialized MPI threading mode and provide a dedicated macro
+  set(LIBMPDATAXX_MPI_THREAD_MODE "SERIALIZED" CACHE STRING "Selected MPI threading mode for libmpdata++ (MULTIPLE or SERIALIZED)" FORCE)
+  set(libmpdataxx_CXX_FLAGS_DEBUG "${libmpdataxx_CXX_FLAGS_DEBUG} -DLIBMPDATAXX_MPI_THREAD_SERIALIZED")
+  set(libmpdataxx_CXX_FLAGS_RELEASE "${libmpdataxx_CXX_FLAGS_RELEASE} -DLIBMPDATAXX_MPI_THREAD_SERIALIZED")
```
```cpp
// with distributed memory and cyclic boundary conditions,
// leftmost node must send left first, as
// rightmost node is waiting
// if (d == 0 && this->mem->distmem.size() > 0 && this->mem->distmem.rank() == 0)
// with serialized MPI, matching left/right calls need to be done in order, doesn't hurt in multithreaded case as well
if (d == 0 && this->mem->distmem.size() > 0 && this->mem->distmem.rank() % 2 == 0)
  std::swap(bcl, bcr);
```
This commented-out previous condition plus the updated explanation is confusing and doesn’t match the new behavior (rank parity swap). Keeping dead code in comments makes it hard to tell what ordering is actually required (e.g., deadlock avoidance vs thread-serialized MPI).
Suggested fix: remove the commented-out if (...) rank() == 0 line and update the comment to precisely document why parity-based swapping is required and in which configurations (e.g., MPI with remote bconds, halo==1 send/recv patterns).
```cpp
for (auto &bc : this->bcs[0]) //bc->fill_halos_vctr_alng(arrvec, j, k, ad);
{
  bc->single_threaded ? bc->fill_halos_vctr_alng(arrvec, j, k, ad) : bc->fill_halos_vctr_alng(arrvec, j, k, ad);
#if !defined(LIBMPDATAXX_MPI_THREAD_MULTIPLE)
  if(this->mem->distmem.size() > 0) this->mem->barrier();
#endif
```
The commented-out call (//bc->fill_halos_vctr_alng...) and the ternary that calls the same function in both branches add noise without changing behavior. This makes the intent of the change (ordering/barriers) harder to follow.
Suggested fix: remove the commented-out line and replace the redundant ternary with a single unconditional call inside the loop body.