Closed
278 commits
851de6a
Remove prints
adityapb Nov 4, 2024
2e93e4c
Write hostfile
adityapb Nov 4, 2024
807540a
Add shrink expand back to TreeLB
adityapb Nov 6, 2024
d833c88
Remove writing new nodelist file temporarily
adityapb Nov 8, 2024
e40cfd0
Fix pe assignment
adityapb Nov 8, 2024
b409b28
Add debugging
adityapb Nov 9, 2024
6eccf95
Add more debugs
adityapb Nov 9, 2024
2b16673
More debugging
adityapb Nov 9, 2024
c3e195a
Remove prints from example
adityapb Nov 10, 2024
bb61022
Add rescale function and fix metis to switch off PEs
adityapb Jan 26, 2025
3528371
Change load balancing order in shrink/expand
adityapb Jan 30, 2025
3a2b3d2
Remove debugging prints
adityapb Jan 31, 2025
1fe3175
Remove more prints
adityapb Jan 31, 2025
a491c87
Add some debug prints
adityapb Jan 31, 2025
3647848
More comments
adityapb Jan 31, 2025
b87e0c0
Add more debugs
adityapb Jan 31, 2025
9384ceb
Add more debug
adityapb Jan 31, 2025
754f7ef
Add a debug abort
adityapb Jan 31, 2025
9ca01ba
Try fixes
adityapb Jan 31, 2025
7bfd609
Merge branch 'main' into shrinkexpand-fix
adityapb Jan 31, 2025
5db1889
Fix some shrink expand issues
adityapb Feb 20, 2025
2aefeb5
Fix more issues
adityapb Feb 20, 2025
b95f511
Add debugging prints
adityapb Feb 21, 2025
b1ea298
Temporary hack to fix kubernetes issue
adityapb Feb 21, 2025
7ba8fcf
Fix avail vector bug in load balancing
adityapb Feb 21, 2025
24556fe
Write hostfile
adityapb Feb 22, 2025
04cb932
Fix race conditions with rescale calls
adityapb Feb 23, 2025
766f60f
Add new realloc handler
adityapb Feb 23, 2025
6fe9359
Change example
adityapb Feb 25, 2025
5c32ef6
Add greedy central LB
adityapb Feb 26, 2025
8fe831a
Add print for restore time
adityapb Mar 7, 2025
2361d02
More restore time after broadcast
adityapb Mar 10, 2025
1de5741
Spot instances support
adityapb Mar 23, 2025
25d21ab
Update client code
adityapb Apr 2, 2025
3d516bb
Fix realloc msg buffering
adityapb Apr 4, 2025
e099e9b
Shrink PEs from middle
adityapb Apr 8, 2025
aa77fbe
Merge branch 'recovery' into spot
adityapb Apr 8, 2025
0247d3d
Fix buffered realloc
adityapb Apr 8, 2025
3c0d76b
Debug
adityapb Apr 9, 2025
7b366a9
More debug
adityapb Apr 9, 2025
d3d9623
Fix pe numbering
adityapb Apr 17, 2025
1642f2a
Fix charmrun to correctly handle shrink expand
adityapb Apr 19, 2025
8c66d14
Disable load balancing unless rescaling
adityapb Apr 20, 2025
ff8ab03
Dummy load balancing
adityapb Apr 20, 2025
ab5e8ad
Add GPU PUP functions
adityapb Jul 21, 2025
eac6082
Working memory management daemon
adityapb Jul 23, 2025
c528824
Attempt at getting a load balancing example with GPUs
adityapb Jul 24, 2025
01d20aa
Fix conflicts
adityapb Jul 24, 2025
31d32e3
Merge branch 'gpu-pup' into gpu-shrinkexpand
adityapb Jul 24, 2025
70d82e4
Fix memory daemon and add shrink expand support
adityapb Jul 29, 2025
c22d42a
More fixes for shrinkexpand
adityapb Jul 29, 2025
04928d2
More fixes
adityapb Jul 30, 2025
12a9e31
PUP fix
adityapb Jul 31, 2025
edd0f86
Fix pup call
adityapb Jul 31, 2025
b5221a8
Add shrink expand enabling checks
adityapb Aug 1, 2025
6e5af5a
Add shrink-expand flags
adityapb Aug 3, 2025
be61657
More flags
adityapb Aug 3, 2025
df5d743
More flags
adityapb Aug 3, 2025
a54e31c
More flags
adityapb Aug 3, 2025
c81ac9b
Change cpu affinity of child process
adityapb Aug 4, 2025
9591b7b
Fix cpu affinity
adityapb Aug 4, 2025
4f25da8
Fix cpu affinity
adityapb Aug 4, 2025
8943a6c
Add print for CPU affinity
adityapb Aug 4, 2025
6b5ba85
UCX shrink expand support
adityapb Aug 5, 2025
773c74c
Fix build
adityapb Aug 5, 2025
77edc4b
Add check flags
adityapb Aug 5, 2025
ea19191
More fixes
adityapb Aug 5, 2025
fb1df60
More fixes
adityapb Aug 5, 2025
409a7a2
More fixes
adityapb Aug 5, 2025
bc8e9f0
More fixes
adityapb Aug 5, 2025
c72274b
Attempt build fix
adityapb Aug 5, 2025
6e986dc
Fix args
adityapb Aug 5, 2025
6b76e46
Fix args
adityapb Aug 5, 2025
137f85e
Fix args
adityapb Aug 5, 2025
e1cc3b7
Try fixing hang
adityapb Aug 5, 2025
76cddb9
Try fixing hang
adityapb Aug 5, 2025
e61889c
Try fixing hang
adityapb Aug 5, 2025
91dc438
Add nodelist to ucx charmrun
adityapb Aug 6, 2025
7f84f67
Update charmrun
adityapb Aug 6, 2025
e6ffb48
Pass nodelist as charm argument
adityapb Aug 6, 2025
bbf8ec7
Change exitcode for restart
adityapb Aug 6, 2025
16f2e4b
Write pes to file
adityapb Aug 6, 2025
dc34ca0
Write pes to file fix
adityapb Aug 6, 2025
6b73a15
Add charmrun elastic
adityapb Aug 6, 2025
75aeb23
Fix error
adityapb Aug 6, 2025
8892c9f
Fix macro
adityapb Aug 6, 2025
8ab97d9
Add process for memory daemon
adityapb Aug 7, 2025
b54ecd5
Fix filepath
adityapb Aug 7, 2025
67218b9
Fix filepath
adityapb Aug 7, 2025
046ff52
Fix hapi daemon
adityapb Aug 7, 2025
5e4b08d
Fix alloc id
adityapb Aug 7, 2025
f8fc2e1
Add dependency
adityapb Aug 7, 2025
cdfa062
Try fixing cmake
adityapb Aug 7, 2025
53209f7
Try fixing memory daemon again
adityapb Aug 7, 2025
dc64e06
Update cmake
adityapb Aug 7, 2025
dfb6850
Update scripts
adityapb Aug 7, 2025
96bd538
Update cmake dependencies
adityapb Aug 7, 2025
35edb6d
Fix memory daemon fifo creation
adityapb Aug 7, 2025
e78ef10
Fix command in charmrun_hapi
adityapb Aug 7, 2025
c38f5b1
Skip ready check on restart
adityapb Aug 7, 2025
32981b1
Fix daemon start
adityapb Aug 7, 2025
bf6d360
Fix daemon start
adityapb Aug 7, 2025
d6c8939
Fix run script
adityapb Aug 7, 2025
2af9278
Fix run script
adityapb Aug 7, 2025
ffea7d7
Checkpoint and restore from the daemon
adityapb Aug 9, 2025
601674b
Fix build error
adityapb Aug 9, 2025
82b4ea3
Fix fromDisk
adityapb Aug 9, 2025
35eaad9
Fix migration segfault
adityapb Aug 9, 2025
b74d716
Fix get command
adityapb Aug 9, 2025
2e31098
Fix CKPT size
adityapb Aug 9, 2025
d7681e0
Add debugging print
adityapb Aug 10, 2025
abe3d37
Flush print
adityapb Aug 10, 2025
58b92f4
More debugging
adityapb Aug 10, 2025
22ca549
More debugging 2
adityapb Aug 10, 2025
b5fe788
Error checking
adityapb Aug 10, 2025
3e62845
Error checking
adityapb Aug 10, 2025
f3d76a8
Try fixing ipc_handle read
adityapb Aug 10, 2025
d01230e
Try fixing ipc_handle read 2
adityapb Aug 10, 2025
dd713fb
Try fixing ipc_handle read 3
adityapb Aug 10, 2025
6627234
Try fixing ipc_handle read 4
adityapb Aug 10, 2025
5d9fe03
Try fixing ipc_handle read 5
adityapb Aug 10, 2025
c18ed99
Try fixing ipc_handle read 5
adityapb Aug 10, 2025
58d69da
Remove print
adityapb Aug 10, 2025
4dba891
Add MPI shrinkexpand support
adityapb Aug 14, 2025
f051de1
More changes to MPI shrink expand
adityapb Aug 16, 2025
d5a8427
Fix rescale call
adityapb Aug 18, 2025
62fb5f0
Add startup example
adityapb Aug 19, 2025
bad1018
Fix memory leak on GPU migration
adityapb Sep 2, 2025
5be6030
Merge branch 'shrinkexpand-mpi' into shrinkexpand-ucx-copy
adityapb Sep 2, 2025
5949535
Fix shrink expand with MPI
adityapb Sep 3, 2025
6be5606
Shrink expand using UCX
adityapb Sep 5, 2025
212fd71
Shrink expand using UCX fix
adityapb Sep 5, 2025
c6c524d
Shrink expand using UCX fix 2
adityapb Sep 5, 2025
69e57d7
Fix charmrun_elastic
adityapb Sep 5, 2025
80a1e36
Update shrinkexpand example README
adityapb Sep 8, 2025
a70b055
Handle apple
adityapb Sep 9, 2025
4d60184
Handle apple 2
adityapb Sep 9, 2025
a203508
Handle apple mpi
adityapb Sep 9, 2025
7683bc7
Add GreedyRefineCentralLB
adityapb Sep 9, 2025
733fca2
add shrink/expand control flow to treelb WIP
mayantaylor Sep 9, 2025
09a40eb
Merge branch 'shrinkexpand-ucx-copy' of github.com:charmplusplus/char…
mayantaylor Sep 9, 2025
2cbce5c
Add GreedyRefineCentralLB
adityapb Sep 9, 2025
734f778
add smp warning/abort
mayantaylor Sep 9, 2025
0080d1f
remove unnecessary extern vars from treelb
mayantaylor Sep 9, 2025
887241c
treelb initial version: working with shrink / expand separately
mayantaylor Sep 9, 2025
7f08dcb
Try fixing PMIx
adityapb Sep 10, 2025
84d4420
Merge branch 'shrinkexpand-ucx-copy' of https://github.com/charmplusp…
adityapb Sep 10, 2025
66c4a0d
Call LB at every LB step and print processor speeds
adityapb Sep 10, 2025
95e54c1
Add OSU bw benchmark
adityapb Sep 11, 2025
dcb4f71
Add OSU latency benchmark
adityapb Sep 11, 2025
f309aee
Remove binary
adityapb Sep 11, 2025
5dc268e
Try fixing GreedyRefine with rateAware
adityapb Sep 11, 2025
127a3d9
supporting PESpeed in greedy and refine
mayantaylor Sep 11, 2025
20d5ccb
Merge branch 'shrinkexpand-ucx-copy' of github.com:charmplusplus/char…
mayantaylor Sep 11, 2025
b6895a0
bugfix
mayantaylor Sep 11, 2025
e7abc3d
Add lbTime option
adityapb Sep 13, 2025
5a55299
Fix buffered realloc signal
adityapb Sep 13, 2025
655c41b
HAPI related fixes
adityapb Sep 15, 2025
03c2564
Update charmrun_hapi
adityapb Sep 16, 2025
1273e6b
Check for realloc after stats gathering
adityapb Sep 17, 2025
487dc2a
adding pups for root and PE levels
mayantaylor Sep 18, 2025
0d891fd
wip, pupping statsmsg (segfaults on new proc but works on pe0)
mayantaylor Sep 18, 2025
c60cb0e
only root PE has a second level (fixes segfault, but hangs)
mayantaylor Sep 18, 2025
d1314f4
tree levels have to update numPEs
mayantaylor Sep 18, 2025
4a3d239
correct pupping for Levels, WIP
mayantaylor Sep 19, 2025
d06f5e0
temporary avail_vector fix and pup cleanup
mayantaylor Sep 20, 2025
388bbff
works in most cases except expanding from #pes > 1
mayantaylor Sep 20, 2025
ea0b9a3
pup primarily works except in multiple rounds (need myObjs pupped sep…
mayantaylor Sep 21, 2025
75c7aeb
cleanup
mayantaylor Sep 23, 2025
ccd23b1
Change group checkpoint restore logic
adityapb Sep 23, 2025
e7319ae
cleanup unused function
mayantaylor Sep 23, 2025
64b0f3a
cleanup pup to prep for new expand
mayantaylor Sep 23, 2025
0c07779
cleanup awaitingLB from pup
mayantaylor Sep 23, 2025
92ae5ec
bugfix: npes logic and don't need to realloc availvector
mayantaylor Sep 23, 2025
ce12540
updating pup/migrate constructor logic (works for shrink)
mayantaylor Sep 23, 2025
b6545df
Call migration constructor on new PEs after expand
adityapb Sep 23, 2025
5512704
update stats_msg npes logic
mayantaylor Sep 23, 2025
cf77987
initial commit for GPU load tracking
mayantaylor Sep 23, 2025
b15b503
prelim support for obj gpu timing
mayantaylor Sep 24, 2025
49becbd
add hapi gpu timing with CMK_LBDB_ON
mayantaylor Sep 24, 2025
4e8ca66
print gpu and cpu times in stencil
mayantaylor Sep 24, 2025
b678233
cleanup obj constructor
mayantaylor Sep 24, 2025
652b121
bugfix: increment dont reset gpu load
mayantaylor Sep 24, 2025
a462605
Change the checkpoint restore logic to call pup on new PEs
adityapb Sep 25, 2025
1f4c78b
wrapping gputime and adding to stencil
mayantaylor Sep 25, 2025
15fbbf2
gputime -> data.gputime
mayantaylor Sep 25, 2025
485525c
shrink/expand working alone
mayantaylor Sep 25, 2025
85b7f90
bugfix: obj load set and get for hapi
mayantaylor Sep 25, 2025
f16c442
support for bg_gputime, but rn it is 0
mayantaylor Sep 25, 2025
2fb776e
Merge branch 'maya/gpu-aware-lb' into shrinkexpand-ucx-copy
mayantaylor Sep 25, 2025
4e85792
cleanup print statements + debug
mayantaylor Sep 25, 2025
cbf736b
Merge branch 'maya/gpu-aware-lb' into shrinkexpand-ucx-copy
mayantaylor Sep 25, 2025
78bddee
stats_msgs fixes WIP: expanding from 1pe works
mayantaylor Sep 25, 2025
32ded7f
using just one stats message at root: shrink and expand both working …
mayantaylor Sep 26, 2025
3a087cb
adding proc speed collection for new pes
mayantaylor Sep 26, 2025
ebb6269
PELevel doesn't need to pup statsmsg because its never used after sta…
mayantaylor Sep 26, 2025
77dd0f9
Remove stray prints
adityapb Sep 26, 2025
41c11e7
Fix GPU launch script
adityapb Sep 29, 2025
3f44ffb
Add +shrinkexpand arg to restart command
adityapb Sep 29, 2025
dd8d545
Fix call to hapi_memory_daemon
adityapb Sep 29, 2025
c6b2e2b
Fix freeing migrate GPU msg
adityapb Sep 30, 2025
da73d4a
Sort objs for GreedyCentralLB
adityapb Sep 30, 2025
2141b6f
Add debug prints
adityapb Sep 30, 2025
9ee7725
Fix GPU chare migration
adityapb Sep 30, 2025
c9a58e9
Remove prints
adityapb Sep 30, 2025
200829d
Remove unnecessary prints
adityapb Oct 1, 2025
ad79806
Reduce LB and print frequency
adityapb Oct 1, 2025
aa78931
cuda guard around sendGPUMsg
mayantaylor Oct 2, 2025
1015dac
adding gpu speed calculation
mayantaylor Oct 2, 2025
76226f1
CMK_CUDA build bugfix
mayantaylor Oct 5, 2025
9c0e57d
bugfix for shrink (shrink is working)
mayantaylor Oct 5, 2025
592fc5e
shrink and expand working!
mayantaylor Oct 6, 2025
c2b0314
collect speeds correctly
mayantaylor Oct 6, 2025
f6eacdf
Fix clock rate for CUDA 13+
adityapb Oct 6, 2025
bf7150d
Remove object migration prints
adityapb Oct 6, 2025
b30e687
Add lb to jacobi2d gpudirect example and debug print nobjs per PE
adityapb Oct 7, 2025
8e2cb2b
debug print nobjs per PE
adityapb Oct 7, 2025
58582b7
Fix jacobi example
adityapb Oct 7, 2025
1d56cfe
Add load imbalanced jacobi
adityapb Oct 8, 2025
5680f48
Merge branch 'shrinkexpand-ucx-copy' of https://github.com/charmplusp…
adityapb Oct 8, 2025
b2a7e95
Small fixes
adityapb Oct 8, 2025
3738be3
Change load iters
adityapb Oct 9, 2025
142164c
Update jacobi load imbalance
adityapb Oct 9, 2025
a1a7e83
Add nccl test
adityapb Oct 30, 2025
38e51a6
Minor fix
adityapb Nov 25, 2025
4c9b86f
working hapi port
Dec 12, 2025
66b8e92
patches for bugs
Dec 24, 2025
82e168d
mpi for rdma + ipc for multiple processes per gpu case
Jan 8, 2026
f2df1c6
fix qd issue
Jan 9, 2026
0cba429
fix memcpy det_pe seet
Sh0g0-1758 Jan 10, 2026
5ecffbb
a couple patches
Sh0g0-1758 Jan 14, 2026
dbc3815
mpi smp for chareArray and group
Sh0g0-1758 Jan 15, 2026
60d6ebf
fix barrier issues for smp
Sh0g0-1758 Jan 15, 2026
79ccdda
isend/irecv for mpi smp
Sh0g0-1758 Jan 17, 2026
a68d1ef
mpi rget and cached access windows
Sh0g0-1758 Jan 21, 2026
d5938df
richer prints
Sh0g0-1758 Jan 26, 2026
cc10daa
Tile loop
adityapb Jan 30, 2026
e0da46e
Tile loop
adityapb Jan 30, 2026
33ebcd6
cleanup
Sh0g0-1758 Feb 2, 2026
e328447
Rget and polling for non-smp
Sh0g0-1758 Feb 4, 2026
ceb4ad1
Revert "Rget and polling for non-smp"
Sh0g0-1758 Feb 4, 2026
b0def3d
Reapply "Rget and polling for non-smp"
Sh0g0-1758 Feb 4, 2026
f048dbc
use MPI_Testsome to test a vector of rdma_requests at once
Sh0g0-1758 Feb 4, 2026
ec5ed5f
cleanup
Sh0g0-1758 Feb 18, 2026
de38ab2
Merge remote-tracking branch 'charm_local/hapi_portable' into merge_l…
Sh0g0-1758 Feb 19, 2026
2884ef8
merge error fixes: the AMPI wrappers are a tricky business
Sh0g0-1758 Feb 19, 2026
767616c
changes for GPU load balancing and jacobi2d example correctness
Sh0g0-1758 Mar 11, 2026
1960091
corrections to CMK_GLOBAL_LOCATION_UPDATE
Sh0g0-1758 Mar 18, 2026
1d3cbd4
move update location to also update for chares moving in/out of PE
Sh0g0-1758 Mar 21, 2026
47f74e3
change back locMgrGid.idx change
Sh0g0-1758 Mar 21, 2026
114 changes: 96 additions & 18 deletions CMakeLists.txt
@@ -208,6 +208,7 @@ option(BUILD_SHARED "Build Charm++ dynamic libraries" OFF)

# Other options
option(BUILD_CUDA "Build with CUDA support" OFF)
option(BUILD_HIP "Build with HIP support" OFF)
option(PXSHM "Build with PXSHM" OFF)

# LRTS PMI options
@@ -496,7 +497,7 @@ if(EXISTS ${CMAKE_SOURCE_DIR}/src/arch/${NETWORK}/gdir_link)
file(STRINGS src/arch/${NETWORK}/gdir_link GDIR)
elseif(${NETWORK} MATCHES "gni-")
set(GDIR "gni")
elseif(${NETWORK} MATCHES "mpi-cray")
elseif(${NETWORK} MATCHES "mpi-cray")
set(GDIR "mpi")
elseif(${NETWORK} MATCHES "ofi-cray")
set(GDIR "ofi")
@@ -518,6 +519,9 @@ else()
set(CMK_BUILD_CHARMRUN 1)
endif()

set(CHARMRUN_ELASTIC_DIR src/arch/common)
set(CHARMRUN_HAPI_DIR src/arch/common)

include(cmake/detect-features.cmake)
include(cmake/ci-files.cmake)

@@ -645,6 +649,8 @@ configure_file(src/arch/common/cc-msvc.sh include/ COPYONLY)
configure_file(src/arch/common/conv-mach-craype.sh include/ COPYONLY)
configure_file(src/arch/common/conv-mach-cuda.sh include/ COPYONLY)
configure_file(src/arch/common/conv-mach-cuda.h include/ COPYONLY)
configure_file(src/arch/common/conv-mach-hip.sh include/ COPYONLY)
configure_file(src/arch/common/conv-mach-hip.h include/ COPYONLY)
configure_file(src/arch/common/conv-mach-darwin.sh include/ COPYONLY)
configure_file(src/arch/common/conv-mach-flang.h include/ COPYONLY)
configure_file(src/arch/common/conv-mach-flang.sh include/ COPYONLY)
@@ -673,29 +679,90 @@ configure_file(src/arch/common/conv-mach-tsan.h include/ COPYONLY)
configure_file(src/arch/common/conv-mach-tsan.sh include/ COPYONLY)
configure_file(src/scripts/conv-config.sh include/ COPYONLY)
configure_file(src/arch/${VDIR}/conv-mach.sh include/ COPYONLY)
configure_file(src/util/ckrescale.h include/ COPYONLY)

add_library(ckrescale src/util/ckrescale.C)

set(CUDA_DIR "")
if(BUILD_CUDA)
set(HIP_DIR "")
if(BUILD_CUDA OR BUILD_HIP)

file(GLOB_RECURSE hybridAPI-h-sources ${CMAKE_SOURCE_DIR}/src/arch/cuda/*.h)
file(GLOB_RECURSE hybridAPI-cxx-sources ${CMAKE_SOURCE_DIR}/src/arch/cuda/*.cpp)
foreach(file ${hybridAPI-h-sources})
configure_file(${file} include/ COPYONLY)
endforeach()

if(CMAKE_VERSION VERSION_GREATER 3.17 OR CMAKE_VERSION VERSION_EQUAL 3.17)
find_package(CUDAToolkit REQUIRED)
set(CMAKE_CUDA_COMPILER "${CUDAToolkit_NVCC_EXECUTABLE}")
enable_language(CUDA)
set(CUDA_DIR "${CUDAToolkit_TARGET_DIR}")
else()
find_package(CUDA REQUIRED)
set(CUDA_DIR "${CUDA_TOOLKIT_ROOT_DIR}")
if (BUILD_CUDA)
if(CMAKE_VERSION VERSION_GREATER 3.17 OR CMAKE_VERSION VERSION_EQUAL 3.17)
find_package(CUDAToolkit REQUIRED)
set(CMAKE_CUDA_COMPILER "${CUDAToolkit_NVCC_EXECUTABLE}")
enable_language(CUDA)
set(CUDA_DIR "${CUDAToolkit_TARGET_DIR}")
else()
find_package(CUDA REQUIRED)
set(CUDA_DIR "${CUDA_TOOLKIT_ROOT_DIR}")
endif()
add_library(hybridapi ${hybridAPI-cxx-sources} $<TARGET_OBJECTS:ckrescale> $<TARGET_OBJECTS:converse>)

if(TRACING)
target_compile_definitions(hybridapi PRIVATE HAPI_TRACE)
endif()
endif()

if (BUILD_HIP)
add_compile_definitions(__HIP_PLATFORM_AMD__)
# Modern ROCm/HIP detection
if(NOT DEFINED ROCM_PATH)
if(NOT DEFINED ENV{ROCM_PATH})
set(ROCM_PATH "/opt/rocm" CACHE PATH "Path to ROCm installation")
else()
set(ROCM_PATH $ENV{ROCM_PATH} CACHE PATH "Path to ROCm installation")
endif()
endif()

# Find hipcc wrapper for reference
find_program(HIP_HIPCC_EXECUTABLE
NAMES hipcc
PATHS "${ROCM_PATH}/bin" "${ROCM_PATH}/hip/bin"
NO_DEFAULT_PATH
)

if(NOT HIP_HIPCC_EXECUTABLE)
message(FATAL_ERROR "Could not find hipcc. Please set ROCM_PATH to your ROCm installation directory.")
endif()

# Find the actual clang compiler used by ROCm (required by CMake)
find_program(CMAKE_HIP_COMPILER
NAMES clang++
PATHS "${ROCM_PATH}/llvm/bin" "${ROCM_PATH}/bin"
NO_DEFAULT_PATH
)

if(NOT CMAKE_HIP_COMPILER)
message(FATAL_ERROR "Could not find ROCm clang++ compiler in ${ROCM_PATH}")
endif()

set(HIP_DIR "${ROCM_PATH}")
set(CMAKE_HIP_ARCHITECTURES "gfx803;gfx900;gfx906;gfx908;gfx90a;gfx1030;gfx1100" CACHE STRING "HIP architectures")

# Enable HIP language support
enable_language(HIP)

add_library(hybridapi ${hybridAPI-cxx-sources})
target_include_directories(hybridapi PRIVATE "${ROCM_PATH}/include")

if(TRACING)
target_compile_definitions(hybridapi PRIVATE HAPI_TRACE)
endif()
endif()
add_library(cudahybridapi ${hybridAPI-cxx-sources})
if(TRACING)
target_compile_definitions(cudahybridapi PRIVATE HAPI_TRACE)

# hapi_memory_daemon - standalone executable for shrink/expand GPU memory management
if(BUILD_CUDA)
add_executable(hapi_memory_daemon src/arch/cuda/hybridAPI/hapi_memory_daemon.cpp)
add_dependencies(hapi_memory_daemon hybridapi ck converse ckqt moduleNDMeshStreamer ckmain memory-default threads-default ldb-rand modulecompletion conv-static)
endif()

endif()

if(EXISTS ${CMAKE_SOURCE_DIR}/src/arch/${VDIR}/conv-mach-cxi.sh)
@@ -902,6 +969,12 @@ if(${CMK_BUILD_CHARMRUN})
add_dependencies(charmrun create_symlinks)
else()
configure_file(${CHARMRUN_DIR}/charmrun ${CMAKE_BINARY_DIR}/bin COPYONLY)
if(EXISTS ${CMAKE_SOURCE_DIR}/${CHARMRUN_ELASTIC_DIR}/charmrun_elastic)
configure_file(${CHARMRUN_ELASTIC_DIR}/charmrun_elastic ${CMAKE_BINARY_DIR}/bin COPYONLY)
endif()
if(EXISTS ${CMAKE_SOURCE_DIR}/${CHARMRUN_HAPI_DIR}/charmrun_hapi)
configure_file(${CHARMRUN_HAPI_DIR}/charmrun_hapi ${CMAKE_BINARY_DIR}/bin COPYONLY)
endif()
endif()
configure_file(src/scripts/testrun bin/ COPYONLY)

@@ -988,7 +1061,11 @@ if(${TARGET} STREQUAL "charm4py")
endif()

if (${BUILD_CUDA})
target_link_libraries(charm cudart cudahybridapi)
target_link_libraries(charm cudart hybridapi)
endif()

if (${BUILD_HIP})
target_link_libraries(charm hiprtc hybridapi)
endif()

if(${TRACING})
@@ -1005,7 +1082,7 @@ else()
# Check that we are able to build and link an executable
add_executable(ckhello ${CMAKE_SOURCE_DIR}/tests/charm++/simplearrayhello/hello.C)
add_dependencies(ckhello ck ldb-none memory-default threads-default conv-static
converse ckmain ckqt
converse ckmain ckqt ckrescale
moduleNDMeshStreamer modulecompletion)
endif()

@@ -1052,7 +1129,7 @@ foreach(l BUILDOPTS CMK_AMPI_WITH_ROMIO CMK_BUILD_PYTHON CMK_CAN_LINK_FORTRAN
CXX_NO_AS_NEEDED LDXX_WHOLE_ARCHIVE_PRE LDXX_WHOLE_ARCHIVE_POST
CMK_MACOSX CMK_POST_EXE CMK_SHARED_SUF CMK_USER_SUFFIX OPTS_LD
CMK_COMPILER_KNOWS_FVISIBILITY CMK_LINKER_KNOWS_UNDEFINED
CMK_SUPPORTS_MEMORY_ISOMALLOC CUDA_DIR CMK_USER_DISABLED_TLS CMK_CXI)
CMK_SUPPORTS_MEMORY_ISOMALLOC CUDA_DIR HIP_DIR CMK_USER_DISABLED_TLS CMK_CXI)
file(APPEND ${optfile_sh} "${l}=\"${${l}}\"\n" )
endforeach(l)

@@ -1089,7 +1166,7 @@ endif()
set(optfile_mak ${CMAKE_BINARY_DIR}/include/conv-mach-opt.mak)

file(WRITE ${optfile_mak} "# Build-time options header for Makefiles, automatically generated by cmake.\n")
foreach(l CUDA_DIR BUILD_CUDA CMK_AMPI_WITH_ROMIO CMK_MACOSX CMK_BUILD_PYTHON
foreach(l CUDA_DIR HIP_DIR BUILD_CUDA BUILD_HIP CMK_AMPI_WITH_ROMIO CMK_MACOSX CMK_BUILD_PYTHON
CMK_CHARMDEBUG CMK_COMPILER CMK_GDIR CMK_HAS_MALLOC_HOOK CMK_HAS_MMAP CMK_LIBJPEG
CMK_LUSTREAPI CMK_MULTICORE CMK_NO_BUILD_SHARED CMK_NO_PARTITIONS CMK_SHARED_SUF
CMK_SMP CMK_SUPPORTS_FSGLOBALS CMK_SUPPORTS_PIPGLOBALS CMK_SUPPORTS_PIEGLOBALS
@@ -1102,7 +1179,8 @@ endforeach(l)

# Add options
set(CUDA ${BUILD_CUDA}) # need CUDA to match conv-mach file name
foreach(opt SMP OMP TCP PTHREADS SYNCFT PXSHM PERSISTENT OOC CUDA PAPI CXI)
set(HIP ${BUILD_HIP}) # need HIP to match conv-mach file name
foreach(opt SMP OMP TCP PTHREADS SYNCFT PXSHM PERSISTENT OOC CUDA HIP PAPI CXI)
if(${opt})
string(TOLOWER ${opt} optl)
file(APPEND ${optfile_sh} ". ${CMAKE_BINARY_DIR}/include/conv-mach-${optl}.sh\n")
25 changes: 25 additions & 0 deletions Dockerfile
@@ -0,0 +1,25 @@
FROM mpioperator/openmpi

RUN apt update && apt install -y build-essential zlib1g-dev ca-certificates cmake git

RUN apt update \
&& apt install -y --no-install-recommends \
g++ \
gfortran \
libopenmpi-dev \
&& rm -rf /var/lib/apt/lists/*

#RUN git clone https://github.com/charmplusplus/charm.git
RUN mkdir /home/mpiuser/charm
COPY . /home/mpiuser/charm
RUN cd charm && git checkout shrinkexpand-mpi && ./build charm++ mpi-linux-x86_64 --enable-shrinkexpand -j8 --force --with-production

RUN cd charm/examples/charm++/shrink_expand && make clean && make
RUN cd charm/examples/charm++/shrink_expand/jacobi2d-iter && make clean && make
RUN cd charm/examples/charm++/shrink_expand/startup && make clean && make
RUN mkdir /app
RUN cp charm/examples/charm++/shrink_expand/jacobi2d-iter/charmrun /app/
RUN cp charm/examples/charm++/shrink_expand/jacobi2d-iter/charmrun_elastic /app/
RUN cp charm/examples/charm++/shrink_expand/jacobi2d-iter/jacobi2d /app/
RUN cp charm/examples/charm++/shrink_expand/startup/startup /app/
RUN chmod 777 /app
5 changes: 5 additions & 0 deletions buildcmake
@@ -103,6 +103,7 @@ opt_ccs=0
opt_charmdebug=0
opt_controlpoint=0
opt_cuda=0
opt_hip=0
opt_destination=""
opt_disabletls=0
opt_install_prefix=""
@@ -176,6 +177,9 @@ function parse_platform_compilers() {
cuda)
opt_cuda=1
;;
hip)
opt_hip=1
;;
cxi)
opt_cxi=1
;;
@@ -648,6 +652,7 @@ CC=$opt_CC CXX=$opt_CXX FC=$opt_FC cmake "$my_srcdir" \
-DCHARMDEBUG="$opt_charmdebug" \
-DCONTROLPOINT="$opt_controlpoint" \
-DBUILD_CUDA="$opt_cuda" \
-DBUILD_HIP="$opt_hip" \
-DDISABLE_TLS="$opt_disabletls" \
-DDRONE_MODE="$opt_drone_mode" \
-DENABLE_FORTRAN=$opt_enable_fortran \
4 changes: 3 additions & 1 deletion cmake/converse.cmake
@@ -223,8 +223,10 @@ add_library(converse
${tmgr-h-sources}
${hwloc-objects}
${all-ci-outputs}

$<TARGET_OBJECTS:ckrescale>
)
add_dependencies(converse hwloc)
add_dependencies(converse hwloc ckrescale)

foreach(filename
${conv-core-h-sources}
3 changes: 2 additions & 1 deletion doc/charm++/manual.rst
@@ -9314,7 +9314,8 @@ This entry method should be invoked on the sender by wrapping the
source buffer with ``CkDeviceBuffer``, whose constructor takes a pointer
to the source buffer, a Charm++ callback to be invoked once the transfer
completes (optional), and a CUDA stream associated with the transfer
(which is only used internally in the CUDA memcpy and IPC based implementation and is also optional):
(which is only used internally in the CUDA memcpy and IPC based implementation and is also optional).
The user must ensure that the GPU buffer is not modified until the callback is invoked:

.. code-block:: c++

9 changes: 7 additions & 2 deletions examples/ampi/Cjacobi3D/Makefile
@@ -1,7 +1,7 @@
-include ../../common.mk
-include ../../../include/conv-mach-opt.mak
CHARMBASE=../../../
CHARMC=../../../bin/ampicxx $(OPTS)
CHARMC=../../../netlrts-linux-x86_64/bin/ampicxx $(OPTS)
TOKENS=6

-include $(CHARMBASE)/include/conv-mach-opt.mak
@@ -12,6 +12,7 @@ AMPI_TARGETS := \
jacobi \
jacobi.pup \
jacobi-get \
jacobi.pie

ifeq (1,$(CMK_SUPPORTS_TLSGLOBALS))
AMPI_TARGETS += jacobi.tls
@@ -47,6 +48,10 @@ jacobi.tls: jacobi.C
$(CHARMC) -c -tlsglobals jacobi.C -o jacobi.tls.o
$(CHARMC) -o jacobi.tls jacobi.tls.o -tlsglobals

jacobi.pie: jacobi-pie.C
$(CHARMC) -c -pieglobals jacobi-pie.C -o jacobi.pie.o
$(CHARMC) -o jacobi.pie jacobi.pie.o -pieglobals

jacobi.rose: jacobi.C
$(CHARMC) -roseomptlsglobals -o jacobi.rose.o -c $<
$(CHARMC) -roseomptlsglobals -o $@ jacobi.rose.o
@@ -93,5 +98,5 @@ endif


clean:
rm -f *.o jacobi *~ moduleinit.C charmrun conv-host jacobi-cpp jacobi.iso jacobi-get jacobi.tls ampirun
rm -f *.o jacobi *~ moduleinit.C charmrun conv-host jacobi-cpp jacobi.iso jacobi-get jacobi.tls jacobi.pie ampirun
rm -rf 40 80 120