Conversation
|
Array API standard conformance tests for dpnp=0.18.0dev2=py312he4f9c94_48 ran successfully. |
|
View rendered docs @ https://intelpython.github.io/dpnp/index.html |
scripts/build_locally.py
Outdated
| if not arch: | ||
| raise ValueError("--arch is required when --target=hip") | ||
| cmake_args += [ | ||
| "-DDPNP_TARGET_HIP=ON", |
There was a problem hiding this comment.
For what do we need to define two variables? Can it be combined in a single one, like in dpctl: -DDPNP_TARGET_HIP={arch}?
There was a problem hiding this comment.
Additionally, --target=cuda is current dpnp approach, but:
- dpctl and dpnp should consider supporting targeting specific CUDA architectures
--target=hipmeans that there is no way to build simultaneously for HIP and CUDA (which is very, very much an edge case, but should be considered)
For these reasons, I think it is most sensible to move away from --target= universal approach to --target-cuda= and --target-hip= or something to that effect
There was a problem hiding this comment.
@ndgrigorian it is a great suggestion.
I have added support for --target-hip and I am going to add --target-cuda instead of --target in the next PR.
Thanks
| "Build DPNP with oneMKL Interfaces" | ||
| OFF | ||
| ) | ||
| set(HIP_TARGETS "" CACHE STRING "HIP architecture for target") |
There was a problem hiding this comment.
I assume there is no support for multiple values:
| set(HIP_TARGETS "" CACHE STRING "HIP architecture for target") | |
| set(HIP_TARGET "" CACHE STRING "HIP architecture for target") |
There was a problem hiding this comment.
At some point, it was clear in docs that only one architecture was supported at a time, but now it isn't as clear and should be tested
Also, there is new information in the extension guide
The compiler driver also offers alias targets for each target+architecture pair to make the command line shorter and easier to understand for humans. Thanks to the aliases, the -Xsycl-target-backend flags no longer need to be specified.
It shows that the command
icpx -fsycl -fsycl-targets=spir64_gen,amdgcn-amd-amdhsa,nvptx64-nvidia-cuda \
-Xsycl-target-backend=spir64_gen '-device pvc' \
-Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1030 \
-Xsycl-target-backend=nvptx64-nvidia-cuda --offload-arch=sm_80 \
-o sycl-app sycl-app.cpp
is equivalent to
icpx -fsycl -fsycl-targets=intel_gpu_pvc,amd_gpu_gfx1030,nvidia_gpu_sm_80 \
-o sycl-app sycl-app.cpp
so maybe both dpctl and dpnp can simplify by removing the need for -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=[X] completely
list of aliases:
https://intel.github.io/llvm/UsersManual.html
There was a problem hiding this comment.
Aliases list seems to claim only one alias is supported at a time. So probably only one architecture at once is possible? That would be my guess
There was a problem hiding this comment.
I assume there is no support for multiple values:
I am using HIP_TARGETS instead of HIP_TARGET because oneMath requires HIP_TARGETS to be defined
https://github.com/uxlfoundation/oneMath/blob/4ad4dfb5db834117248ad5f8fbded5cfc1097005/CMakeLists.txt#L73
There was a problem hiding this comment.
Is it only declaration of cmake variable which doesn't impact oneMath, isn't that?
There was a problem hiding this comment.
Is it only declaration of cmake variable which doesn't impact oneMath, isn't that?
According to OneMath documentation HIP_TARGETS must be set for ROCm builds
There was a problem hiding this comment.
At some point, it was clear in docs that only one architecture was supported at a time, but now it isn't as clear and should be tested
Also, there is new information in the extension guide
The compiler driver also offers alias targets for each target+architecture pair to make the command line shorter and easier to understand for humans. Thanks to the aliases, the -Xsycl-target-backend flags no longer need to be specified.
It shows that the command
icpx -fsycl -fsycl-targets=spir64_gen,amdgcn-amd-amdhsa,nvptx64-nvidia-cuda \ -Xsycl-target-backend=spir64_gen '-device pvc' \ -Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=gfx1030 \ -Xsycl-target-backend=nvptx64-nvidia-cuda --offload-arch=sm_80 \ -o sycl-app sycl-app.cppis equivalent to
icpx -fsycl -fsycl-targets=intel_gpu_pvc,amd_gpu_gfx1030,nvidia_gpu_sm_80 \ -o sycl-app sycl-app.cppso maybe both dpctl and dpnp can simplify by removing the need for
-Xsycl-target-backend=amdgcn-amd-amdhsa --offload-arch=[X]completelylist of aliases: https://intel.github.io/llvm/UsersManual.html
This is a great suggestion.
- The compiler supports more than one target
- Using aliases greatly simplifies the logic
In this PR I will implement using aliases for AMD,
In the next PR I will suggest an update for CUDA
antonwolfy
left a comment
There was a problem hiding this comment.
Thank you @vlad-perevezentsev, in overall LGTM
This PR updates `СMakeLists` files and `build_locally.py` to enable building dpnp for AMD targets. To build dpnp on AMD: ``` python scripts/build_locally.py --target-hip=gfx90a ``` To find the architecture, use ``` rocminfo | grep 'Name: *gfx.*' ``` fdf9ba7
This PR updates
СMakeListsfiles andbuild_locally.pyto enable building dpnp for AMD targets.To build dpnp on AMD:
To find the architecture, use