@davidrohr
Collaborator

@mconcas : Sorry this took so long :(.
This adds a GPUCommonChkErr.h header, providing my GPUChkErr macros.
Currently, they must be invoked on a GPUReconstruction object.
The reason is that I need to dispatch to the backend to get the error codes, so I could not easily make them static.
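
To illustrate why the check is tied to an object, here is a minimal sketch of the pattern, not the actual O2 implementation. The names `RecoBase`, `FakeBackendReco`, `CHK_ERR` and `CHK_ERR_I` are hypothetical, and two things are assumed: that a code of 0 means success, and that the trailing boolean selects between failing hard and only reporting the error.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a reconstruction object; this is not the O2 class.
class RecoBase
{
 public:
  virtual ~RecoBase() = default;

  // Returns 0 if the code signals success (assumed to be 0), 1 otherwise.
  // With failOnError set, a failure throws instead of returning.
  int checkError(std::int64_t code, const char* file, int line, bool failOnError) const
  {
    assert(file != nullptr);
    if (code == 0) {
      return 0;
    }
    // Only the concrete backend knows how to turn its code into text,
    // which is why the check needs an object to dispatch on.
    const std::string msg = std::string(backendName()) + " error " + std::to_string(code) +
                            " (" + errorString(code) + ") at " + file + ":" + std::to_string(line);
    if (failOnError) {
      throw std::runtime_error(msg);
    }
    std::fprintf(stderr, "%s\n", msg.c_str());
    return 1;
  }

 protected:
  virtual const char* backendName() const = 0;
  virtual const char* errorString(std::int64_t code) const = 0;
};

// Toy backend with a made-up error table.
class FakeBackendReco : public RecoBase
{
 protected:
  const char* backendName() const override { return "FakeBackend"; }
  const char* errorString(std::int64_t code) const override
  {
    return code == 2 ? "out of memory" : "unknown error";
  }
};

// The macros only add the call site; the object does the decoding.
#define CHK_ERR(rec, x) (rec).checkError((x), __FILE__, __LINE__, true)
#define CHK_ERR_I(rec, x) (rec).checkError((x), __FILE__, __LINE__, false)
```

With a `FakeBackendReco rec;` in scope, `CHK_ERR_I(rec, status)` would report a failure and return 1, while `CHK_ERR(rec, status)` would throw.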

Would this work for you?

I could probably create a static version, which would automatically detect which GPUReconstruction instance is created first, attach to it, and dispatch to it.
That should in principle cover all use cases, except when someone uses multiple backends in the same application, which is supported in principle.
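
A rough sketch of that "attach to the first instance" idea could look as follows. This is an assumption about how such a scheme might be built, not the code in this PR: the class `Reco`, its members, and the macros `CHK_ERR_S` and `CHK_ERR_SI` are hypothetical, and it needs C++17 for the inline static member.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical class; the first live instance becomes the static dispatch target.
class Reco
{
 public:
  explicit Reco(const char* backend) : mBackend(backend)
  {
    // First one wins: attach only if nothing is registered yet.
    Reco* expected = nullptr;
    sFirst.compare_exchange_strong(expected, this);
  }
  ~Reco()
  {
    // Detach only if this instance is the registered one.
    Reco* self = this;
    sFirst.compare_exchange_strong(self, nullptr);
  }
  Reco(const Reco&) = delete;
  Reco& operator=(const Reco&) = delete;

  // Returns 0 on success (code 0 assumed), 1 on a reported failure,
  // and throws when failOnError is set.
  int checkError(std::int64_t code, const char* file, int line, bool failOnError) const
  {
    if (code == 0) {
      return 0;
    }
    if (failOnError) {
      throw std::runtime_error(std::string(mBackend) + " error " + std::to_string(code) +
                               " at " + file + ":" + std::to_string(line));
    }
    return 1;
  }

  // Static entry point: forwards to whichever instance attached first.
  static int checkErrorStatic(std::int64_t code, const char* file, int line, bool failOnError)
  {
    const Reco* rec = sFirst.load();
    if (rec == nullptr) {
      throw std::logic_error("no reconstruction instance registered");
    }
    return rec->checkError(code, file, line, failOnError);
  }

  static const Reco* attached() { return sFirst.load(); }

 private:
  const char* mBackend;
  static inline std::atomic<Reco*> sFirst{nullptr};
};

#define CHK_ERR_S(x) Reco::checkErrorStatic((x), __FILE__, __LINE__, true)
#define CHK_ERR_SI(x) Reco::checkErrorStatic((x), __FILE__, __LINE__, false)
```

The limitation mentioned above shows up directly in this sketch: if a second `Reco` with a different backend is created, its error codes would still be decoded by the first instance.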

@github-actions
Contributor

REQUEST FOR PRODUCTION RELEASES:
To request your PR to be included in production software, please add the corresponding labels called "async-" to your PR. Add the labels directly (if you have the permissions) or add a comment of the form (note that labels are separated by a ",")

+async-label <label1>, <label2>, !<label3> ...

This adds <label1> and <label2> and removes <label3>.

The following labels are available
async-2023-pbpb-apass4
async-2023-pp-apass4
async-2024-pp-apass1
async-2022-pp-apass7
async-2024-pp-cpass0
async-2024-PbPb-apass1
async-2024-ppRef-apass1
async-2024-PbPb-apass2
async-2023-PbPb-apass5

@alibuild
Collaborator

Error while checking build/O2/fullCI_slc9 for 061b880 at 2025-03-14 09:52:

## sw/BUILD/O2-latest/log
ld.lld: error: undefined symbol: o2::gpu::GPUReconstructionHIPBackend::GPUChkErrStatic(long, char const*, int)
clang++: error: linker command failed with exit code 1 (use -v to see invocation)
ninja: build stopped: subcommand failed.

Full log here.

@davidrohr davidrohr changed the title GPU: Provide general GPUChkErr functionality also externally [WIP] GPU: Provide general GPUChkErr functionality also externally Mar 14, 2025
@davidrohr davidrohr changed the title [WIP] GPU: Provide general GPUChkErr functionality also externally GPU: Provide general GPUChkErr functionality also externally and several unrelated changes Mar 14, 2025
// Please #include "GPUReconstruction.h" in your code, if you use these 2!
#define GPUChkErr(x) GPUChkErrA(x, __FILE__, __LINE__, true)
#define GPUChkErrI(x) GPUChkErrA(x, __FILE__, __LINE__, false)
#define GPUChkErrS(x) o2::gpu::internal::GPUReconstructionChkErr(x, __FILE__, __LINE__, true)
Collaborator Author

@mconcas : You should be able to use GPUChkErrS and GPUChkErrSI statically, if you link against the external provider.

Collaborator

Thx, I'll try it!

@davidrohr
Collaborator Author

@ktf @singiamtel : Could you check the macOS builder that ran for this PR? This error seems pretty weird:


Call Stack (most recent call first):
  dependencies/CMakeLists.txt:13 (include)
  CMakeLists.txt:65 (include)
This warning is for project developers.  Use -Wno-dev to suppress it.

-- Input BUILD_SIMULATION=
CMake Error at dependencies/O2Dependencies.cmake:161 (include):
  include could not find requested file:

@ktf
Member

ktf commented Mar 15, 2025

Yes, it does indeed look quite strange. I assume the issue is:

[0/1] Re-running CMake...
CMake Error: The source directory "/Volumes/build/alice-ci-workdir/o2/sw/SOURCES/O2/14062/0" does not exist.
Specify --help for usage, or press the help button on the CMake GUI.

I am looking into it.

@ktf
Member

ktf commented Mar 15, 2025

I restarted the CI script on the builder and it seems to work. I suspect a recent change in upstream Python might have left the SOURCES area in a bad state, and while @singiamtel fixed the issue in alisw/alibuild#913, not all the builders were restarted from a clean slate.

@ktf
Member

ktf commented Mar 15, 2025

@singiamtel I have the impression there is still something wrong with the mac builders.

@ktf
Member

ktf commented Mar 15, 2025

Ok, after another cleanup of the build machine the error seems to be a genuine complaint about a missing:

dependencies/O2SimulationDependencies.cmake

Any idea how that might happen?

@davidrohr
Collaborator Author

Now one of the Mac CIs passed O2, and then just stopped.
The other failed with CMake Error: The source directory "/Volumes/build/alice-ci-workdir/o2/sw/SOURCES/O2/14062/0" does not exist.

Do you think these issues are related to my PR? Or is this a general problem with the Mac CI?

@davidrohr davidrohr merged commit 154ffd4 into AliceO2Group:dev Mar 17, 2025
9 of 12 checks passed
4 participants