@ChSonnabend
Collaborator

No description provided.

@github-actions
Contributor

REQUEST FOR PRODUCTION RELEASES:
To request that your PR be included in production software, add the corresponding "async-" labels to your PR. Add the labels directly (if you have the permissions) or add a comment of the following form (labels are separated by ","):

+async-label <label1>, <label2>, !<label3> ...

This will add <label1> and <label2> and remove <label3>.

The following labels are available:
async-2023-pbpb-apass4
async-2023-pp-apass4
async-2024-pp-apass1
async-2022-pp-apass7
async-2024-pp-cpass0
async-2024-PbPb-apass1
async-2024-ppRef-apass1
async-2024-PbPb-apass2
async-2023-PbPb-apass5
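
As an illustration of the comment format described above, here is a minimal Python sketch of how such a comment could be split into labels to add and labels to remove. The function name `parse_async_label_comment` is hypothetical, and this is not the bot's actual implementation; it only assumes the rules stated in the comment (comma-separated labels, a leading "!" marks a label for removal).

```python
def parse_async_label_comment(comment: str) -> tuple[list[str], list[str]]:
    """Split a '+async-label' comment into (labels to add, labels to remove).

    Hypothetical helper: assumes comma-separated labels, with a leading '!'
    marking a label that should be removed.
    """
    prefix = "+async-label"
    text = comment.strip()
    if not text.startswith(prefix):
        raise ValueError("not an async-label comment")

    to_add: list[str] = []
    to_remove: list[str] = []
    for item in text[len(prefix):].split(","):
        label = item.strip()
        if not label:
            continue  # skip empty entries, e.g. from a trailing comma
        if label.startswith("!"):
            to_remove.append(label[1:].strip())
        else:
            to_add.append(label)
    return to_add, to_remove


if __name__ == "__main__":
    add, remove = parse_async_label_comment(
        "+async-label async-2024-pp-apass1, async-2023-pp-apass4, !async-2022-pp-apass7"
    )
    print("add:", add)
    print("remove:", remove)
```

The example labels are taken from the list above; any validation against the set of available labels is left out of the sketch.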

@ChSonnabend changed the title from "Bug fixes + cosmetic changes" to "TPC NN Clusterization: CCDB support + cosmetic changes" on Mar 25, 2025
Please consider the following formatting changes to AliceO2Group#14069
ChSonnabend added a commit to ChSonnabend/AliceO2 that referenced this pull request Mar 29, 2025
… will merge AliceO2Group#14069 to have the changes in GPUChainTrackingClusterizer.
davidrohr pushed a commit that referenced this pull request Apr 20, 2025
* Initial set of bug fixes and cosmetic changes

* Please consider the following formatting changes

* Adjusting eval sizes. Makes code neater and avoids some calculations

* Adding separate functions. Now the host process only needs one instance and one initialization

* First version of CCDB implementation

* Working CCDB API calls (tested with test-ccdb)

* Improve fetching, but have to pass settings by value, not const ref

* Using const ref and moving CCDB calls to host initialization

* Simplifications and renaming

* Please consider the following formatting changes

* First version of GPU stream implementation. Still needs testing.

* Fixes

* Please consider the following formatting changes

* Adding the lane variable. This PR will in any case conflict with #14069

* Compiles on EPNs. Need to add shadow processors next. But for this, I will merge #14069 to have the changes in GPUChainTrackingClusterizer.

* Adding shadow instance. Not sure if this correctly allocates GPU memory using AllocateRegisteredMemory

* This runs, but will eventually fill up the VRAM. Need to include a mem clean

* Found the stream allocation issue. Now starting optimizations

* Improve readability and adapt for some comments

* Fixing memory assignment issue. Reconstruction runs through with FP32 networks

* Major reworkings to add FP16 support

* Bug-fixes

* Improved data filling speed by a factor of 3

* Limiting threads for ONNX evaluation

* Bug-fix for correct thread assignment and input data filling

* Minor changes

* Adding I** inference, potentially needed for CNN + FC inference

* CCDB fetching of NNs ported to GPUWorkflowSpec

* Adjusting CPU threads and ORT compile definitions

* About 10x speed-up due to explicit io binding

* Changes for synchronization and consistency. No performance loss.

* Please consider the following formatting changes

* Fixing warnings (errors due to size_t)

* Fixing linker issues

* Adding volatile memory allocation and MockedOrtAllocator. Removing print statements and time measurements

* Please consider the following formatting changes

* Circumvent "unused result" warning and build failure

* Adjust for comments

* Please consider the following formatting changes

* Fixing build flags

---------

Co-authored-by: ALICE Action Bot <alibuild@cern.ch>
@ChSonnabend ChSonnabend deleted the gpu_clusterizer_bug_fixes branch July 19, 2025 09:05
