TPC NN Clusterization: CCDB support + cosmetic changes #14069
Please consider the following formatting changes to AliceO2Group#14069
…ce and one initialization
Please consider the following formatting changes to AliceO2Group#14069
… will merge AliceO2Group#14069 to have the changes in GPUChainTrackingClusterizer.
* Initial set of bug fixes and cosmetic changes
* Please consider the following formatting changes
* Adjusting eval sizes. Makes code neater and avoids some calculations
* Adding separate functions. Now the host process only needs one instance and one initialization
* First version of CCDB implementation
* Working CCDB API calls (tested with test-ccdb)
* Improve fetching, but have to pass settings by value, not const ref
* Using const ref and moving CCDB calls to host initialization
* Simplifications and renaming
* Please consider the following formatting changes
* First version of GPU stream implementation. Still needs testing.
* Fixes
* Please consider the following formatting changes
* Adding the lane variable. This PR will in any case conflict with #14069
* Compiles on EPNs. Need to add shadow processors next. But for this, I will merge #14069 to have the changes in GPUChainTrackingClusterizer.
* Adding shadow instance. Not sure if this correctly allocates GPU memory using AllocateRegisteredMemory
* This runs, but will eventually fill up the VRAM. Need to include a mem clean
* Found the stream allocation issue. Now starting optimizations
* Improve readability and adapt for some comments
* Fixing memory assignment issue. Reconstruction runs through with FP32 networks
* Major reworkings to add FP16 support
* Bug fixes
* Improved data filling speeds by a factor of 3
* Limiting threads for ONNX evaluation
* Bug fix for correct thread assignment and input data filling
* Minor changes
* Adding I** inference, potentially needed for CNN + FC inference
* CCDB fetching of NNs ported to GPUWorkflowSpec
* Adjusting CPU threads and ORT compile definitions
* About 10x speed-up due to explicit I/O binding
* Changes for synchronization and consistency. No performance loss.
* Please consider the following formatting changes
* Fixing warnings (errors due to size_t)
* Fixing linker issues
* Adding volatile memory allocation and MockedOrtAllocator. Removing print statements and time measurements
* Please consider the following formatting changes
* Circumvent "unused result" warning and build failure
* Adjust for comments
* Please consider the following formatting changes
* Fixing build flags

---------

Co-authored-by: ALICE Action Bot <alibuild@cern.ch>
No description provided.