GPU stream implementation for ONNX runtime #14117
Merged
davidrohr
merged 51 commits into
AliceO2Group:dev
from
ChSonnabend:onnx_gpu_implementation
Apr 20, 2025
Commits
84eac06
Initial set of bug fixes and cosmetic changes
ChSonnabend 2191649
Please consider the following formatting changes
alibuild 5be779c
Merge pull request #18 from alibuild/alibot-cleanup-14069
ChSonnabend b742c50
Adjusting eval sizes. Makes code neater and avoids some calculations
ChSonnabend c0bc918
Merge branch 'dev' into gpu_clusterizer_bug_fixes
ChSonnabend 0c1cfb7
Adding separate functions. Now the host process only needs one instan…
ChSonnabend 83c004f
First version of CCDB implementation
ChSonnabend d767ed1
Working CCDB API calls (tested with test-ccdb)
ChSonnabend ad4b22b
Improve fetching, but have to pass settings by value, not const ref
ChSonnabend 81c646b
Using const ref and moving CCDB calls to host initialization
ChSonnabend 566ddb7
Simplifications and renaming
ChSonnabend a9c33b5
Please consider the following formatting changes
alibuild 0ed7d25
Merge pull request #19 from alibuild/alibot-cleanup-14069
ChSonnabend 9037ea6
First version of GPU stream implementation. Still needs testing.
ChSonnabend 64c19d5
Fixes
ChSonnabend 8a5bb69
Please consider the following formatting changes
alibuild e657928
Merge pull request #20 from alibuild/alibot-cleanup-14117
ChSonnabend 46fb1e1
Adding the lane variable. This PR will in any case conflict with #14069
ChSonnabend 70320c3
Compiles on EPNs. Need to add shadow processors next. But for this, I…
ChSonnabend 3174e39
Merge branch 'gpu_clusterizer_bug_fixes' into onnx_gpu_implementation
ChSonnabend 9d9267f
Adding shadow instance. Not sure if this correctly allocates GPU memo…
ChSonnabend 007a4a1
This runs, but will eventually fill up the VRAM. Need to include a me…
ChSonnabend 4ef35fc
Found the stream allocation issue. Now starting optimizations
ChSonnabend 4faaa4a
Improve readability and adapt for some comments
ChSonnabend 2801c2e
Fixing memory assignment issue. Reconstruction runs through with FP32…
ChSonnabend 1dcb1da
Major reworkings to add FP16 support
ChSonnabend 7da3793
Merge branch 'dev' into onnx_gpu_implementation
ChSonnabend 381955a
Bug-fixes
ChSonnabend 19b5bd5
Improved data filling speeds by factor 3
ChSonnabend 83d0257
Limiting threads for ONNX evaluation
ChSonnabend fff6dc3
Bug-fix for correct thread assignment and input data filling
ChSonnabend b437e38
Minor changes
ChSonnabend 710993a
Adding I** inference, potentially needed for CNN + FC inference
ChSonnabend 77c1691
CCDB fetching of NNs ported to GPUWorkflowSpec
ChSonnabend a985798
Adjusting CPU threads and ORT compile definitions
ChSonnabend fb08f18
About 10x speed-up due to explicit io binding
ChSonnabend b1c88f0
Changes for synchronization and consistency. No performance loss.
ChSonnabend 32cab70
Please consider the following formatting changes
alibuild 5f741fc
Merge pull request #21 from alibuild/alibot-cleanup-14117
ChSonnabend 70907aa
Fixing warnings (errors due to size_t)
ChSonnabend e46cdfa
Fixing linker issues
ChSonnabend 37955fa
Merge branch 'dev' into onnx_gpu_implementation
ChSonnabend 4b0825a
Adding volatile memory allocation and MockedOrtAllocator. Removing pr…
ChSonnabend 497a9d4
Please consider the following formatting changes
alibuild aabddb7
Merge pull request #22 from alibuild/alibot-cleanup-14117
ChSonnabend cfdc15f
Merge dev + fixes
ChSonnabend a67b634
Circumvent "unused result" warning and build failure
ChSonnabend 938a1ed
Adjust for comments
ChSonnabend 7b07496
Please consider the following formatting changes
alibuild 4d3f54d
Merge pull request #23 from alibuild/alibot-cleanup-14117
ChSonnabend af89c9a
Fixing build flags
ChSonnabend
Instead of setting 0/1 definitions, I would set only the `=1` definition, and only if the CMake variable is set.
Then, in the code further below, you don't need `#if defined(FOO) && FOO == 1`; you can simply use `#ifdef FOO`.