-
Notifications
You must be signed in to change notification settings - Fork 244
Add Sirius #711
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add Sirius #711
Conversation
|
@Yifei-yang7 Thanks - looking forward to merging this. Before doing so, I had a question. Basically all ClickBench submissions are tested on AWS EC2 instances. This submission uses lambda.ai. Running Sirius on EC2 would make it easier to compare against other databases (also, my EC2 account is covered by my company whereas lambda.ai asks me for my credit card). Would you consider submitting results for a EC2 instance? They provide various P* boxes with Nvidia GPUs. Perhaps the compilation script needs some changes as well. |
|
@rschu1ze Thanks for your quick response! |
|
@Yifei-yang7 For sure, however long/mid-term I think that AWS will offer GH200 instances as well. What about this: you test that script runs on EC2 as well (H100) and we submit both entries - this will be more future-proof going forward. |
|
@rschu1ze I just tried to launch an H100 instance on EC2 but kept getting "insufficient capacity". It seems to be much harder to reserve an instance than Lambda. |
|
@rschu1ze I managed to reserve a p5.4xlarge instance on EC2 and confirmed that the benchmark script runs correctly there. I’ve added the new results to this PR. Let me know if there’s anything else you’d like me to update. Thanks. |
|
|
|
@rschu1ze We need to use AMIs with cuda/nvcc already installed, for example, |
|
Okay, thanks, I'll retry. @Yifei-yang7 Would you add this info to a file sirius/README, so the next person who tries to reproduce the measurement will not make the same mistake? |
Now it is failing with another error ^^. Duckdb seems not built by the scripts. |
|
@rschu1ze Thanks for your suggestion! I added a README to incorporate this instruction. For the issue in your run, it seems something weird happed when installing cmake 3.30.4. |
|
@rschu1ze and I saw this in your log, are you using the same image of Btw I was launching instances from us-east-1/us-east2, unsure if this affects. |
|
Thanks @Yifei-yang7 . My Ubuntu-native CMake was too old, so benchmark.sh removed it and installed its own version ... and this mechanism somehow didn't work. I installed the latest and greatest CMake manually from cmake.org. Then the next error came up: It somehow doesn't find standard library headers. Any clue what happened here? Perhaps it is easier to provide pre-build binaries in the script instead of compiling from scratch? |
|
@rschu1ze Hey I guess somehow the c compiler in your machine is broken, which cannot perform the basic build, similar to why the cmake source cannot be built. I saw you are using while by default a fresh instance will use May I know if you did some config on the compiler to use, since from the log the compiler used seems to be different from directly running the script from a fresh instance? Would you like to try a fresh instance (like g6.4x) and just run the script? Thanks! And for the cmake version, by default the system installed cmake is too old, and the benchmark script contains automatically detecting and upgrading to the required version. So people don't need to manually install themselves. |
| @@ -0,0 +1,57 @@ | |||
| { | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My local measurements:
[0.027,0.001,0.001],
[1.644,0.002,0.001],
[3.918,0.003,0.002],
[3.466,0.002,0.002],
[3.033,0.005,0.005],
[6.675,0.008,0.008],
[1.299,0.002,0.001],
[1.662,0.003,0.002],
[5.261,0.092,0.093],
[8.575,0.098,0.097],
[5.723,0.015,0.015],
[7.033,0.015,0.015],
[6.066,0.071,0.069],
[9.897,0.193,0.191],
[7.185,0.075,0.074],
[3.572,0.012,0.011],
[9.687,0.124,0.123],
[9.711,0.024,0.024],
[15.172,0.166,0.164],
[3.317,0.002,0.001],
[36.475,0.041,0.040],
[48.522,0.012,0.012],
[71.297,0.031,0.032],
[220.898,0.065,0.066],
[10.418,0.008,0.008],
[..]
(I canceled the run a bit early because I forgot to run the test in tmux, it's bedtime for me, and I don't want to sleep next to my work laptop turned on).
Numbers look very close to yours 👍
sirius/results/p5.4xlarge.json
Outdated
| "cluster_size": 1, | ||
| "proprietary": "no", | ||
| "tuned": "no", | ||
| "tags": ["C++","GPU-accelerated","column-oriented","embedded"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Three minor requests before I merge this PR:
In #714 and #715, I added a hardware tag. Would you please
- merge from upstream (so these two PRs become visible here),
- adjust both result json files (insert hardware tag = gpu),
- add a template.json file, see e.g. clickhouse/template.json for an example. This file is used by our internal automation to re-generate result files
Thanks.
(couldn't do the changes by myself as I don't have permissions to push to your ClickHouse fork repository:
/data/ClickBench (sirius >) $ g pu sirius-db sirius
remote: Permission to sirius-db/ClickBench.git denied to rschu1ze.
fatal: unable to access 'https://github.com/sirius-db/ClickBench.git/': The requested URL returned error: 403)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your help! I adjusted the PR as suggested.

This PR adds Sirius, a GPU-native SQL engine which can provide drop-in acceleration for DuckDB.
The structure and files are similar to DuckDB, and the changes are mainly results and benchmarking script. The script is tested on the lambda-GH200 instance.
Please let me know if you need any adjustments or additional artifacts. Thanks!