Make index blocksize flexible #248
base: main
Hi!
Instead, I would propose to set up the block size in a
rfsaliev
left a comment
Please also read my standalone comment on the PR.
    #include <cstddef>

    #include <svs/lib/exception.h>
    #include <svs/lib/misc.h>
Please avoid SVS internals dependencies in SVS Runtime
done
    public:
        explicit IndexBlockSize(size_t blocksize_exp) {
            if (blocksize_exp > kMaxBlockSizeExp) {
                throw ANNEXCEPTION("Blocksize is too large!");
SVS runtime API should be noexcept
the IndexBlockSize structure is removed
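The noexcept requirement above can be illustrated with a small sketch. This is not the SVS Runtime implementation: `Status`, `error_wrapper_sketch`, and `set_blocksize_exp_sketch` are hypothetical names, and the PR's own `runtime_error_wrapper` is not shown in this thread. The idea is that a throwing check runs inside a wrapper that turns any exception into a returned status.

```cpp
#include <cstddef>
#include <exception>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical status type; the real SVS Runtime status type may differ.
struct Status {
    int code = 0;        // 0 means success
    std::string message; // empty on success
    bool ok() const noexcept { return code == 0; }
};

// Runs `fn` and converts any escaping exception into a Status.
// Note: building the message may itself allocate; a production wrapper
// would need to decide how to handle allocation failure at that point.
template <typename Fn>
Status error_wrapper_sketch(Fn&& fn) noexcept {
    try {
        std::forward<Fn>(fn)();
        return Status{};
    } catch (const std::exception& e) {
        return Status{1, e.what()};
    } catch (...) {
        return Status{2, "unknown error"};
    }
}

// Hypothetical noexcept entry point: validation may throw internally,
// but nothing escapes to the caller. `out` is written only on success.
inline Status set_blocksize_exp_sketch(
    std::size_t exp, std::size_t max_exp, std::size_t* out
) noexcept {
    return error_wrapper_sketch([&] {
        if (exp > max_exp) {
            throw std::invalid_argument("Blocksize is too large!");
        }
        *out = exp;
    });
}
```

With this shape, the caller inspects `Status::ok()` instead of catching exceptions, which keeps the public surface free of throwing constructors.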
    return runtime_error_wrapper([&] {
        svs::data::ConstSimpleDataView<float> data{x, n, impl_->dimensions()};
        std::span<const size_t> lbls(labels, n);
        impl_->add(data, lbls, blocksize);
Please convert the SVS Runtime API data type (IndexBlockSize) to the SVS internal type (PowerOfTwo) here, as was done for the data argument.
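A minimal sketch of the boundary conversion requested above, under stated assumptions: `PowerOfTwoSketch` is a hypothetical stand-in (not `svs::lib::PowerOfTwo`), the runtime-facing value is assumed to be a plain exponent, and the default limit of 30 is illustrative only, since the value of `kMaxBlockSizeExp` is not shown in this thread.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for an internal power-of-two type.
class PowerOfTwoSketch {
  public:
    explicit constexpr PowerOfTwoSketch(std::size_t exponent)
        : exponent_{exponent} {}

    constexpr std::size_t exponent() const { return exponent_; }

    // The represented value, 2^exponent.
    constexpr std::size_t value() const { return std::size_t{1} << exponent_; }

  private:
    std::size_t exponent_;
};

// Converts a runtime-facing exponent into the internal type at the API
// boundary, so internal code never sees the runtime-level representation.
// `max_exp` defaults to an assumed limit chosen for illustration.
inline PowerOfTwoSketch to_internal_blocksize(
    std::size_t blocksize_exp, std::size_t max_exp = 30
) {
    if (blocksize_exp > max_exp) {
        throw std::invalid_argument("Blocksize is too large!");
    }
    return PowerOfTwoSketch{blocksize_exp};
}
```

Doing the conversion once, at the call site inside the wrapper, mirrors how the raw pointer `x` is turned into a data view before being handed to the implementation.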
        return slots;
    }

    lib::PowerOfTwo blocksize_bytes() const {
I do not think it makes sense to modify SVS internals just to implement a query for SVS runtime API
fixed
The getter now reads the parameter field, not the SVS internals.
Thanks for the comments. I have modified the PR according to your requests:
rfsaliev
left a comment
LGTM
Just a few comments/suggestions.
    ) noexcept {
        using Impl = DynamicVamanaIndexLeanVecImpl;
        *index = nullptr;
        return runtime_error_wrapper([&] {
Does it make sense to call DynamicVamanaIndex::check_params() here as it is done in DynamicVamanaIndex::build()?
    ) noexcept {
        using Impl = DynamicVamanaIndexLeanVecImpl;
        *index = nullptr;
        return runtime_error_wrapper([&] {
Same as above.
    template <svs::threads::ThreadPool Pool>
    static StorageType init(
        const svs::data::ConstSimpleDataView<float>& data,
        Pool& pool,
        svs::lib::PowerOfTwo blocksize_bytes
    ) {
        auto parameters = svs::data::BlockingParameters{.blocksize_bytes = blocksize_bytes};
        typename StorageType::allocator_type alloc(parameters);
        StorageType result(data.size(), data.dimensions(), alloc);
        svs::threads::parallel_for(
            pool,
            svs::threads::StaticPartition(result.size()),
            [&](auto is, auto SVS_UNUSED(tid)) {
                for (auto i : is) {
                    result.set_datum(i, data.get_datum(i));
                }
            }
        );
        return result;
    }
IMHO, it would be better to modify the existing init() method rather than add an overload.
Is there any call to StorageFactory<SimpleDatasetType<ElementType>>::init() without the blocksize_bytes argument?
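One way to read this suggestion is a single function with a defaulted parameter instead of two overloads. The sketch below uses hypothetical names throughout (`BlockedStorageSketch`, `init_sketch`, `kDefaultBlocksizeBytesSketch`) and a plain `std::vector` in place of the SVS storage and thread pool; the default value is an assumption for illustration, not the library's actual default.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical storage type that records the block size it was built with.
struct BlockedStorageSketch {
    std::size_t blocksize_bytes;
    std::vector<float> values;
};

// Assumed default block size, chosen only for this example.
constexpr std::size_t kDefaultBlocksizeBytesSketch = std::size_t{1} << 30;

// Single entry point: callers that do not care about the block size
// omit the argument; callers that do pass it explicitly.
inline BlockedStorageSketch init_sketch(
    const std::vector<float>& data,
    std::size_t blocksize_bytes = kDefaultBlocksizeBytesSketch
) {
    BlockedStorageSketch result{blocksize_bytes, {}};
    result.values.reserve(data.size());
    for (float v : data) {
        result.values.push_back(v); // element-wise copy, as in the original loop
    }
    return result;
}
```

If no call site omits the argument, as the question above asks, the default can be dropped and the parameter made mandatory, which avoids keeping two code paths in sync.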
@razdoburdin could you give an example of how this is used, in a comment or in the PR description?
This PR adds the ability to set a custom index blocksize.
Reopening of #235