Support finding insertion positions from &HashTable#725
Conversation
| pub unsafe fn insert_at_position(&mut self, position: VacantPosition, value: T) -> &mut T { | ||
| debug_assert!(self.len() < self.capacity()); |
There was a problem hiding this comment.
I don't think this is the sanity check we need, rather that index < num_buckets && !is_bucket_full(index).
We could also split function this into a safe version, perhaps -> Result<&mut T, T> giving the value back on error, and an unsafe unchecked version without any assertion. That safe version could have logic errors if the table was mutated in some way while keeping that index apparently valid -- but it would still be memory safe.
There was a problem hiding this comment.
Maybe the safe version also needs some check to make sure we don't exceed the load factor, but IIRC it's possible to do that even with no apparent capacity left if the index happens to be a DELETED tombstone.
There was a problem hiding this comment.
That safe version could have logic errors if the table was mutated in some way while keeping that index apparently valid -- but it would still be memory safe.
Is this correct? What would such a safe function call to perform the insertion? Currently both RawTable::insert_at_index and RawTable::insert_tagged_at_index stipulate that it's unsound to call them with an index that wasn't previously returned by find_or_find_insert_index or if any mutation of the table has occurred since.
Looking at their implementations, they ultimately make four unsafe calls: I think that three of those (RawTableInner::ctrl, RawTable::bucket and Bucket::write) might actually only be unsound if the index is out of range; the fourth unsafe call, to RawTableInner::record_item_insert_at, doesn't have any safety documentation—and I'm not sufficiently familiar with how the metadata is maintained to be certain there's no safety concern here. For example, it increments self.items; if there's an existing value in the bucket that's being overwritten, then surely that will result in an incorrect count?
Perhaps there's no soundness issues, and it's okay to relax the safety requirements of RawTable::insert_at_index and RawTable::insert_tagged_at_index to merely require that the index be in range (stipulating instead that passing incorrect indices may result in logic errors and/or panics)? They are only crate-visible, after all. Then it would indeed be pretty trivial to publicly expose a safe/checked version of this new HashTable::insert_at_position method.
There was a problem hiding this comment.
For example, it increments
self.items; if there's an existing value in the bucket that's being overwritten, then surely that will result in an incorrect count?
Checking is_bucket_full prevents that case.
There was a problem hiding this comment.
Anyway, I'd be interested if others have opinions before you change anything toward my suggestion.
3edcfbf to
6bf6e57
Compare
Adds to `HashTable` the following APIs: ```rust fn find_or_find_vacant_position(&self, hash, eq) -> Result<&T, Option<VacantPosition>> unsafe fn insert_at_position(&mut self, position, value) -> &mut T ``` This enables an insertion position to be located through an immutable `&HashTable` reference (such as through a read-lock of a `RwLock`), in order that it might subsequently be used to perform an insertion when an exclusive `&mut HashTable` reference becomes available (such as through a write-lock of the same `RwLock`). In so doing, users can avoid repeated hash probing when it is known that the table has not been mutated in the interim.
6bf6e57 to
bc09d05
Compare
|
I'm not necessarily against this API, but I did recently factor out the |
Adds to
HashTablethe following APIs:This enables an insertion position to be located through an immutable
&HashTablereference (such as through a read-lock of aRwLock), in order that it might subsequently be used to perform an insertion when an exclusive&mut HashTablereference becomes available (such as through a write-lock of the sameRwLock). In so doing, users can avoid repeated hash probing when it is known that the table has not been mutated in the interim.Closes #723 cc @cuviper