In Apache DataFusion we use hashbrown extensively, and I noticed that in hash aggregation, once the hash table (we use the raw API there) becomes very large, we are memory bound. A natural solution would be some kind of prefetching (FYI, we are on stable Rust), but there is currently no way to do that.
We have something like this (simplified):
```rust
for (row, &target_hash) in batch_hashes.iter().enumerate() {
    let entry = self.map.entry(
        target_hash,
        // eq
        |(exist_hash, group_idx)| {
            target_hash == *exist_hash && group_rows.row(row) == group_values.row(*group_idx)
        },
        // hasher
        |(hash, _)| *hash,
    );
    let group_idx = match entry {
        // Existing group_index for this group value
        Entry::Occupied(o) => {
            let (_hash, group_idx) = o.get();
            *group_idx
        }
        // Need to create a new entry for the group
        Entry::Vacant(v) => {
            // Add a new entry to aggr_state and save the newly created index
            let group_idx = group_values.num_rows();
            group_values.push(group_rows.row(row));
            v.insert((target_hash, group_idx));
            group_idx
        }
    };
    groups.push(group_idx);
}
```

Allowing us to prefetch from the map would help performance in our case.
Because we insert into the table on a miss, we would have to reserve capacity for that many items beforehand for the prefetching to be valuable (otherwise a resize between the prefetch and the probe would move the buckets).