Improve speed of DictionaryGroupValues #22078

### Is your feature request related to a problem or challenge?

Currently the DictionaryGroupValues path is faster than GroupValuesRows, but there is still room for improvement. seen_elements stores the raw bytes of each element as its own Vec<u8> inside an outer Vec. The resulting per-element allocations are individually small, but they do show up as CPU time in intern(). The current collision handling also forces a copy, because the same bytes are stored in both seen_elements and unique_dict_value_mapping.

### Describe the solution you'd like

This can be resolved by storing the intermediate bytes in a single contiguous buffer and tracking offsets and lengths instead of owning the raw bytes per entry. We'd introduce a new field on the struct that holds the buffer, and seen_elements / unique_dict_value_mapping would only need to store an offset and a length per entry. This would replace a potentially large byte copy per entry with two i32s.
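A minimal sketch of the idea, not the actual DataFusion code. `Span`, `ByteInterner`, `by_hash` and `hash_bytes` are hypothetical names invented for illustration, and a std `HashMap` stands in for whatever hash table the real implementation uses. Only `seen_elements`, `intern()` and the offset/length pair come from the description above.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::Hasher;

/// Hypothetical (offset, len) view into the shared buffer: two i32s per entry.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Span {
    offset: i32,
    len: i32,
}

/// Hypothetical interner: every distinct value's bytes are appended once
/// to `buffer`; all other bookkeeping refers to them by `Span`.
#[derive(Default)]
struct ByteInterner {
    /// Single contiguous allocation holding all distinct values back to back.
    buffer: Vec<u8>,
    /// Stand-in for seen_elements: group id -> span, no per-entry Vec<u8>.
    seen_elements: Vec<Span>,
    /// Hash -> candidate group ids; collisions are resolved against `buffer`.
    by_hash: HashMap<u64, Vec<usize>>,
}

fn hash_bytes(value: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(value);
    hasher.finish()
}

impl ByteInterner {
    /// Borrow the bytes of an already interned group without copying.
    fn bytes(&self, group_id: usize) -> &[u8] {
        let span = self.seen_elements[group_id];
        let start = span.offset as usize;
        &self.buffer[start..start + span.len as usize]
    }

    /// Return the group id for `value`, appending its bytes only if unseen.
    fn intern(&mut self, value: &[u8]) -> usize {
        let hash = hash_bytes(value);
        if let Some(candidates) = self.by_hash.get(&hash) {
            for &id in candidates {
                // Collision check compares against the shared buffer,
                // so no second copy of the bytes is needed.
                if self.bytes(id) == value {
                    return id;
                }
            }
        }
        // Assumption: total size fits in i32; a real implementation
        // would need to handle overflow rather than panic.
        let offset = i32::try_from(self.buffer.len()).expect("buffer exceeds i32 range");
        let len = i32::try_from(value.len()).expect("value exceeds i32 range");
        self.buffer.extend_from_slice(value);
        let id = self.seen_elements.len();
        self.seen_elements.push(Span { offset, len });
        self.by_hash.entry(hash).or_default().push(id);
        id
    }
}
```

With this layout, interning a new value costs one amortized append to `buffer` plus a small fixed-size `Span`, and a repeated value costs only a hash lookup and a slice comparison.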

### Describe alternatives you've considered

The alternative is to not change anything. Benchmarks show that even the current approach is faster than the default GroupValuesRows approach.

### Additional context

See #21765.

<img width="832" height="370" alt="Image" src="https://github.com/user-attachments/assets/c25c4f95-5aef-45a9-b6bc-01080e254ff9" />

