Improve speed of DictionaryGroupValues #22078

@Rich-T-kid

Is your feature request related to a problem or challenge?

The DictionaryGroupValues path is currently faster than GroupValuesRows, but there is still room for improvement. seen_elements stores the raw bytes of each element as a Vec nested inside a Vec, so each stored element carries its own allocation. These allocations are individually minor, but they do show up as CPU time in intern(). The current collision handling also forces a copy, because the same bytes are stored in both seen_elements and unique_dict_value_mapping.

Describe the solution you'd like

This can be resolved by storing the intermediate bytes in a single contiguous buffer and tracking offsets and lengths instead of raw bytes. We would add a new field to the struct to hold the buffer, so that seen_elements and unique_dict_value_mapping only need to store an offset and a length per entry. That replaces a potentially large byte copy with two i32s.
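
A minimal sketch of the idea, not the actual DataFusion code: the struct name InternedBytes, its fields, and the use of std's DefaultHasher and HashMap are all assumptions made for illustration, and u32 is used in place of the i32 pair mentioned above to avoid sign handling. It only shows how one shared buffer plus (offset, length) pairs can replace per-element Vec allocations and the duplicate byte copy.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::Hasher;

/// Hypothetical stand-in for the proposed layout (not DataFusion's struct).
struct InternedBytes {
    /// Bytes of every distinct value, appended back to back.
    buffer: Vec<u8>,
    /// (offset, len) into `buffer`, indexed by group id.
    spans: Vec<(u32, u32)>,
    /// Value hash -> candidate group ids; colliding values share a bucket.
    buckets: HashMap<u64, Vec<usize>>,
}

impl InternedBytes {
    fn new() -> Self {
        Self {
            buffer: Vec::new(),
            spans: Vec::new(),
            buckets: HashMap::new(),
        }
    }

    /// Returns the bytes of a group by slicing the shared buffer.
    fn get(&self, group_id: usize) -> &[u8] {
        let (offset, len) = self.spans[group_id];
        let start = offset as usize;
        &self.buffer[start..start + len as usize]
    }

    /// Returns the group id for `value`, appending its bytes only if unseen.
    fn intern(&mut self, value: &[u8]) -> usize {
        let mut hasher = DefaultHasher::new();
        hasher.write(value);
        let hash = hasher.finish();

        // Resolve collisions by comparing against slices of the buffer,
        // so no second copy of the bytes has to be kept in the map.
        if let Some(ids) = self.buckets.get(&hash) {
            for &id in ids {
                if self.get(id) == value {
                    return id;
                }
            }
        }

        // New value: one append to the shared buffer, one (offset, len) pair.
        // The casts assume the buffer stays below 4 GiB.
        let offset = self.buffer.len() as u32;
        self.buffer.extend_from_slice(value);
        let id = self.spans.len();
        self.spans.push((offset, value.len() as u32));
        self.buckets.entry(hash).or_default().push(id);
        id
    }
}
```

In this sketch, interning the same value twice returns the same group id and leaves the buffer untouched the second time. A real implementation would likely reuse the hashes already computed in the grouping path rather than rehashing with DefaultHasher.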

Describe alternatives you've considered

The alternative is to change nothing. Benchmarks show that even the current approach is faster than the default GroupValuesRows approach.

Additional context

See #21765.

Labels: enhancement