Skip to content

feat[arrow-ord]: suppport REE comparisons#9621

Open
asubiotto wants to merge 1 commit intoapache:mainfrom
polarsignals:asubiotto/reecmp
Open

feat[arrow-ord]: suppport REE comparisons#9621
asubiotto wants to merge 1 commit intoapache:mainfrom
polarsignals:asubiotto/reecmp

Conversation

@asubiotto
Copy link
Copy Markdown
Contributor

@asubiotto asubiotto commented Mar 27, 2026

This commit implements native comparisons on REE-encoded arrays which are treated similarly to dictionary indirection.

This commit implements REE to scalar comparisons by operating on the physical values only then bulk expanding the boolean result.

REE-to-REE comparisons are also optimized by computing aligned physical value runs to minimize comparisons.

Mixed cases (REE vs flat) materialize a logical index mapping similar to dictionaries.

This commit also supports REE<Dict>.

For comparison, here are the benchmark results with flat arrays as a reference on my local machine:

eq Int32                time:   [14.955 µs 15.162 µs 15.396 µs]
eq scalar Int32         time:   [11.379 µs 11.418 µs 11.459 µs]

ree_comparison/eq_ree_scalar(phys=64,log=65536)     time:   [453.31 ns 454.88 ns 456.43 ns]
ree_comparison/eq_ree_scalar(phys=1024,log=65536)   time:   [4.1224 µs 4.1298 µs 4.1368 µs]
ree_comparison/eq_ree_scalar(phys=32768,log=65536)  time:   [93.506 µs 94.085 µs 94.993 µs]
ree_comparison/eq_ree_ree(phys=64,log=65536)        time:   [413.96 ns 414.82 ns 415.87 ns]
ree_comparison/eq_ree_ree(phys=1024,log=65536)      time:   [4.1597 µs 4.1660 µs 4.1749 µs]
ree_comparison/eq_ree_ree(phys=32768,log=65536)     time:   [128.74 µs 144.40 µs 161.53 µs]

As is expected, the more we take advantage of REE encoding, the faster the comparisons are.

Which issue does this PR close?

Rationale for this change

Feature enhancement for comparisons

What changes are included in this PR?

Support for REE comparisons, including tests and benchmarks.

Are these changes tested?

Yes.

Are there any user-facing changes?

REE comparisons are now supported.

@github-actions github-actions bot added the arrow Changes to the arrow crate label Mar 27, 2026
@asubiotto
Copy link
Copy Markdown
Contributor Author

Unfortunately did not see there is an open PR #9448 before working on this, however I think this PR is more complete since #9448 does not seem to support REE with scalar or mixed comparisons. Also, this PR is more performant:

PR #9448 (REE-vs-REE only, panics on REE-vs-scalar):
  ree_comparison/eq_ree_ree(phys=64,log=65536)      time:   [2.5053 µs 2.5281 µs 2.5516 µs]
  ree_comparison/lt_ree_ree(phys=64,log=65536)       time:   [2.5559 µs 2.5783 µs 2.6024 µs]
  ree_comparison/eq_ree_ree(phys=1024,log=65536)     time:   [9.6708 µs 9.7004 µs 9.7291 µs]
  ree_comparison/lt_ree_ree(phys=1024,log=65536)     time:   [9.4187 µs 9.4457 µs 9.4717 µs]
  ree_comparison/eq_ree_ree(phys=32768,log=65536)    time:   [225.23 µs 228.23 µs 232.46 µs]
  ree_comparison/lt_ree_ree(phys=32768,log=65536)    time:   [200.61 µs 201.16 µs 201.81 µs]

  Ours:
  ree_comparison/eq_ree_scalar(phys=64,log=65536)    time:   [449.04 ns 456.42 ns 464.89 ns]
  ree_comparison/lt_ree_scalar(phys=64,log=65536)    time:   [447.29 ns 449.15 ns 451.32 ns]
  ree_comparison/eq_ree_ree(phys=64,log=65536)       time:   [425.39 ns 426.91 ns 428.48 ns]
  ree_comparison/lt_ree_ree(phys=64,log=65536)       time:   [439.47 ns 440.59 ns 441.69 ns]
  ree_comparison/eq_ree_flat(phys=64,log=65536)      time:   [46.541 µs 46.659 µs 46.783 µs]
  ree_comparison/eq_ree_scalar(phys=1024,log=65536)  time:   [4.4344 µs 4.4435 µs 4.4543 µs]
  ree_comparison/lt_ree_scalar(phys=1024,log=65536)  time:   [5.3170 µs 5.6797 µs 6.1374 µs]
  ree_comparison/eq_ree_ree(phys=1024,log=65536)     time:   [4.7932 µs 4.8049 µs 4.8183 µs]
  ree_comparison/lt_ree_ree(phys=1024,log=65536)     time:   [5.7911 µs 6.7388 µs 7.8086 µs]
  ree_comparison/eq_ree_flat(phys=1024,log=65536)    time:   [51.404 µs 52.651 µs 54.233 µs]
  ree_comparison/eq_ree_scalar(phys=32768,log=65536) time:   [94.964 µs 95.714 µs 96.556 µs]
  ree_comparison/lt_ree_scalar(phys=32768,log=65536) time:   [95.557 µs 96.428 µs 97.527 µs]
  ree_comparison/eq_ree_ree(phys=32768,log=65536)    time:   [137.23 µs 139.33 µs 141.49 µs]
  ree_comparison/lt_ree_ree(phys=32768,log=65536)    time:   [105.79 µs 106.37 µs 107.20 µs]
  ree_comparison/eq_ree_flat(phys=32768,log=65536)   time:   [143.00 µs 215.04 µs 340.63 µs]

cc @alamb @Jefffrey @brunal

@brunal
Copy link
Copy Markdown
Contributor

brunal commented Mar 27, 2026

Super interesting. I suspect that the perf gain might go lower as # of phys indices grow as I'm avoiding copying them, but this definitely seems like a better implementation.

fn apply<T: ArrayOrd>(
op: Op,
l: T,
l_s: bool,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I'm mistaken, you can have only one of l_s == true / l_v.is_some() / l_ree.is_some() at a time. How about introducing an enum? It will reduce the arg # and make them clearer

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that's right. Scalars can have ree and/or dictionary types and l_v and l_ree are both Some for Ree

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right! Still, I think that the too_many_arguments link is legit, and I'd consider groupin the args for each side -- either anonymous 3-tuple or a real struct. It would make the call site much cleaner. e.g.

struct ArrayInfo<'a> {
  is_scalar: bool,
  dict_values: &'a dyn AnyDictionyArray,
  run_array: Option<&'a ReeInfo>,
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@asubiotto asubiotto force-pushed the asubiotto/reecmp branch 2 times, most recently from 85034d0 to 791ec8c Compare March 27, 2026 13:48
@asubiotto
Copy link
Copy Markdown
Contributor Author

Thanks for the review @brunal. I addressed all your comments.

fn apply<T: ArrayOrd>(
op: Op,
l: T,
l_s: bool,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right! Still, I think that the too_many_arguments link is legit, and I'd consider groupin the args for each side -- either anonymous 3-tuple or a real struct. It would make the call site much cleaner. e.g.

struct ArrayInfo<'a> {
  is_scalar: bool,
  dict_values: &'a dyn AnyDictionyArray,
  run_array: Option<&'a ReeInfo>,
}

}
}

fn ree_physical_indices(info: &ReeInfo) -> Vec<usize> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can turn this into a .skip().map_while(...) to return an Iterator<Item = usize>. Then in logical_indices's Some-Some branch, you can directly map on it. This avoid materializing the intermediate physical indices vector.

This is very nitpicky, as the one branch that benefits from this, REE<dict>, seems like quite a niche use case :)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm going to keep this as is currently written to mirror the dict normalized_keys materialization. We can do this as a follow up improvement if we see it's worth it.

@asubiotto
Copy link
Copy Markdown
Contributor Author

Updated, apologies for the delay.

Copy link
Copy Markdown
Contributor

@brunal brunal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks excellent to me, now you just need someone with repo rights :-)

This commit implements native comparisons on REE-encoded arrays which are
treated similarly to dictionary indirection.

This commit implements REE to scalar comparisons by operating on the physical
values only then bulk expanding the boolean result.

REE-to-REE comparisons are also optimized by computing aligned physical value
runs to minimize comparisons.

Mixed cases (REE vs flat) materialize a logical index mapping similar to
dictionaries.

This commit also supports REE<Dict>.

For comparison, here are the benchmark results with flat arrays as a reference
on my local machine:
```
eq Int32                time:   [14.955 µs 15.162 µs 15.396 µs]
eq scalar Int32         time:   [11.379 µs 11.418 µs 11.459 µs]

ree_comparison/eq_ree_scalar(phys=64,log=65536)     time:   [453.31 ns
454.88 ns 456.43 ns]
ree_comparison/eq_ree_scalar(phys=1024,log=65536)   time:   [4.1224 µs
4.1298 µs 4.1368 µs]
ree_comparison/eq_ree_scalar(phys=32768,log=65536)  time:   [93.506 µs
94.085 µs 94.993 µs]
ree_comparison/eq_ree_ree(phys=64,log=65536)        time:   [413.96 ns
414.82 ns 415.87 ns]
ree_comparison/eq_ree_ree(phys=1024,log=65536)      time:   [4.1597 µs
4.1660 µs 4.1749 µs]
ree_comparison/eq_ree_ree(phys=32768,log=65536)     time:   [128.74 µs
144.40 µs 161.53 µs]
```

As is expected, the more we take advantage of REE encoding, the faster the
comparisons are.

Signed-off-by: Alfonso Subiotto Marques <alfonso.subiotto@polarsignals.com>
@asubiotto
Copy link
Copy Markdown
Contributor Author

Thanks again for your review. cc @alamb @Jefffrey this PR is ready for a final review

@asubiotto
Copy link
Copy Markdown
Contributor Author

Friendly ping for a review or triage @alamb (not sure if there's another code owner)

@alamb
Copy link
Copy Markdown
Contributor

alamb commented Apr 13, 2026

Thanks for the ping @asubiotto -- I'll try and find time to review this, but I am having a hard time finding time

@alamb
Copy link
Copy Markdown
Contributor

alamb commented Apr 13, 2026

But nwo I see @brunal has already done it so that will make it easier

@asubiotto
Copy link
Copy Markdown
Contributor Author

Thanks! Appreciate the time. Yes, @brunal conducted an in-depth review so I only need a final look from a maintainer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

arrow Changes to the arrow crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support RunEndEncoded arrays in comparison kernels (eq, lt, etc.)

3 participants