Improve Summary quantiles with DataSketches

`Summary.observe()` can become expensive when observations are recorded at high frequency, to the point where the quantile computation shows up as a measurable cost on hot request paths.

We saw this in ZooKeeper's Prometheus metrics path. In an internal fork of ZooKeeper 3.9.2, a Summary implementation inspired by ZooKeeper's unmerged DataSketches Summary [PR](https://github.com/apache/zookeeper/pull/2086) improved peak throughput by about 2x.

DataSketches KLL may be a useful way to improve this in `client_java`. The goal would be to reduce the cost of the observe path while keeping the external Summary behavior as close as practical.

This would not have to replace the current CKMS-based `Summary` immediately. DataSketches KLL has a different accuracy model, memory cost, and quantile visibility behavior, so an explicit opt-in path may be a better first step.
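
To make the accuracy comparison concrete, here is a minimal sketch of how the rank error of any quantile estimator could be measured against exact values. It uses only the JDK. The class and method names (`RankErrorCheck`, `maxRankError`, and so on) are hypothetical, and the exact nearest-rank estimator in `main` is only a placeholder for where a CKMS-based or KLL-based estimator would be plugged in.

```java
import java.util.Arrays;
import java.util.Random;
import java.util.function.DoubleUnaryOperator;

public class RankErrorCheck {

    /** Normalized rank of x: the fraction of sorted values that are <= x. */
    public static double normalizedRank(double[] sorted, double x) {
        int lo = 0, hi = sorted.length;
        while (lo < hi) { // binary search for the first index with value > x
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] <= x) lo = mid + 1; else hi = mid;
        }
        return (double) lo / sorted.length;
    }

    /** Largest |rank(estimate(q)) - q| over the requested quantiles. */
    public static double maxRankError(double[] sorted, double[] quantiles,
                                      DoubleUnaryOperator estimator) {
        double worst = 0.0;
        for (double q : quantiles) {
            double estimate = estimator.applyAsDouble(q);
            worst = Math.max(worst, Math.abs(normalizedRank(sorted, estimate) - q));
        }
        return worst;
    }

    /** Exact nearest-rank quantile; a placeholder for a real estimator. */
    public static double exactQuantile(double[] sorted, double q) {
        int idx = (int) Math.ceil(q * sorted.length) - 1;
        return sorted[Math.max(0, Math.min(sorted.length - 1, idx))];
    }

    public static void main(String[] args) {
        // Assumption: exponentially distributed values as a rough latency stand-in.
        Random rnd = new Random(42);
        double[] data = new double[100_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = -Math.log(1.0 - rnd.nextDouble());
        }
        Arrays.sort(data);

        double[] quantiles = {0.5, 0.9, 0.99};
        double err = maxRankError(data, quantiles, q -> exactQuantile(data, q));
        System.out.println("max rank error: " + err);
    }
}
```

Measuring error in rank space rather than value space keeps the comparison independent of the value distribution, so the same check could apply to both implementations even though their error guarantees are stated differently.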

Initial questions:

- Does using DataSketches for Summary quantiles sound like a direction worth exploring?
- If so, would a separate opt-in artifact be a reasonable way to introduce it?
- What behavior details and benchmark data would be most useful before going further?

Improve Summary quantiles with DataSketches #2084