Skip to content

Conversation

@leerho
Copy link
Contributor

@leerho leerho commented Aug 16, 2025

NOTES

The main repo is being targeted for the upcoming Java 25 LTS release, which is due out next month (Sep). Currently, however, this code is only being compiled under Java 24 in order to get work done. I do not anticipate any changes being required in the move to Java 25.

The big change from the previous release (8.0.0 using Java 21) is the migration of all code that depended on datasketches-memory to FFM (Panama). This was completed with previous PRs. However, the change to FFM has created the opportunity to refactor/remove some classes that primarily existed for DS-memory, and to carefully go through the entire code base reviewing javadocs, code comments, and improving overall code completeness and quality.

So I'm using the time between now and when Java 25 is available to do just that. If anyone one else wants to help out with this effort, you are more than welcome :)

Summary of changes

  • Fixed some problems in GHA workflows and re-enabled them.
  • Added documentation of compressed layout to PreambleUtil
  • Made binary compatible with C++ by setting P field to zero for compact sketches.
  • Some Javadoc improvements
  • Added some missing methods
  • Added a missing test

File by File change comments

ROOT DIRECTORY

.GITHUB/WORKFLOWS

auto-jdk-matrix.yml

  • Uncommented the "push:" triggers
  • Set Java version to 24

auto-os-jdk-matrix.yml

  • Uncommented the "push:" triggers
  • Set Java version to 24

check_cpp_files.yml

  • Uncommented the "push:" triggers
  • Set Java version to 24

codeql-analysis.yml

  • Uncommented the "push:" triggers
  • Set Java version to 24

javadoc.yml

  • Uncommented the "push:" triggers
  • Set Java version to 24
  • Added write permissions
  • Added new "uses" action authorized by GitHub, which uses an SHA tag.

pom.xml

  • Fixed problem in maven-javadoc-plugin

MAIN TREE

COMMON

Util.java

  • corrected javadoc formatting problem

HLL

package-info.java

  • corrected javadoc formatting problem

(classic) QUANTILES

DirectCompactDoublesSketch.java

  • Corrected spelling error

DirectUpdateDoublesSketch.java

  • Reworded an awkward sentence

DoublesSketch.java

  • Close examination revealed that a method was missing and another was not named correctly.
  • Careful rewording of the javadocs.
  • The original wrap() method handled both compact and updatable. This has been split into a wrap() and a writableWrap() method for clarity and also to match the API protocol used in other sketches.
  • The error messages has been improved in case the user calls the wrong one.

DoublesUnionImpl.java

  • The javadoc of one method needed to be reworded
  • One method did not handle whether an incoming segment was compact or not.

UpdateDoublesSketch.java

  • Major javadoc rewrite of two related methods to document how the MemorySegmentRequest works

THETA

CompactOperations

  • Set P to zero for compact sketches to be compatible with C++.

PreambleUtil

  • Added storage layout documentation for the compact compressed sketch.

TEST TREE

(classic) QUANTILES

DirectCompactDoublesSketchTest.java

  • The DoublesSketch.wrap method was called incorrectly

DirectUpdateDoublesSketchTest.java

  • As the result of the new methods added above, the DoublesSketch.writableWrap method was called incorrectly.

DoublesMiscTest.java

  • Removed commented out code
  • As the result of the new methods added above, the DoublesSketch.writableWrap method was called incorrectly in several places.
  • renamed a test
  • Added a missing test

DoublesSketchTest.java

  • As the result of the new methods added above, the DoublesSketch.writableWrap method was called incorrectly.

DoublesUnionImplTest.java

  • As the result of the new methods added above, the DoublesSketch.writableWrap method was called incorrectly.

DoublesUtilTest.java

  • As the result of the new methods added above, the DoublesSketch.writableWrap method was called incorrectly.

QuantilesSketchCrossLanguageTest.java

  • As the result of the new methods added above, the DoublesSketch.writableWrap method was called incorrectly.

@leerho leerho marked this pull request as ready for review August 18, 2025 17:40
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements cleanup changes for the phase2 effort, upgrading Java version to 24 and improving code quality across the DataSketches library. The changes include workflow fixes, API improvements, and comprehensive code cleanup.

Key changes:

  • Upgraded all GitHub Actions workflows to use Java 24 and re-enabled push triggers
  • Refactored DoublesSketch API to clearly separate compact (read-only) from updatable sketches
  • Fixed binary compatibility with C++ by setting P field to zero for compact sketches
  • Enhanced documentation including new compressed layout documentation and improved Javadocs

Reviewed Changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
.github/workflows/*.yml Updated Java version to 24 and re-enabled push triggers
pom.xml Fixed maven-javadoc-plugin configuration
src/main/java/org/apache/datasketches/quantiles/DoublesSketch.java Refactored API to separate wrap() for compact sketches and writableWrap() for updatable sketches
src/main/java/org/apache/datasketches/quantiles/UpdateDoublesSketch.java Enhanced Javadoc documentation for MemorySegmentRequest handling
src/main/java/org/apache/datasketches/quantiles/DoublesUnionImpl.java Added logic to handle both compact and updatable segments
src/main/java/org/apache/datasketches/theta/CompactOperations.java Changed P field to 0.0 for C++ compatibility
src/main/java/org/apache/datasketches/theta/PreambleUtil.java Added comprehensive documentation for compressed layout
test files Updated method calls to use new API (wrap vs writableWrap)

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@c-dickens c-dickens requested review from c-dickens and removed request for pan3793 August 20, 2025 17:36
@leerho leerho merged commit 5abce63 into main Aug 20, 2025
7 checks passed
@leerho leerho deleted the Cleanup_phase2 branch August 20, 2025 17:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants