Skip to content

Conversation

@leerho
Copy link
Contributor

@leerho leerho commented Aug 24, 2025

Set pom to min java version of 22.

Set minimum Maven version to 3.9.11.

Remove obsolete ds-memory reference.

Set minimum Maven version to 3.9.11.

Remove obsolete ds-memory reference.
@leerho leerho requested a review from Copilot August 25, 2025 00:13
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

leerho added 2 commits October 1, 2025 13:23
Fixed some very bad assumptions with heapify(...) and wrap(...) where if
the simple call was made, i.e., heapify(seg) not only was the default
update seed assumed (this is normal), but the seed hash was not checked!
Yikes!  Same thing with wrap(...).  I'm not sure how that ever happened,
but it is now fixed.  I think a lot of that was associated with the old
SerVer 1 and 2 stuff.

This also fixed the squirrelly testing that somehow figured that the
above was ok!?!

Also added some checks when reading the preamble of a user's input
segment -- made sure that the segment was at least large enough to hold
the full preamble.
@leerho leerho marked this pull request as ready for review October 2, 2025 21:58
@leerho
Copy link
Contributor Author

leerho commented Oct 2, 2025

Lots of cleanup after Sketches class was removed.
Fixed some very bad assumptions with heapify(...) and wrap(...) where if
the simple call was made, i.e., heapify(seg) not only was the default
update seed assumed (this is normal), but the seed hash was not checked!
Yikes! Same thing with wrap(...). I'm not sure how that ever happened,
but it is now fixed. I think a lot of that was associated with the old
SerVer 1 and 2 stuff.

This also fixed the squirrelly testing that somehow figured that the
above was ok!?!

Also added some checks when reading the preamble of a user's input
segment -- made sure that the segment was at least large enough to hold
the full preamble.

This is a big PR. I would suggest reviewers go through the changes commit by commit, rather than the whole thing at once.

I haven't done the class name changes yet. I will do that in a PR all by itself because it doesn't involve any logic changes, just the names of some classes -- And that should be easier to review by itself.

@leerho leerho requested a review from Copilot October 2, 2025 22:05
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 79 out of 79 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (1)

src/main/java/org/apache/datasketches/tuple/QuickSelectSketch.java:1

  • Constructor super(...) must be the first statement. Moving super(...) after field assignments will not compile. Restore the prior pattern that validates inside the argument list (e.g., call this(new Validate<>(), seg, deserializer, summaryFactory);) and keep the private constructor that invokes super(...) first, or call super(...) as the first line and refactor validation accordingly.
/*

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@AlexanderSaydakov
Copy link
Contributor

This also fixed the squirrelly testing that somehow figured that the above was ok!?!

I believe that was intentional for compatibility with some legacy system, which is not needed anymore.

Copy link
Member

@freakyzoidberg freakyzoidberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved - but a clarification need regarding removal serVer 1 and 2 in ThetaSketch.

Apart from this and the fact I need to setup java 25. LGTM

}
//not SerVer 3, assume compact stored form
final short seedHash = Util.computeSeedHash(seed);
if (serVer == 1) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we ok we dropping support for server 1 and 2 ?

I mean how to do we know nobody rely on some old serialised sketch here ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see my message on dev@: https://lists.apache.org/thread/4c78gsls7xjrn0b4cosvppywh2clwz6x
To Wit:

Our backward compatibility promise is primarily focused on the binary
compatibility of being able to read serializations of earlier versions of
the code with our latest versions. This is still true. However, up until
recently we have been supporting binary compatibility with very early code
versions at Yahoo that existed long before our code was first open-sourced
in August of 2015 (we became part of ASF in 2020). Since Yahoo is in the
process of obsoleting those systems that used some of that code there is no
reason that we need to keep supporting it.

The older versions being referred to are specifically SerVer 1 and 2 of the Theta Sketch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@leerho proposed this on the mailing list and received supportive comments.

Versions 1 and 2 predate the open source release, so only Yahoo might have such sketches. And since they can be read and converted to v3 using older library versions, it seems like a reasonable time to clean up this legacy code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@leerho leerho merged commit 63c371a into main Oct 6, 2025
7 checks passed
@leerho leerho deleted the cleanup_phase3 branch October 6, 2025 23:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants