Major Changes with this Release
This is a major release in which we took the opportunity to do significant refactoring that introduces incompatible changes from previous releases. Any incompatibility with prior releases is an inconvenience to users who simply wish to upgrade to the latest release and run. However, some of the code in this library was written in 2013, and the Java language has evolved enormously since then. We chose to use this major release as the opportunity to modernize some of the code to achieve the following goals:
Remove the dependency on the DataSketches-Memory component and use the Java Foreign Function & Memory (FFM) API instead.
- The DataSketches-Memory component was originally developed in 2014 to address the need for fast access to off-heap memory data structures and used Unsafe and other JVM internals as there were no satisfactory Java language features to do this at the time.
- The FFM capabilities, introduced into the language in Java 22, are now part of the Java 25 LTS release, which we support. Since the capabilities of FFM are a superset of the original DataSketches-Memory component, it made sense to rewrite the code to eliminate the dependency on DataSketches-Memory and use FFM instead. This impacted code across the entire library.
- This provided several advantages to the code base. By removing this dependency on DataSketches-Memory, there are now no runtime dependencies! This should make integrating this library into other Java systems much simpler. Since FFM is tightly integrated into the Java language, it has improved performance, especially with bulk operations.
- As an added note, there are numerous other improvements to the Java language that we could perhaps take advantage of in a rewrite, e.g., records, text blocks, switch expressions, sealed classes, var, modules, pattern matching, etc. However, faced with the risk of accidentally creating bugs by making too many changes at one time, we focused on FFM, which actually improved performance as opposed to just adding syntactic sugar.
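To illustrate the kind of off-heap access that FFM provides without Unsafe, here is a minimal sketch using only the standard `java.lang.foreign` API (Java 22 or later). It is not code from this library: the class name `FfmOffHeapExample` and the method `sumOffHeap` are hypothetical and exist only for this example. It allocates native memory from a confined `Arena`, fills it with one bulk copy, and reads the values back.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical example class; not part of the DataSketches library.
public class FfmOffHeapExample {

  // Copies the values into off-heap memory, reads them back, and returns their sum.
  static long sumOffHeap(long[] values) {
    // A confined arena releases its native memory deterministically when closed.
    try (Arena arena = Arena.ofConfined()) {
      // Allocate values.length 8-byte slots off-heap.
      MemorySegment seg = arena.allocate(ValueLayout.JAVA_LONG, values.length);
      // Bulk copy from the heap array into the off-heap segment.
      MemorySegment.copy(values, 0, seg, ValueLayout.JAVA_LONG, 0L, values.length);
      long sum = 0;
      for (int i = 0; i < values.length; i++) {
        // Bounds-checked indexed read from native memory.
        sum += seg.getAtIndex(ValueLayout.JAVA_LONG, i);
      }
      return sum;
    }
  }

  public static void main(String[] args) {
    System.out.println(sumOffHeap(new long[] {1L, 2L, 3L}));
  }
}
```

The try-with-resources block bounds the lifetime of the native memory: any access to the segment after the arena is closed throws an exception rather than reading freed memory.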
Align public sketch class names so that the sketch family name is part of the class/file name.
- For example, the Theta sketch family was the first family written for the library and its base class was called Sketch. The Tuple sketch family evolved soon after and its base class was also called Sketch. If a user wanted to use both the Theta and Tuple families in the same class, one of them had to be fully qualified every time it was referenced.
- Unfortunately, this style propagated to some of the other early sketch families, where we ended up with two different sketch families that each had an ItemsSketch, etc. For the more recent additions to the library we started including the sketch family name in all the relevant sketch-like public classes of a sketch family.
- In this release we have refactored these older sketches with new names that now include the sketch family name. This is an incompatible change for user code moving from earlier releases, but it can be readily fixed with search-and-replace tools. The naming in this release is not perfect, but it is hopefully more consistent across all the different sketch families.
None of these changes have affected the binary compatibility of the serialized versions of these sketches. The sketches in this library can still interpret serialized versions of the same sketch from earlier release versions of this library and from other language versions of this library.
Known Issues
SpotBugs
- Make sure you configure SpotBugs with the /tools/FindBugsExcludeFilter.xml file. Otherwise, you may get many false-positive or low-risk reports for issues that we have already examined and excluded with this file. Also, at the time of this release, SpotBugs had not been upgraded to handle Java 25 features.
Checkstyle
- At the time of this release, Checkstyle had not been upgraded to handle Java 25 features.