Skip to content

Conversation

@geonove
Copy link
Contributor

@geonove geonove commented Oct 18, 2025

Description

This PR implements DDSketch and closes #457.
The design follows the core ideas of the original DDSketch paper and the implementations in other libraries.
I have aimed to keep the API consistent with the existing sketches, especially the other quantile sketches.
There might still be some work to do (e.g., adding additional methods or refinements), but I’d like to hear your opinions on the design and implementation so far before proceeding further.

Implementation

This implementation currently requires C++20, primarily for the use of concepts and constrained templates.
DDSketch is templated over a Store concept, rather than relying on a common base class and virtual methods.

If C++20 adoption is not desired, I can rework the code to remain compatible with C++11 by limiting the implementation to collapsing dense stores and removing concept-based templates. I’d appreciate the maintainers’ input on which direction to take before finalizing.

Copy link
Contributor

@leerho leerho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this contribution! We certainly are interested in adding new sketches into our library. However, we have several requests for submissions of entirely new sketches into this repository:

C++ version compatibility:

  • The datasketches-cpp repository is a base repo in that there are several repositories that are also part of the Apache/DataSketches project that depend on this base repo. These include the repositories datasketches-python, datasketches-postgresql, datasketches-bigquery and datasketches-characterization. In addition there are probably other platforms / languages that we may not know about that have code that depends on this base repo.
  • Currently, the C++ code in this repo has been tested to run on C++ 11 through C++ 20. This means that there can be no code constructs that are exclusive to compiler versions greater than C++ 11.
  • To update the base branch to require a minimum C++ compiler version greater than 11 would require verifying, and modifying as necessary, the code in these other repositories to make sure that they can compile from source and run.
  • We would also want to notify the community, well in advance, that such a move is being contemplated and the README would need to be updated.

The New Sketch

  • For a new sketch, we would like to see the website updated with a description of the sketch and a discussion about the performance and characteristics of this sketch and how it compares with the other quantile sketches already in the library. Adding characterization plots of speed and memory usage would also be highly desirable.

Please let us know if you are able to accommodate these requests.

@geonove
Copy link
Contributor Author

geonove commented Nov 20, 2025

Thank you for the detailed feedback @leerho .

I understand the requirements for adding a new sketch to the repository, and I am happy to make the requested adjustments.

For C++ version compatibility, I can update the implementation so that it is fully compliant with C++11.

On the documentation and evaluation side, I have already implemented Python bindings (I haven't opened the PR in datasketches-python yet) and used them to run an initial comparison between DDSketch and T-Digest in a jupyter notebook. I am currently working on a more thorough characterization similar to the analysis in issue #416, including both performance and memory usage comparisons.

I am also preparing a detailed description of the sketch that can be added to the website. This will cover the behavior of the sketch, its performance characteristics, and how it compares with the other quantile sketches in the library.

Please let me know if there is anything else you would like me to include or adjust.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Proposal: Add DDSketch (Relative-Error Quantile Sketch)

2 participants