104 commits
c48f238
Add folders and files to ignore
geonove Jun 8, 2025
ad0b01c
DDSketch first commit
geonove May 29, 2025
245eff7
Implement Bin class
geonove Jun 1, 2025
99bac33
Add Store pure virtual class
geonove Jun 1, 2025
5645586
Add basic class declaration for ddsketch
geonove Jun 25, 2025
d268221
DenseStore first implementation (WIP)
geonove Jun 25, 2025
aa01d0a
Fix cmake file
geonove Jun 25, 2025
a9c9c58
Add ddsketch basic test folder structure
geonove Jun 25, 2025
87c6e96
Remove unused include
geonove Jun 26, 2025
2bbb558
Implement center bins
geonove Jun 26, 2025
9de37bf
Some methods should not be virtual
geonove Jun 26, 2025
355dbb8
Implement reset counts
geonove Jun 26, 2025
5942823
Implement get total count and remove member variable
geonove Jun 26, 2025
a0a64f9
Make some methods private
geonove Jun 28, 2025
b381bdc
Implement collapsing dense store virtual class
geonove Jun 28, 2025
ae33406
Implement collapsing highest dense store
geonove Jul 6, 2025
117c38d
Add license
geonove Jul 6, 2025
84e243c
Inline bin methods
geonove Jul 6, 2025
977e9f5
Implement collapsing lowest dense store
geonove Jul 6, 2025
925b646
Add basic test to check code compiles
geonove Jul 6, 2025
c120c76
Minor fixes to constructor
geonove Jul 6, 2025
169c704
WIP
geonove Jul 7, 2025
dd26f05
Minor fix
geonove Jul 12, 2025
31d87a1
Implement unbounde dense store
geonove Jul 12, 2025
ec3bea6
SparseStore first implementation
geonove Jul 12, 2025
34601bb
Improve Bin test
geonove Jul 12, 2025
ee2b186
Test wip
geonove Jul 13, 2025
aee121c
Add merge methods to sparse store
geonove Jul 12, 2025
aeae3fb
Implement sparse store iterator
geonove Jul 12, 2025
2b08a44
Store tests wip
geonove Jul 15, 2025
f1b72cc
Test fixes
geonove Jul 17, 2025
52797ea
Remove debugging print function
geonove Aug 17, 2025
da21fb7
Make specialized merge method public
geonove Aug 17, 2025
e8b4656
Use const iterator
geonove Aug 17, 2025
c379305
Implement copy methods
geonove Aug 17, 2025
ca23227
Store factory class
geonove Aug 17, 2025
da497ab
Use double for count
geonove Aug 17, 2025
c58bb15
Clean tests
geonove Aug 17, 2025
781104e
Fix clear method
geonove Aug 18, 2025
0c79e02
Fix shift bins method
geonove Aug 18, 2025
89bb453
Fix collapse test method
geonove Aug 18, 2025
65a2071
Implement test case
geonove Aug 18, 2025
cebf452
fixup! Use double for count
geonove Aug 18, 2025
54e6feb
Fix collapsing lowest merge
geonove Aug 19, 2025
b768014
More store tests
geonove Aug 19, 2025
aff5baf
Add method to merge sparse with dense store and viceversa
geonove Aug 21, 2025
add11e6
Add sparse into dense store test
geonove Aug 21, 2025
d77a361
Make parameter const
geonove Aug 23, 2025
6199826
Add test to cross merge sparse and dense stores
geonove Aug 23, 2025
7773807
WIP
geonove Aug 30, 2025
66f3fbb
Add index mapping base abstract class
geonove Aug 30, 2025
53d5b12
Implement fast log2 and fast inverse log2 functions
geonove Aug 31, 2025
f531e4b
Implement log like index mapping pure virtual base class with crtp
geonove Aug 31, 2025
8144846
Implement linearly interpolated mapping
geonove Aug 31, 2025
379dc29
Index mapping factory class
geonove Aug 31, 2025
485127b
Add index mapping test (barebone)
geonove Aug 31, 2025
497cff9
Fixes after tests
geonove Aug 31, 2025
4181dd0
Add tests and minor improvements
geonove Aug 31, 2025
57e5da9
Make static constexpr members public
geonove Aug 31, 2025
118bda1
Minor polish
geonove Aug 31, 2025
99c63d7
Implement logarithmic mapping
geonove Aug 31, 2025
5ee2797
Implement quadratically interpolated mapping
geonove Aug 31, 2025
9f6448d
Implement operator<< for enum class
geonove Aug 31, 2025
b561a39
Add license
geonove Aug 31, 2025
a40ceba
Make member private
geonove Aug 31, 2025
5c3225f
Implement quartically interpolated mapping and tests
geonove Aug 31, 2025
821b248
Add store concept
geonove Sep 6, 2025
62ec118
Add cross merge test
geonove Sep 6, 2025
88c30f8
Dense stores now use static polymorphism instead of dynamic
geonove Sep 7, 2025
25b47fa
Use common merge method for other type different from this type
geonove Sep 7, 2025
1e12917
Implement sparse store reverse iterator
geonove Sep 7, 2025
c209bea
Make index mapping use static polymorphism
geonove Sep 8, 2025
b138a12
Fix includes
geonove Sep 8, 2025
c99ac30
Implement DDSketch and tests
geonove Sep 9, 2025
3e99a5f
Refactoring
geonove Sep 11, 2025
4817bb1
Use std::frexp for log2 approx
geonove Sep 11, 2025
f6ed4fa
Implement get rank
geonove Sep 12, 2025
59f69dc
Encode/decode index mapping
geonove Sep 13, 2025
d53a4e3
Index mapping encoding and decoding
geonove Sep 13, 2025
50d513e
Encode decode WIP
geonove Sep 13, 2025
20bdb1c
Encode decode store
geonove Sep 13, 2025
75bba95
Compute serialized size bytes
geonove Sep 13, 2025
8c21c9b
Implement ddsketch serialize and deserialize
geonove Sep 14, 2025
4b0178a
Merge remote-tracking branch 'origin/master' into geonove/ddsketch
geonove Sep 14, 2025
d7cc91b
Fix cmake
geonove Sep 14, 2025
708f27f
Templated collapsing dense stores
geonove Sep 14, 2025
e70e46f
Fix constructor
geonove Sep 14, 2025
9ca0f32
Do not use factory in constructor
geonove Sep 15, 2025
46af536
Implement to string
geonove Sep 15, 2025
ae79be5
Add include
geonove Sep 15, 2025
e88f9cb
Fix to string
geonove Sep 15, 2025
942336e
Minor fixes
geonove Sep 16, 2025
6acab11
Add doc
geonove Sep 16, 2025
e31b5ec
Remove build folders
geonove Sep 16, 2025
9f38af0
Re-add build/.gitignore
geonove Sep 16, 2025
9fb3a26
Remove unnecessary files
geonove Sep 16, 2025
cc5b6bd
Remove unnecessary test
geonove Oct 18, 2025
8160e42
WIP c++11
geonove Nov 29, 2025
b0c18de
Use c++11 features only
geonove Nov 29, 2025
1fb5938
Implement get_PMF and get_CDF
geonove Dec 1, 2025
eecb9cc
Minor fixes
geonove Dec 6, 2025
b5f9220
Bring back sparse store
geonove Dec 11, 2025
3c1d929
Bring back sparse store
geonove Dec 11, 2025
d70f4fc
Use double instead of T
geonove Dec 12, 2025
9 changes: 9 additions & 0 deletions .gitignore
@@ -6,6 +6,9 @@
# Visual Studio Code
.vscode/

# Intellij
.idea/

# OSX files
.DS_Store

@@ -43,3 +46,9 @@ _*/

docs
java

# clang
.clangd

# CMakeFiles
CMakeFiles/
3 changes: 2 additions & 1 deletion CMakeLists.txt
@@ -120,12 +120,13 @@ add_subdirectory(count)
add_subdirectory(density)
add_subdirectory(tdigest)
add_subdirectory(filters)
add_subdirectory(ddsketch)

if (WITH_PYTHON)
add_subdirectory(python)
endif()

- target_link_libraries(datasketches INTERFACE hll cpc kll fi theta sampling req quantiles count)
+ target_link_libraries(datasketches INTERFACE hll cpc kll fi theta sampling req quantiles count ddsketch)

if (COVERAGE)
find_program(LCOV_PATH NAMES "lcov")
1 change: 1 addition & 0 deletions Doxyfile
@@ -955,6 +955,7 @@ INPUT = common/include \
fi/include \
count/include \
req/include \
ddsketch/include \
README.md

# This tag can be used to specify the character encoding of the source files
67 changes: 67 additions & 0 deletions ddsketch/CMakeLists.txt
@@ -0,0 +1,67 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

add_library(ddsketch INTERFACE)

add_library(${PROJECT_NAME}::DDSKETCH ALIAS ddsketch)

if (BUILD_TESTS)
add_subdirectory(test)
endif()

target_include_directories(ddsketch
INTERFACE
$<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}/include>
$<INSTALL_INTERFACE:$<INSTALL_PREFIX>/include>
)

target_link_libraries(ddsketch INTERFACE common)

install(TARGETS ddsketch
EXPORT ${PROJECT_NAME}
)

install(FILES
include/bin.hpp
include/bin_impl.hpp
include/collapsing_dense_store.hpp
include/collapsing_dense_store_impl.hpp
include/collapsing_highest_dense_store.hpp
include/collapsing_highest_dense_store_impl.hpp
include/collapsing_lowest_dense_store.hpp
include/collapsing_lowest_dense_store_impl.hpp
include/ddsketch.hpp
include/ddsketch_impl.hpp
include/dense_store.hpp
include/dense_store_impl.hpp
include/index_mapping.hpp
include/index_mapping_factory.hpp
include/index_mapping_impl.hpp
include/linearly_interpolated_mapping.hpp
include/linearly_interpolated_mapping_impl.hpp
include/log_like_index_mapping.hpp
include/log_like_index_mapping_impl.hpp
include/logarithmic_mapping.hpp
include/logarithmic_mapping_impl.hpp
include/quadratically_interpolated_mapping.hpp
include/quadratically_interpolated_mapping_impl.hpp
include/sparse_store.hpp
include/sparse_store_impl.hpp
include/store_factory.hpp
include/unbounded_size_dense_store.hpp
include/unbounded_size_dense_store_impl.hpp
DESTINATION "${CMAKE_INSTALL_INCLUDEDIR}/DataSketches")
73 changes: 73 additions & 0 deletions ddsketch/include/bin.hpp
@@ -0,0 +1,73 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
#ifndef BIN_H
#define BIN_H

#include <cstdint>
#include <string>

namespace datasketches {

/**
* @class Bin
* @brief Represents a bucket of counts in a DDSketch store.
*
* A Bin corresponds to a mapped value index and its associated count.
* It is the fundamental unit used in DenseStore, SparseStore, and their variants.
*/
class Bin {
public:
/**
* @brief Construct a new Bin.
* @param index The index representing the mapped value bucket.
* @param count The number of samples in this bin.
*/
Bin(int index, double count);

~Bin() = default;

/**
* @brief Equality operator.
* @param other The other bin to compare with.
* @return True if both bins have the same index and count.
*/
bool operator==(const Bin& other) const;
std::string to_string() const; ///< Human-readable representation of the bin (index and count).

/**
* @brief Get the count of this bin.
* @return The number of samples in the bin.
*/
double get_count() const;

/**
* @brief Get the index of this bin.
* @return The integer index.
*/
int get_index() const;

private:
int index;
double count;
};
}

#include "bin_impl.hpp"

#endif //BIN_H
48 changes: 48 additions & 0 deletions ddsketch/include/bin_impl.hpp
@@ -0,0 +1,48 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/

#ifndef BIN_IMPL_H
#define BIN_IMPL_H

#include "bin.hpp"

namespace datasketches {
inline Bin::Bin(int index, double count): index(index), count(count) {}

inline bool Bin::operator==(const Bin& other) const {
if (this == &other) {
return true;
}
return index == other.index && count == other.count;
}

inline double Bin::get_count() const {
return count;
}

inline int Bin::get_index() const {
return index;
}

inline std::string Bin::to_string() const {
return "Bin{index= " + std::to_string(index) + ", count= " + std::to_string(count) + "}";
}

}
#endif //BIN_IMPL_H
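To illustrate how the `Bin` value type above behaves, here is a minimal sketch. It repeats a condensed copy of the class in a separate, hypothetical `sketch_example` namespace so that it compiles on its own; the `total_count` helper is an assumption made for illustration and is not part of this PR.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Condensed stand-in for datasketches::Bin from this diff, repeated here
// only so the sketch is self-contained. The namespace name is hypothetical.
namespace sketch_example {

class Bin {
public:
  Bin(int index, double count) : index_(index), count_(count) {}

  // Two bins are equal when both index and count match exactly.
  bool operator==(const Bin& other) const {
    return index_ == other.index_ && count_ == other.count_;
  }

  int get_index() const { return index_; }
  double get_count() const { return count_; }

  // Same format as the to_string() in bin_impl.hpp.
  std::string to_string() const {
    return "Bin{index= " + std::to_string(index_) +
           ", count= " + std::to_string(count_) + "}";
  }

private:
  int index_;
  double count_;
};

// Hypothetical helper (not in the PR): sums the counts of a list of bins,
// the way a store might compute its total count from its bins.
inline double total_count(const std::vector<Bin>& bins) {
  double total = 0.0;
  for (const Bin& b : bins) {
    total += b.get_count();
  }
  return total;
}

} // namespace sketch_example
```

Note that `operator==` compares the `double` counts exactly, so two bins whose counts differ only by floating-point rounding compare unequal. A tolerance-based comparison may be preferable in tests that accumulate fractional counts.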
90 changes: 90 additions & 0 deletions ddsketch/include/collapsing_dense_store.hpp
@@ -0,0 +1,90 @@
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/

#ifndef COLLAPSING_DENSE_STORE_HPP
#define COLLAPSING_DENSE_STORE_HPP

#include "dense_store.hpp"

namespace datasketches {

/**
* @class CollapsingDenseStore
* @brief Common logic for capacity-bounded dense stores with tail-collapsing.
*/
template<class Derived, int N, typename Allocator>
class CollapsingDenseStore : public DenseStore<Derived, Allocator> {
public:

using size_type = typename DenseStore<Derived, Allocator>::size_type;
CollapsingDenseStore();

/**
* Copy assignment
* @param other store to be copied
* @return reference to this store
*/
CollapsingDenseStore<Derived, N, Allocator>& operator=(const CollapsingDenseStore<Derived, N, Allocator>& other);

/**
* This method serializes the store into a given stream in a binary form
* @param os output stream
*/
void serialize(std::ostream& os) const;

/**
* @brief Deserialize a store from a stream and return it as a new instance.
* @param is Input stream.
*/
static Derived deserialize(std::istream& is);

/**
* Computes size needed to serialize the current state of the store.
* @return size in bytes needed to serialize this store
*/
int get_serialized_size_bytes() const;

~CollapsingDenseStore() = default;

/**
* @brief Clear all contents of the store.
*
* Removes all bins and resets counts to zero while preserving configuration
* (e.g., capacity limits). After this call, @c total_count() is 0 and the
* store contains no non-empty bins.
*/
void clear();

protected:
bool is_collapsed;

/**
* @brief Compute the resized backing-array length for a target index span.
*
* @param new_min_index Lowest bin index to be retained (inclusive).
* @param new_max_index Highest bin index to be retained (inclusive).
* @return size_type New backing-array capacity (in bins).
*/
size_type get_new_length(size_type new_min_index, size_type new_max_index) const;
};
}

#include "collapsing_dense_store_impl.hpp"

#endif //COLLAPSING_DENSE_STORE_HPP
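The `template<class Derived, int N, typename Allocator>` signature above follows the CRTP (static polymorphism) pattern: the base class holds the shared logic and calls into `Derived` at compile time for the collapsing policy. The sketch below shows that idea in isolation. It is not the PR's implementation: it uses a `std::map` instead of a dense backing array, omits the allocator, and every name in the `crtp_example` namespace (`BoundedStore`, `collapse_one`, and so on) is hypothetical.

```cpp
#include <cassert>
#include <iterator>
#include <map>

namespace crtp_example {

// Base class: shared bookkeeping. The collapsing policy comes from Derived,
// resolved at compile time (no virtual calls).
template<class Derived, int N>
class BoundedStore {
  static_assert(N >= 1, "the store must keep at least one bin");
public:
  void add(int index, double count = 1.0) {
    counts_[index] += count;
    // Fold bins together until at most N distinct indices remain.
    while (static_cast<int>(counts_.size()) > N) {
      static_cast<Derived*>(this)->collapse_one(counts_);
    }
  }

  double get_count(int index) const {
    auto it = counts_.find(index);
    return it == counts_.end() ? 0.0 : it->second;
  }

  double get_total_count() const {
    double total = 0.0;
    for (const auto& kv : counts_) {
      total += kv.second;
    }
    return total;
  }

protected:
  std::map<int, double> counts_; // index -> count, ordered by index
};

// Policy 1: merge the lowest bin into its upper neighbour.
template<int N>
class CollapsingLowest : public BoundedStore<CollapsingLowest<N>, N> {
public:
  void collapse_one(std::map<int, double>& counts) {
    auto lowest = counts.begin();
    auto neighbour = std::next(lowest);
    neighbour->second += lowest->second;
    counts.erase(lowest);
  }
};

// Policy 2: merge the highest bin into its lower neighbour.
template<int N>
class CollapsingHighest : public BoundedStore<CollapsingHighest<N>, N> {
public:
  void collapse_one(std::map<int, double>& counts) {
    auto highest = std::prev(counts.end());
    auto neighbour = std::prev(highest);
    neighbour->second += highest->second;
    counts.erase(highest);
  }
};

} // namespace crtp_example
```

In both policies the total count is preserved when bins are folded; only the resolution at one tail of the distribution is lost. Which tail to sacrifice depends on which quantiles matter to the caller.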