feat: random improvements 2025 08 17 #23
Merged
This pull request refactors the codebase to move all decision-tree-related modules into a new `models` subpackage, updates all import statements accordingly, and introduces utility functions to handle random sampling of features and samples. It also includes some minor improvements to scikit-learn compatibility and code clarity. The changes affect both the library code and the example notebooks.

1. Codebase Restructuring and Import Updates
- Moved the decision tree modules (`estimators.py`, `node.py`, `predict.py`, `split.py`, `train.py`, `visualize.py`, and related objects) from `random_tree_models/decisiontree/` to `random_tree_models/models/decisiontree/` and updated all corresponding import statements throughout the codebase and notebooks.

2. Utility Functions for Random Sampling
- Added `get_random_sample_ids` and `get_random_feature_ids` in `random_tree_models/models/decisiontree/random.py` to encapsulate the logic for randomly selecting samples and features, and refactored their usage in the decision tree estimator code.

3. Improvements to scikit-learn Compatibility
- Use of `ClassifierTags` and the `__sklearn_tags__` method, improving integration with scikit-learn's estimator checks.

4. Minor Enhancements and Cleanups
- Replaced the `dtree` alias with direct imports of `DecisionTreeClassifier` and `DecisionTreeRegressor` in ensemble models (ExtraTrees, GradientBoostedTrees), and simplified type annotations.
- Added a check that `n_trees` is greater than zero in `GradientBoostedTreesRegressor` using `is_greater_zero`.

These changes modernize the code structure, improve maintainability, and enhance compatibility with scikit-learn conventions.
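
For readers unfamiliar with the idea behind the sampling helpers in "Utility Functions for Random Sampling", the following is a minimal sketch of how row and feature subsampling for a single tree might be encapsulated. It is not the code from this pull request: the function names `sample_row_ids` and `sample_feature_ids`, their parameters (`frac`, `replace`, `seed`), and the use of the standard-library `random` module are all assumptions made for illustration. The actual signatures of `get_random_sample_ids` and `get_random_feature_ids` are not shown here.

```python
import random
from typing import List, Optional


def _n_to_draw(n_total: int, frac: float) -> int:
    """Validate inputs and return how many ids to draw (at least one)."""
    if n_total < 1:
        raise ValueError("n_total must be at least 1")
    if not 0.0 < frac <= 1.0:
        raise ValueError("frac must be in (0, 1]")
    return max(1, round(frac * n_total))


def sample_row_ids(
    n_samples: int,
    frac: float = 1.0,
    replace: bool = False,
    seed: Optional[int] = None,
) -> List[int]:
    """Hypothetical helper: choose which rows a tree is trained on."""
    rng = random.Random(seed)
    k = _n_to_draw(n_samples, frac)
    if replace:
        # Bootstrap-style draw: the same row may appear more than once.
        return sorted(rng.choices(range(n_samples), k=k))
    # Subsample without replacement: every row appears at most once.
    return sorted(rng.sample(range(n_samples), k))


def sample_feature_ids(
    n_features: int,
    frac: float = 1.0,
    seed: Optional[int] = None,
) -> List[int]:
    """Hypothetical helper: choose which feature columns are considered."""
    rng = random.Random(seed)
    k = _n_to_draw(n_features, frac)
    return sorted(rng.sample(range(n_features), k))
```

Keeping both draws behind small functions like these means an estimator only needs to pass in the data shape and sampling fractions, which is presumably the motivation for factoring the logic out of the estimator code.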