feat: polished xgboost.py #25

eschmidt42 · 2025-08-18T13:35:08Z

This pull request refactors and centralizes key gradient and transformation utilities for gradient boosting and XGBoost models, improving code reuse, modularity, and maintainability. It moves gradient calculation functions and transformation utilities into dedicated modules, updates imports throughout the codebase, and removes redundant or duplicate code. Additionally, it updates the fitting and prediction logic in model classes to use the new shared utilities, and cleans up related tests.

Refactoring and code organization:

Moved gradient-related functions (get_pseudo_residual_mse, get_pseudo_residual_log_odds, get_start_estimate_mse, get_start_estimate_log_odds, and check_y_float) from gradientboostedtrees.py and xgboost.py into a new module gradient.py, updating all model classes to import and use these functions.
Moved transformation utilities (vectorize_bool_to_float, get_probabilities_from_mapped_bools) into a new transform.py module, and updated all relevant imports.
Removed now-redundant implementations of gradient and transformation functions from gradientboostedtrees.py, xgboost.py, and utils.py.
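
For readers who have not seen these helpers, the following is a minimal sketch of what the four gradient functions conventionally compute. The function names match the ones listed above, but the signatures, the use of plain Python lists, and the 0/1 label encoding are assumptions made for illustration; the actual code in gradient.py may differ.

```python
import math


def get_start_estimate_mse(y):
    """Constant prediction minimizing squared error: the mean of y."""
    return sum(y) / len(y)


def get_pseudo_residual_mse(y, estimate):
    """Negative gradient of 0.5 * (y - estimate)**2 with respect to estimate."""
    return [yi - ei for yi, ei in zip(y, estimate)]


def get_start_estimate_log_odds(y):
    """Log-odds of the positive class, assuming labels encoded as 0/1
    and both classes present (otherwise the ratio is undefined)."""
    p = sum(y) / len(y)
    return math.log(p / (1.0 - p))


def _sigmoid(x):
    # Hypothetical private helper, not named in the PR.
    return 1.0 / (1.0 + math.exp(-x))


def get_pseudo_residual_log_odds(y, estimate):
    """Negative gradient of the log loss with respect to the raw score:
    label minus predicted probability (labels assumed to be 0/1)."""
    return [yi - _sigmoid(ei) for yi, ei in zip(y, estimate)]
```

In this sketch each boosting round would fit the next tree to the pseudo-residuals, starting from the constant returned by the matching start-estimate function.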

Model logic improvements:

Updated the fit and predict methods in GradientBoostedTreesRegressor, GradientBoostedTreesClassifier, XGBoostRegressor, and XGBoostClassifier to use the new centralized gradient and transformation utilities, including proper handling of first and second derivatives where appropriate.
Ensured consistent use of the ensure_all_finite parameter in data validation across all relevant model methods.
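
As background on why the XGBoost models need both derivatives, here is a rough sketch of the standard second-order (Newton) leaf update for the log loss. The names log_loss_derivatives, newton_leaf_weight, and reg_lambda are hypothetical and chosen for illustration only; they are not taken from this repository, and labels are assumed to be encoded as 0/1.

```python
import math


def log_loss_derivatives(y, raw_score):
    """First (g) and second (h) derivatives of the log loss with respect
    to the raw score, for labels assumed to be 0/1."""
    p = [1.0 / (1.0 + math.exp(-s)) for s in raw_score]
    g = [pi - yi for pi, yi in zip(p, y)]  # gradient
    h = [pi * (1.0 - pi) for pi in p]      # hessian (diagonal)
    return g, h


def newton_leaf_weight(g, h, reg_lambda=1.0):
    """Optimal leaf value under the second-order approximation:
    -sum(g) / (sum(h) + reg_lambda), with L2 regularization reg_lambda."""
    return -sum(g) / (sum(h) + reg_lambda)


# Two positive samples with raw score 0.0 fall into the same leaf.
g, h = log_loss_derivatives([1, 1], [0.0, 0.0])
weight = newton_leaf_weight(g, h, reg_lambda=1.0)
```

Plain gradient boosting only needs g (its negative is the pseudo-residual the next tree is fitted to), whereas the XGBoost-style update also divides by the summed second derivatives, which is why the two model families can share first-derivative code but not the full update.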

Test cleanup:

Removed redundant or now-unnecessary tests for gradient functions from test_gradientboostedtrees.py, as these are now covered by the new centralized implementations.

These changes improve code clarity, reduce duplication, and make it easier to maintain and extend the gradient boosting and XGBoost implementations.

…tedtrees.py, also got rid of some redundant code that both modules should share, re-located re-used functions to gradient.py and transform.py, added tests

feat: polished xgboost.py a bit to use similar naming as gradientboos…

afad757

eschmidt42 merged commit 9ff0496 into main Aug 18, 2025
4 checks passed
