What question did this study set out to answer?

This research aims to enhance gradient boosting by introducing randomness in base learner selection.

February 27, 2026 · Open Access

RandomMachine: Random Base-Learner Selection for Newton Gradient Boosting Ensembles

Key points

Developed an open-source Python library called RandomMachine.
Randomly samples base learners from a user-defined pool in each boosting iteration.
Utilizes multiple learner families including LightGBM, CatBoost, and XGBoost.
Tested on synthetic regression and classification tasks.
Achieved a 1.55% improvement in R² for regression tasks.
Demonstrated a 2.03% increase in accuracy for binary classification tasks.
Improved performance over three fixed-family baselines at similar hyper-parameter budgets.

Abstract

We present RandomMachine, an open-source Python library that extends classical second-order (Newton) gradient boosting by randomly sampling the next base learner from a user-defined pool at each boosting iteration. Unlike standard gradient boosted trees, where every iteration adds a fresh clone of a single fixed model type, RandomMachine stochastically mixes multiple learner families (LightGBM, CatBoost, XGBoost, and arbitrary sklearn-compatible estimators) according to per-model sampling probabilities. This randomised selection increases ensemble diversity, acts as an implicit regulariser, and allows the user to leverage complementary inductive biases of different algorithms within a single coherent boosting procedure. We describe the algorithm, its theoretical motivation, and the software design, and report empirical results on synthetic regression and classification tasks demonstrating improvements of 1.55% in R² on regression and 2.03% in accuracy on binary classification over three fixed-family baselines at comparable hyper-parameter budgets.
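The core idea in the abstract, drawing the next base learner at random from a pool and fitting it with a Newton step, can be illustrated with a minimal sketch. This is not RandomMachine's actual API or implementation: the class names `RandomPoolBooster`, `Stump`, and `Ridge`, all parameters, the choice of logistic loss, and the synthetic data are assumptions made for illustration, and the two toy NumPy learners merely stand in for families such as LightGBM, CatBoost, or XGBoost.

```python
import numpy as np


class Stump:
    """Toy weighted regression stump (hypothetical stand-in for a tree learner)."""

    def fit(self, X, z, w):
        best_err = np.inf
        for j in range(X.shape[1]):
            for t in np.quantile(X[:, j], [0.25, 0.5, 0.75]):
                left = X[:, j] <= t
                if left.all() or not left.any():
                    continue  # degenerate split, skip it
                vl = np.average(z[left], weights=w[left])
                vr = np.average(z[~left], weights=w[~left])
                err = np.sum(w * (z - np.where(left, vl, vr)) ** 2)
                if err < best_err:
                    best_err, self.rule_ = err, (j, t, vl, vr)
        return self

    def predict(self, X):
        j, t, vl, vr = self.rule_
        return np.where(X[:, j] <= t, vl, vr)


class Ridge:
    """Toy weighted ridge regression with intercept (hypothetical stand-in)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, X, z, w):
        A = np.hstack([X, np.ones((len(X), 1))])
        gram = A.T @ (A * w[:, None]) + self.alpha * np.eye(A.shape[1])
        self.coef_ = np.linalg.solve(gram, A.T @ (w * z))
        return self

    def predict(self, X):
        return np.hstack([X, np.ones((len(X), 1))]) @ self.coef_


class RandomPoolBooster:
    """Newton boosting for logistic loss with a randomly sampled learner per round."""

    def __init__(self, pool, probs, n_rounds=50, lr=0.3, seed=0):
        self.pool, self.probs = pool, probs
        self.n_rounds, self.lr, self.seed = n_rounds, lr, seed

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        F = np.zeros(len(y))  # current raw scores (log-odds)
        self.ensemble_ = []
        for _ in range(self.n_rounds):
            p = 1.0 / (1.0 + np.exp(-F))
            g = p - y                               # first derivative of the loss
            h = np.clip(p * (1.0 - p), 1e-6, None)  # second derivative, kept positive
            # Sample the learner family for this round from the pool.
            factory = self.pool[rng.choice(len(self.pool), p=self.probs)]
            # Newton step: weighted least squares on -g/h with weights h.
            learner = factory().fit(X, -g / h, h)
            F += self.lr * learner.predict(X)
            self.ensemble_.append(learner)
        return self

    def decision_function(self, X):
        return self.lr * sum(m.predict(X) for m in self.ensemble_)

    def predict(self, X):
        return (self.decision_function(X) > 0).astype(float)


# Synthetic binary task with a linear part and a non-linear part.
data_rng = np.random.default_rng(0)
X = data_rng.normal(size=(400, 3))
y = (X[:, 0] + np.abs(X[:, 1]) > 1.0).astype(float)

model = RandomPoolBooster(pool=[Stump, Ridge], probs=[0.5, 0.5]).fit(X, y)
train_acc = np.mean(model.predict(X) == y)
print(f"training accuracy: {train_acc:.3f}")
```

In this sketch the per-model sampling probabilities are the `probs` argument. Setting one entry to 1.0 recovers ordinary single-family Newton boosting, which is a convenient way to compare the mixed ensemble against a fixed-family baseline under the same number of rounds and learning rate.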
