In recent years, Machine Learning (ML) models have been introduced across diverse scientific fields due to their strong predictive performance. However, in many applications the demand for interpretable models outweighs the need for accuracy alone. Symbolic Regression (SR) offers a solution with its concise and explainable formulas, yet it typically falls short in accuracy when compared to other ML models such as Random Forests. This paper proposes a novel two-step ensemble approach, inspired by the student-teacher model in knowledge distillation. First, a Gradient Boosting Model (GBM) is trained to harness its high accuracy; this model acts as the ‘teacher’. Next, an SR ‘student’ model is trained on the output of the GBM, distilling the teacher's knowledge into a form that combines interpretability with improved performance. We perform an extensive evaluation of this method across three different SR models and 10 regression datasets. Our findings show that the proposed method yields a consistent improvement of 3.3–6.7% in accuracy compared to standard SR models. Furthermore, this process serves as an explainability framework that helps to uncover and interpret the decisions of complex ML models.
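The two-step procedure described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The teacher here is a toy gradient booster built from one-dimensional decision stumps, standing in for a full GBM. The student is an exhaustive search over a small, fixed library of candidate formulas of the form a*f(x)+b, standing in for a real SR engine. All function names, the candidate library, and the synthetic data are assumptions made for illustration.

```python
import math
import random


def fit_stump(xs, residuals):
    """Return (threshold, left_mean, right_mean) minimising squared error."""
    best = None
    order = sorted(set(xs))
    for lo, hi in zip(order, order[1:]):
        t = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    return best[1:]


def fit_teacher(xs, ys, rounds=50, lr=0.3):
    """Toy gradient boosting with stumps (squared loss); returns a predict function."""
    base = sum(ys) / len(ys)
    stumps, preds = [], [base] * len(xs)
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lm, rm = fit_stump(xs, residuals)
        stumps.append((t, lm, rm))
        preds = [p + lr * (lm if x <= t else rm) for x, p in zip(xs, preds)]

    def predict(x):
        return base + lr * sum(lm if x <= t else rm for t, lm, rm in stumps)

    return predict


# Hypothetical candidate library; a real SR engine searches a far larger space.
CANDIDATES = {
    "x": lambda x: x,
    "x**2": lambda x: x * x,
    "sin(x)": math.sin,
    "exp(x)": math.exp,
}


def fit_student(xs, targets):
    """Pick the candidate f whose least-squares fit a*f(x)+b best matches targets."""
    best, n = None, len(xs)
    for name, f in CANDIDATES.items():
        fs = [f(x) for x in xs]
        fm, tm = sum(fs) / n, sum(targets) / n
        var = sum((v - fm) ** 2 for v in fs)
        a = sum((v - fm) * (t - tm) for v, t in zip(fs, targets)) / var
        b = tm - a * fm
        err = sum((a * v + b - t) ** 2 for v, t in zip(fs, targets))
        if best is None or err < best[0]:
            best = (err, name, a, b)
    return best[1:]


def distill_demo(seed=0):
    """Step 1: fit the teacher on noisy data. Step 2: fit the student on teacher output."""
    rng = random.Random(seed)
    xs = [-3 + 6 * i / 59 for i in range(60)]
    ys = [2 * x * x + 1 + rng.gauss(0, 0.3) for x in xs]  # synthetic ground truth
    teacher = fit_teacher(xs, ys)
    teacher_out = [teacher(x) for x in xs]  # the student never sees ys directly
    return fit_student(xs, teacher_out)


name, a, b = distill_demo()
print(f"student formula: {a:.2f} * {name} + {b:.2f}")
```

The point of the sketch is the data flow: the student is fitted to the teacher's predictions rather than to the raw labels, so the resulting formula is a readable approximation of what the teacher learned.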
Assaf Shmuel
Teddy Lazebnik
University of Haifa
Oren Glickman
SHILAP Revista de lepidopterología
IEEE Access
Bar-Ilan University
University of Haifa
Shmuel et al. studied this question.
synapsesocial.com/papers/69a75b2dc6e9836116a22039 — DOI: https://doi.org/10.1109/access.2026.3657793