Abstract
Neural networks (NNs) are widely used for modeling complex, high-dimensional relationships but often lack interpretability, limiting their adoption in critical domains such as healthcare, finance, and engineering. Symbolic regression (SR), in contrast, generates explicit mathematical expressions that enhance transparency but typically underperforms in predictive accuracy. To bridge this gap, we propose a knowledge distillation framework that uses SR models to approximate the activations of a trained feedforward NN’s final hidden layer. This approach enhances interpretability while retaining a substantial fraction of the neural network’s predictive performance in structured data settings. Evaluated across 20 diverse datasets, our method achieves a 7–21% improvement in RMSE over baseline SR models. These results highlight the potential of symbolic knowledge distillation as a practical tool for enhancing model transparency in structured data applications.
Shmuel et al. studied this question.
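The distillation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The teacher network weights, the input distribution, the `LIBRARY` of candidate expressions, and the helpers `fit_affine` and `distill_unit` are all hypothetical, chosen only for illustration. A brute-force search over a five-expression library, with an affine fit `a * f(x) + b` per candidate, stands in for a real SR engine. Each final-hidden-layer unit gets its own symbolic surrogate, and the original output layer is reused on top of the surrogate activations.

```python
import math
import random

# Hypothetical "trained" teacher (weights invented for illustration):
# 2 inputs -> 2 tanh hidden units (the final hidden layer) -> 1 linear output.
W_HIDDEN = [[0.5, 0.5], [0.5, -0.5]]
B_HIDDEN = [0.0, 0.0]
W_OUT = [1.5, -2.0]
B_OUT = 0.2

def hidden_activations(x):
    """Final-hidden-layer activations of the teacher network."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(W_HIDDEN, B_HIDDEN)]

def output_head(h):
    """Original linear output layer, reused unchanged by the student."""
    return sum(w * hi for w, hi in zip(W_OUT, h)) + B_OUT

# Tiny expression library: an assumed stand-in for a real SR search space.
LIBRARY = {
    "x0": lambda x: x[0],
    "x1": lambda x: x[1],
    "x0 + x1": lambda x: x[0] + x[1],
    "x0 - x1": lambda x: x[0] - x[1],
    "x0 * x1": lambda x: x[0] * x[1],
}

def fit_affine(f_vals, targets):
    """Closed-form least squares for target ~ a * f + b; returns (a, b, mse)."""
    n = len(f_vals)
    f_mean = sum(f_vals) / n
    t_mean = sum(targets) / n
    var = sum((f - f_mean) ** 2 for f in f_vals)
    cov = sum((f - f_mean) * (t - t_mean) for f, t in zip(f_vals, targets))
    a = cov / var if var > 0 else 0.0
    b = t_mean - a * f_mean
    mse = sum((a * f + b - t) ** 2 for f, t in zip(f_vals, targets)) / n
    return a, b, mse

def distill_unit(inputs, targets):
    """Pick the library expression that best reproduces one hidden unit."""
    best = None
    for name, func in LIBRARY.items():
        a, b, mse = fit_affine([func(x) for x in inputs], targets)
        if best is None or mse < best[3]:
            best = (name, a, b, mse)
    return best[:3]

random.seed(0)
inputs = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(200)]
teacher_hidden = [hidden_activations(x) for x in inputs]

# One symbolic surrogate (name, a, b) per hidden unit.
surrogates = [distill_unit(inputs, [h[j] for h in teacher_hidden])
              for j in range(len(W_HIDDEN))]

def student_predict(x):
    """Symbolic hidden activations fed through the teacher's output layer."""
    h_sym = [a * LIBRARY[name](x) + b for name, a, b in surrogates]
    return output_head(h_sym)

# In-sample agreement between the symbolic student and the teacher.
rmse = math.sqrt(sum((student_predict(x) - output_head(h)) ** 2
                     for x, h in zip(inputs, teacher_hidden)) / len(inputs))

for j, (name, a, b) in enumerate(surrogates):
    print(f"h{j}(x) ~ {a:.3f} * ({name}) + {b:.3f}")
print(f"student-vs-teacher RMSE: {rmse:.4f}")
```

In practice the expression library would be replaced by a full SR search and the teacher by a network trained on the dataset of interest. The sketch only shows where the symbolic models sit: between the raw inputs and the retained output layer.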