Highly accurate prediction models typically require high-dimensional structures, which can obscure the underlying physical mechanisms. Consequently, balancing prediction accuracy with model interpretability remains a fundamental challenge in binding energy prediction. Herein, we propose a residual-compensated SISSO-RFR framework for predicting the binding energies of key atomic species (C, N, and O). First, an interpretable SISSO model incorporating five descriptors is developed; these descriptors highlight the significant contribution of surface elements with strong affinity for the adsorbate. Subsequently, a Random Forest Regression (RFR) model compensates for the residual errors between the calculated values and the SISSO predictions. The resulting SISSO-RFR framework achieves significantly higher accuracy than both conventional ensemble models and dual SISSO approaches. This synergistic integration balances the transparency inherent in white-box SISSO models with the predictive accuracy gained through RFR error compensation, offering a streamlined approach to deepen understanding of structure-performance relationships and accelerate catalyst discovery.
Yang et al. studied this question.
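The residual-compensation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. SISSO is not part of scikit-learn, so a plain `LinearRegression` on the first five feature columns stands in for the interpretable five-descriptor model; that substitution, the function names, the hyperparameters, and the synthetic data are all assumptions made for illustration only.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression


def make_synthetic_data(n_samples=200, n_features=8, seed=0):
    """Synthetic stand-in data (NOT real binding energies)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_features))
    # Linear part (captured by the base model) plus a nonlinear part
    # that the base model cannot represent.
    y = X[:, 0] - 0.5 * X[:, 1] + np.sin(3.0 * X[:, 5])
    return X, y


def fit_residual_compensated(X, y, n_base_features=5, seed=0):
    """Fit an interpretable base model, then a random forest on its residuals."""
    # Assumption: a linear model on the first columns stands in for SISSO.
    base = LinearRegression().fit(X[:, :n_base_features], y)
    # Residuals = reference values minus base-model predictions.
    residuals = y - base.predict(X[:, :n_base_features])
    # The forest sees all features and learns what the base model missed.
    rfr = RandomForestRegressor(n_estimators=200, random_state=seed)
    rfr.fit(X, residuals)
    return base, rfr


def predict_residual_compensated(base, rfr, X, n_base_features=5):
    """Final prediction = base prediction + predicted residual."""
    return base.predict(X[:, :n_base_features]) + rfr.predict(X)
```

In this arrangement the base model remains inspectable on its own (here, through its coefficients), while the forest only corrects what the base model leaves unexplained. A fair comparison of accuracy would require held-out data, which this sketch does not include.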