What question did this study set out to answer?

The main aim is to predict CO2 emissions from power stations using a novel feature selection framework.

March 1, 2026 · Open Access

An Explainable Multi-Stage Feature Selection Framework for Power-Station CO2 Emissions Forecasting

Key Points

Developed a three-stage feature selection process combining filter, wrapper, and embedded techniques.
Applied machine learning models including XGBoost, Random Forest, LSTM, and SVR.
Utilized SHAP and LIME for model interpretability, offering insights into emission drivers.
XGBoost, the best-performing model, achieved an RMSE of 28.5, an MAE of 19.8, and an R2 of 0.96 on a real-world dataset.
The proposed framework is reported to outperform state-of-the-art forecasting models.
Results provide actionable insights for policymakers and operators focused on CO2 reduction.
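
For readers less familiar with the three scores quoted in the key points, the following is a minimal sketch of how RMSE, MAE, and R2 are conventionally computed. The function names and the toy numbers are illustrative assumptions, not code or data from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared residual."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up observed and predicted emissions (hypothetical values, arbitrary units).
observed = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 140.0, 210.0, 240.0]

scores = {
    "rmse": rmse(observed, predicted),
    "mae": mae(observed, predicted),
    "r2": r2(observed, predicted),
}
print(scores)
```

RMSE and MAE are expressed in the units of the target variable, so the reported 28.5 and 19.8 can only be interpreted alongside the dataset's emission units, which this summary does not state. R2 is unitless. Because RMSE squares the residuals before averaging, it is never smaller than MAE and is more sensitive to occasional large errors.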

Abstract

Accurate forecasting of CO2 emissions from power stations is critical for effective climate policy and the transition to sustainable energy systems. However, the complexity of power generation processes and the high dimensionality of operational data present significant challenges to traditional modeling approaches. This paper introduces a novel multi-stage framework that integrates advanced feature selection with explainable artificial intelligence (XAI) to deliver high-accuracy forecasts of power station CO2 emissions while maintaining full model transparency. The proposed methodology comprises a three-stage feature selection process, combining filter, wrapper, and embedded methods, to systematically identify the most influential emission drivers from a large set of potential variables. The selected features are then used to train a suite of machine learning models, including XGBoost, Random Forest, LSTM, and SVR. The best-performing model, XGBoost, achieved a Root Mean Square Error (RMSE) of 28.5, a Mean Absolute Error (MAE) of 19.8, and a coefficient of determination (R2) of 0.96 on a real-world dataset. To address the “black-box” nature of these models, we employ SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) to interpret the model’s predictions, providing granular insights into the key factors driving emissions. The results demonstrate that the proposed framework not only outperforms state-of-the-art forecasting models but also offers a clear, interpretable, and actionable tool for policymakers and plant operators to support CO2 reduction strategies. The novelty of this work lies in its combination of a multi-stage feature selection pipeline and a comprehensive XAI-based analysis, providing a robust and transparent solution for a critical environmental challenge.
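
The three-stage selection described in the abstract can be illustrated with a small, dependency-free sketch. Everything in it is an assumption made for illustration: the abstract does not say which filter, wrapper, or embedded methods the authors used, so this sketch swaps in a Pearson-correlation filter, greedy forward selection scored on a hold-out split, and an L1-penalised (Lasso) linear fit. The feature names, thresholds, and synthetic data are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def fit_linear(cols, y, alpha=0.0, sweeps=100):
    """Coordinate descent on standardised columns.

    alpha = 0 gives ordinary least squares; alpha > 0 adds the Lasso (L1) penalty.
    Returns weights (standardised scale), intercept, column means, column std devs.
    """
    n = len(y)
    mu = [sum(c) / n for c in cols]
    sd = [math.sqrt(sum((v - m) ** 2 for v in c) / n) or 1.0 for c, m in zip(cols, mu)]
    z = [[(v - m) / s for v in c] for c, m, s in zip(cols, mu, sd)]
    b = sum(y) / n
    resid = [v - b for v in y]
    w = [0.0] * len(cols)
    for _ in range(sweeps):
        for j, zj in enumerate(z):
            # Covariance of column j with the partial residual (its own term added back).
            rho = sum(a * r for a, r in zip(zj, resid)) / n + w[j]
            new = math.copysign(max(abs(rho) - alpha, 0.0), rho)  # soft threshold
            resid = [r - (new - w[j]) * a for r, a in zip(resid, zj)]
            w[j] = new
    return w, b, mu, sd


def holdout_rmse(data, y, names, train, valid):
    """Fit on the training rows, return RMSE on the validation rows."""
    cols = [[data[k][i] for i in train] for k in names]
    w, b, mu, sd = fit_linear(cols, [y[i] for i in train])
    sq = 0.0
    for i in valid:
        pred = b + sum(wj * (data[k][i] - m) / s for wj, k, m, s in zip(w, names, mu, sd))
        sq += (y[i] - pred) ** 2
    return math.sqrt(sq / len(valid))


def filter_stage(data, y, min_abs_corr=0.3):
    """Stage 1 (filter): keep features whose |correlation| with the target is high enough."""
    return [k for k in data if abs(pearson(data[k], y)) >= min_abs_corr]


def wrapper_stage(data, y, names, train, valid, min_gain=0.05):
    """Stage 2 (wrapper): greedy forward selection scored by hold-out RMSE."""
    chosen, remaining, best = [], list(names), None
    while remaining:
        err, name = min((holdout_rmse(data, y, chosen + [k], train, valid), k) for k in remaining)
        if best is not None and best - err < min_gain * best:
            break  # relative improvement too small: stop adding features
        chosen.append(name)
        remaining.remove(name)
        best = err
    return chosen


def embedded_stage(data, y, names, alpha=2.0):
    """Stage 3 (embedded): keep features with a non-zero Lasso coefficient."""
    w, _, _, _ = fit_linear([data[k] for k in names], y, alpha=alpha)
    return [k for k, wj in zip(names, w) if abs(wj) > 1e-9]


# Synthetic, hypothetical plant data: two real drivers, one noisy copy, one unrelated signal.
rng = random.Random(0)
n = 200
data = {
    "fuel_rate": [rng.uniform(50, 100) for _ in range(n)],
    "load_factor": [rng.uniform(0.3, 1.0) for _ in range(n)],
    "ambient_noise": [rng.gauss(0, 1) for _ in range(n)],
}
data["fuel_rate_noisy"] = [v + rng.gauss(0, 15) for v in data["fuel_rate"]]
co2 = [3 * f + 200 * l + rng.gauss(0, 1) for f, l in zip(data["fuel_rate"], data["load_factor"])]

train, valid = list(range(140)), list(range(140, n))  # chronological-style split
after_filter = filter_stage(data, co2)
after_wrapper = wrapper_stage(data, co2, after_filter, train, valid)
selected = embedded_stage(data, co2, after_wrapper)
print(after_filter, after_wrapper, selected)
```

Each stage only ever narrows the candidate set handed to it, which is the main design point of a cascaded selector: the cheap filter prunes clearly irrelevant variables before the more expensive model-based stages run. In practice, library implementations (for example those in scikit-learn) would likely replace these hand-written routines.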

Cite This Study

Qader et al. studied this question.

synapsesocial.com/papers/69a3d7baec16d51705d2e097 https://doi.org/10.3390/en19051210
