The global transition toward sustainable energy systems and the growing complexity of energy data require advanced analytical approaches that capture nonlinear, multidimensional, and temporally dependent relationships. This study proposes a comprehensive machine learning (ML) framework for predicting electricity generation from renewable sources. The dataset comprises 3649 records from 176 countries between 2000 and 2020, with 21 economic, demographic, and environmental indicators. To evaluate the impact of input dimensionality, two experimental scenarios were developed: one using all available features and another using a reduced subset derived through ten feature selection techniques (filter, wrapper, and hybrid). Four ML algorithms, namely Artificial Neural Network (ANN), Gradient Boosting Regression (GBR), eXtreme Gradient Boosting (XGBoost), and Random Forest (RF), were implemented and assessed using Mean Squared Error (MSE), Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and the coefficient of determination (R²). To reduce the risk of data leakage and provide a more realistic evaluation for panel data, PanelSplit cross-validation was applied, preserving the temporal structure of the observations. In addition, Friedman and Wilcoxon signed-rank tests with Bonferroni correction were used to assess the statistical significance of performance differences among models. The results show that all models achieved strong predictive accuracy, with the tree-based ensemble methods outperforming the neural network. RF achieved the best overall performance (MSE = 39.6791, MAE = 1.5859, RMSE = 6.2991, R² = 0.9955), followed by GBR and XGBoost. Correlation analysis confirmed strong relationships among several energy indicators, supporting the need for dimensionality reduction. SHAP analysis identified Land Area, Electricity from Fossil Fuels, and Renewables as the dominant predictors of renewable electricity generation.
These results indicate that combining feature selection, panel-aware validation, statistical testing, and explainable machine learning provides a robust and interpretable framework for understanding global renewable electricity generation and for supporting data-driven decision-making in sustainable energy planning.
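To make the evaluation protocol concrete, the following is a minimal sketch of time-ordered validation for panel data together with the four reported metrics. It is not the PanelSplit implementation used in the study: the functions `expanding_year_splits` and `regression_metrics`, the toy numbers, and the mean-value baseline are all hypothetical stand-ins chosen for illustration. The only property it aims to show is that every test fold contains a single year and is scored by a model that has seen earlier years only.

```python
import math
from typing import Dict, Iterator, List, Sequence, Tuple


def expanding_year_splits(
    years: Sequence[int], n_test_years: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs for a country-year panel.

    Each test fold is one calendar year; training rows come only from
    strictly earlier years, so no future information leaks into training.
    """
    unique_years = sorted(set(years))
    for test_year in unique_years[-n_test_years:]:
        train_idx = [i for i, y in enumerate(years) if y < test_year]
        test_idx = [i for i, y in enumerate(years) if y == test_year]
        if train_idx:  # skip a fold that would have no training history
            yield train_idx, test_idx


def regression_metrics(
    y_true: Sequence[float], y_pred: Sequence[float]
) -> Dict[str, float]:
    """Compute MSE, MAE, RMSE, and R² for one fold."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mse = ss_res / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R² is undefined when the true values have zero variance.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"MSE": mse, "MAE": mae, "RMSE": math.sqrt(mse), "R2": r2}


# Toy panel (made-up numbers): two countries observed over 2017-2020.
years = [2017, 2018, 2019, 2020] * 2
target = [10.0, 12.0, 13.0, 15.0, 40.0, 42.0, 45.0, 47.0]

for train_idx, test_idx in expanding_year_splits(years, n_test_years=2):
    # Placeholder "model": predict the mean of the training targets.
    baseline = sum(target[i] for i in train_idx) / len(train_idx)
    y_true = [target[i] for i in test_idx]
    y_pred = [baseline] * len(test_idx)
    print(years[test_idx[0]], regression_metrics(y_true, y_pred))
```

In practice the placeholder baseline would be replaced by a fitted regressor such as RF or GBR, and the per-fold metrics would be the inputs to the Friedman and Wilcoxon tests. Because RMSE is the square root of MSE by definition, the reported RF values can be cross-checked directly: the square root of 39.6791 is approximately 6.2991.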
Patrascu et al. (Mon,) studied this question.