This study develops a data-driven predictive framework that integrates hybrid feature selection, interpretable machine learning, and uncertainty quantification to forecast Olympic medal performance among elite nations. Focusing on the top ten countries from Paris 2024, the analysis employs a three-stage feature selection procedure combining Spearman correlation screening, random forest embedded importance, and the Caterpillar Fungus Optimizer (CFO) to identify stable long-term predictors. Two new predictors are introduced: a test variable, rank, that captures historical competitive strength, and a refined continuous host-effect indicator derived from gravity-type trade models. Two complementary modeling strategies, a two-way fixed-effects econometric model and a CFO-optimized random forest, are implemented and validated. SHAP, LIME, and partial dependence plots enhance model interpretability and reveal nonlinear mechanisms underlying medal outcomes. Kernel density estimation generates probabilistic interval forecasts for Los Angeles 2028. Results demonstrate that historical performance and event-specific characteristics dominate medal predictions, while macroeconomic factors (GDP, population) and conventional host status contribute only marginally once related variables are controlled. Consistent variable rankings across models and close alignment between 2028 projections and 2024 outcomes validate the framework’s robustness and practical applicability for sports policy and resource allocation decisions.
Chen et al. studied this question.