Traditional machine learning approaches for 5G network management rely on data from operational networks, which are often noisy and confounded, making it difficult to identify key influencing factors. This research addresses the critical gap between correlation-based prediction and interpretable, data-driven explanation. To this end, a software-defined standalone 5G architecture was developed using srsRAN and Open5GS to support multi-user scenarios. A multi-user environment was then simulated with GNU Radio, from which the initial dataset was collected. This dataset was then augmented with synthetic samples generated by a Conditional Tabular Generative Adversarial Network (CTGAN) to improve diversity and balance. Several machine learning models, including Linear Regression, Decision Tree, Random Forest, Gradient Boosting, and XGBoost, were trained and evaluated for predicting network performance. Among them, XGBoost achieved the best results, with an R² score of 0.998. To interpret the model, we conducted a SHAP (SHapley Additive exPlanations) analysis, which revealed that the download-to-upload bitrate ratio (dl_ul_ratio) and the upload bitrate (brate_ul) were the most influential features. By leveraging a controlled experimental 5G environment, this study demonstrates how machine learning can move beyond predictive accuracy to uncover the fundamental principles governing 5G system performance, providing a robust foundation for future network optimization.
Nurakhov et al. (Fri,) studied this question.
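
The two quantities the abstract reports, the R² score and per-feature SHAP attributions, can be illustrated with a minimal, dependency-free sketch. This is not the authors' pipeline: the study applied SHAP to a trained XGBoost model, whereas the code below computes exact Shapley values by brute force for a hypothetical toy linear predictor. The feature names dl_ul_ratio and brate_ul come from the abstract. The third feature (noise), the coefficients, and all sample values are assumptions made up purely for illustration.

```python
from itertools import combinations
from math import factorial


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def shapley_values(predict, x, background):
    """Exact Shapley values for one sample x (a dict of feature values).

    Features outside a coalition are replaced by the values of each
    background row, and the predictions are averaged. The cost grows
    exponentially with the number of features, so this only suits toy cases.
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        # Expected prediction when only the coalition's features are fixed to x.
        total = 0.0
        for row in background:
            mixed = {k: (x[k] if k in coalition else row[k]) for k in names}
            total += predict(mixed)
        return total / len(background)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        contribution = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                contribution += weight * (value(s | {name}) - value(s))
        phi[name] = contribution
    return phi


def toy_predict(row):
    # Hypothetical stand-in for a trained model; coefficients are invented.
    return 2.0 * row["dl_ul_ratio"] + 0.5 * row["brate_ul"] + 0.0 * row["noise"]


# Hypothetical background data and sample to explain.
background = [
    {"dl_ul_ratio": 1.0, "brate_ul": 10.0, "noise": 0.0},
    {"dl_ul_ratio": 3.0, "brate_ul": 20.0, "noise": 1.0},
]
sample = {"dl_ul_ratio": 4.0, "brate_ul": 25.0, "noise": 0.5}

phi = shapley_values(toy_predict, sample, background)
baseline = sum(toy_predict(r) for r in background) / len(background)

for feature, contribution in sorted(phi.items(), key=lambda kv: -abs(kv[1])):
    print(f"{feature}: {contribution:+.3f}")
print("prediction - baseline:", toy_predict(sample) - baseline)
```

The attributions sum to the difference between the sample's prediction and the average prediction over the background rows, which is the additivity property that gives SHAP its name. For tree ensembles such as XGBoost, SHAP libraries use far more efficient algorithms than this enumeration.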