Machine learning models play a crucial role in predictive analytics, particularly in cloud-based environments where vast amounts of performance data must be processed and analysed efficiently. This study examines cloud Central Processing Unit (CPU) utilisation data, which capture the usage patterns of Virtual Machines (VMs), and presents a comprehensive evaluation of four regression models, namely Linear Regression, Decision Tree (TR), Gradient Boosting Machine (GBM), and Multi-Layer Perceptron (MLP) Regressor, to determine their effectiveness in modelling cloud resource utilisation patterns. The performance of each model was assessed using two error metrics: Mean Squared Error (MSE) and Mean Absolute Error (MAE). Our findings reveal that Linear Regression performed the worst, with the highest errors, indicating its inability to handle non-linearity in the data. The Decision Tree improved upon Linear Regression but tended to overfit, leading to moderate accuracy. GBM further enhanced performance, striking a balance between accuracy and computational efficiency. The MLP Regressor, however, substantially outperformed all other models, achieving an MSE of 0.000137 and an MAE of 0.00640, demonstrating exceptional predictive accuracy and generalisation capability. While neural network-based models such as the MLP require greater computational resources, they are better able to capture complex relationships within the dataset. These results suggest that the MLP Regressor is the most effective of the evaluated models for cloud resource prediction, given its minimal error metrics. Future research should explore hyperparameter tuning, hybrid modelling approaches, and the integration of deep learning techniques to further enhance predictive performance while optimising computational costs.
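For readers unfamiliar with the two error metrics, the following minimal sketch shows how MSE and MAE are computed from paired actual and predicted values. It uses the standard definitions of the metrics; the function names and the sample utilisation values are hypothetical illustrations and are not taken from the study's dataset or code.

```python
from typing import Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of the squared differences between actual and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of the absolute differences between actual and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical normalised CPU utilisation values (assumed, not from the study).
actual = [0.20, 0.35, 0.50, 0.65]
predicted = [0.25, 0.30, 0.50, 0.70]

mse = mean_squared_error(actual, predicted)
mae = mean_absolute_error(actual, predicted)
print(f"MSE = {mse:.6f}, MAE = {mae:.6f}")
```

Because MSE squares each error, it is expressed in squared units and penalises large deviations more heavily than MAE does; for errors smaller than 1, as with normalised utilisation, the MSE is therefore numerically smaller than the MAE.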
Nataraj et al. studied this question.