Elastic cloud systems increasingly employ machine learning (ML) to automate resource scaling in response to variable workloads and stringent service-level objectives (SLOs). However, current ML-based autoscalers are fragmented across platforms, objectives, and evaluation frameworks. This survey examines 60 primary studies conducted between 2015 and 2025 and categorises them according to a five-dimensional taxonomy: goal, decision logic, scaling mode, control scope, and deployment. It classifies supervised, unsupervised, and reinforcement learning approaches and analyses their integration into practical frameworks, including Kubernetes-based controllers and cloud provider services. It summarises how ML is applied to workload prediction, proactive and hybrid horizontal–vertical scaling, and adaptive policy optimisation. It also synthesises common evaluation practices, covering workloads, metrics, and benchmarks. The analysis identifies open challenges: actuation delays and telemetry lag, the complexity of hybrid scaling, coordination across multi-service and edge-cloud deployments, and the limited joint treatment of cost, SLO, and energy objectives. These gaps call for further research on unified ML-driven orchestration, multi-agent and federated control, standardised benchmarks, and sustainability-aware autoscaling.
Machiraju et al. (Mon,) studied this question.