Microservices break applications into independent modules, allowing horizontal scaling of bottlenecks without disruption. Kubernetes manages requests and uses HPA to scale Pods based on thresholds. Current HPA modules, efficient with resource metrics and real-time data, lack thorough service quality analysis. Sudden traffic spikes can cause instability due to frequent scaling or degraded service resulting from delayed responses. This study proposes a modular HPA with a prediction module for service quality, a filtering module for valid options, and a stability module to balance performance and consistency. Results show similar CPU usage and response times to the baselines, but fewer scaling actions, which improves stability and efficiency.
Jang et al. (Mon,) studied this question.