Kubernetes, an open-source project initiated by Google for managing and orchestrating containers on cloud platforms, has become the preferred choice for deploying large-scale containerized microservice architectures. Its scheduler considers constraints defined by workload owners and cluster managers to identify the most suitable node to host a given task. Although the scheduler can be configured in many ways, the default scheduler that ships with Kubernetes does not efficiently handle the demands of Horizontal Pod Autoscaling (HPA), particularly when a large number of similar pods are deployed simultaneously. This article therefore focuses on optimizing the Kubernetes scheduler so that it allocates and manages resources more efficiently in rapid pod autoscaling scenarios. It introduces a custom scheduler that uses a caching mechanism to avoid redundant scoring steps, thereby accelerating the scheduling process when pods are scaled horizontally. The article begins with an in-depth literature review, followed by the development of novel algorithms that address gaps in the default scheduler. The custom scheduler is then evaluated through simulation and testing to assess its robustness and efficiency. Experimental results demonstrate that the proposed approach improves scheduling performance for HPA in Kubernetes.
Zhou et al. studied this question.
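The idea of reusing node scores for similar pods can be illustrated with a minimal sketch. It is not the scheduler proposed in the article and does not use the Kubernetes scheduling framework API; it is a toy simulation under stated assumptions. The names `PodSpec`, `Node`, `CachingScheduler`, and `run_demo` are hypothetical, the scoring function is a simple stand-in that prefers nodes with more free capacity, and the invalidation rule (drop the cached score only for the node that just received a pod) is one possible design, chosen here for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PodSpec:
    """Hypothetical pod template: identical replicas share one PodSpec."""
    cpu: int  # requested CPU in millicores
    mem: int  # requested memory in MiB


@dataclass
class Node:
    name: str
    free_cpu: int
    free_mem: int


class CachingScheduler:
    """Toy scheduler that caches per-node scores for each pod template."""

    def __init__(self, nodes):
        self.nodes = {n.name: n for n in nodes}
        self.cache = {}       # PodSpec -> {node name: cached score}
        self.score_calls = 0  # how many times the scoring step actually ran

    def _fits(self, node, spec):
        # Filtering step: cheap feasibility check, always re-evaluated.
        return node.free_cpu >= spec.cpu and node.free_mem >= spec.mem

    def _score(self, node, spec):
        # Scoring step (assumed to be the expensive part): a stand-in
        # that favours nodes with more capacity left after placement.
        self.score_calls += 1
        return (node.free_cpu - spec.cpu) + (node.free_mem - spec.mem)

    def schedule(self, spec):
        scores = self.cache.setdefault(spec, {})
        best = None
        for name, node in self.nodes.items():
            if not self._fits(node, spec):
                continue
            if name not in scores:  # cache miss: score this node once
                scores[name] = self._score(node, spec)
            if best is None or scores[name] > scores[best]:
                best = name
        if best is None:
            return None  # no feasible node
        chosen = self.nodes[best]
        chosen.free_cpu -= spec.cpu
        chosen.free_mem -= spec.mem
        # Only the chosen node changed, so only its cached scores are stale.
        for cached in self.cache.values():
            cached.pop(best, None)
        return best


def run_demo():
    """Schedule six identical pods on three identical nodes."""
    nodes = [Node(f"n{i}", free_cpu=4000, free_mem=8192) for i in (1, 2, 3)]
    scheduler = CachingScheduler(nodes)
    spec = PodSpec(cpu=500, mem=512)
    placements = [scheduler.schedule(spec) for _ in range(6)]
    return placements, scheduler.score_calls
```

In this toy run, `run_demo()` spreads the six pods across `n1`, `n2`, `n3` in turn and performs 8 scoring calls, whereas scoring every node for every pod would take 18. The saving comes entirely from the assumption that a node's score for a given pod template stays valid until that node's free resources change.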