What question did this study set out to answer?

The aim is to address challenges in maintaining Quality of Service (QoS) in heterogeneous and dynamic edge-to-cloud Kubernetes deployments.

February 28, 2026 · Open Access

QAFHE: A QoS-Aware Framework for Heterogeneous and Dynamic Edge-to-Cloud Kubernetes Deployments

Key Points

Set out to address the challenges of maintaining Quality of Service (QoS) in heterogeneous and dynamic edge-to-cloud Kubernetes deployments.
Introduced Qafhe, a framework integrated with Kubernetes for QoS-aware scheduling.
Conducted experiments deploying inference servers across various nodes.
Assessed performance metrics, focusing on response times and application-specific results.
Showed up to 5× improvement in response times in dynamic scenarios.
Demonstrated enhanced performance in environments with multi-core CPUs and diverse GPU types.

Abstract

The computing continuum paradigm introduces new challenges due to the heterogeneity of computing entities in deployments. These challenges primarily affect the ability to maintain appropriate Quality of Service (QoS) and Service-Level Agreement (SLA) values when assigning workloads and requests to nodes, particularly concerning response times, as well as application-specific metrics. Moreover, in scenarios where computing elements are inherently dynamic in terms of availability, computing power or latency, efficiently assigning workloads to the most appropriate computing element becomes an even more significant challenge. Current generic orchestrators, like Kubernetes, have proven effective in homogeneous and static environments, where the usual QoS-unaware scheduling strategies focus mainly on load balancing, neglecting aspects such as reducing latency or constraining application-level metrics. In this study, we reveal that generic orchestrators like Kubernetes fall short when QoS-agnostic policies are applied to heterogeneous and dynamic edge-to-cloud environments. We introduce Qafhe, a novel framework that integrates effortlessly into Kubernetes. This framework is designed with a set of QoS-aware scheduling policies to effectively address the heterogeneity and dynamicity found in numerous edge-to-cloud setups. Our experiments, specifically involving the deployment of inference servers across diverse nodes, show up to 5× improvement in response times across various dynamic scenarios involving devices with heterogeneous compute capabilities, such as multi-core CPUs and diverse GPU types.
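The contrast the abstract draws between load balancing and QoS-aware assignment can be illustrated with a small sketch. This is not Qafhe's actual policy, which this summary does not describe; the `Node` fields, the node names, the numbers, and the response-time estimate are all hypothetical and chosen only to show why picking the least-loaded node can be a poor choice when nodes differ in speed and latency.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # All fields are illustrative assumptions, not metrics defined by the paper.
    name: str
    active_requests: int        # requests currently queued or running
    service_time_ms: float      # assumed per-request inference time on this node
    network_latency_ms: float   # assumed round-trip latency to this node


def pick_least_loaded(nodes):
    """QoS-unaware baseline: balance load, ignoring node speed and latency."""
    return min(nodes, key=lambda n: n.active_requests)


def estimated_response_ms(node):
    """Rough estimate: wait for queued requests, serve this one, add network time."""
    return (node.active_requests + 1) * node.service_time_ms + node.network_latency_ms


def pick_qos_aware(nodes):
    """Hypothetical QoS-aware choice: minimize the estimated response time."""
    return min(nodes, key=estimated_response_ms)


# Hypothetical heterogeneous deployment: a slow nearby CPU node and a fast remote GPU node.
nodes = [
    Node("edge-cpu", active_requests=1, service_time_ms=120.0, network_latency_ms=5.0),
    Node("cloud-gpu", active_requests=3, service_time_ms=15.0, network_latency_ms=40.0),
]

print(pick_least_loaded(nodes).name)  # edge-cpu (fewest active requests)
print(pick_qos_aware(nodes).name)     # cloud-gpu (lowest estimated response time)
```

With these made-up numbers the least-loaded node has an estimated response time of 245 ms, while the busier GPU node is estimated at 100 ms. In a dynamic deployment the inputs would need to be refreshed as availability, compute power, and latency change.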
