What question did this study set out to answer?

The aim is to develop a scheduling framework that effectively minimizes latency for edge AI inference workloads using structure-aware techniques.

March 21, 2026 · Open Access

Structure-Aware Deep Reinforcement Learning for Latency-Minimal Scheduling of Edge AI Inference on Heterogeneous Cores

Key Points

Aimed to develop a scheduling framework that minimizes latency for edge AI inference workloads using structure-aware techniques.
Proposed a structure-aware scheduling framework that combines Graph Neural Networks with Deep Reinforcement Learning.
Leveraged a Graph Isomorphism Network to extract high-dimensional embeddings of task dependencies.
Conducted experiments on real-world Deep Neural Network workloads, including ResNet and Inception architectures.
Outperformed classic heuristics such as HEFT in makespan reduction.
Achieved more robust convergence than structure-agnostic learning methods.
Showed that explicitly encoding structural priors is a promising way to address NP-hard scheduling problems in edge AI systems.
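
As a rough illustration of the Graph Isomorphism Network step named in the key points, the sketch below applies one GIN-style update to a small task graph: each node's new embedding is computed from (1 + eps) times its own features plus the sum of its neighbors' features. The function name `gin_layer`, the toy graph, and the use of a single linear map with ReLU in place of the multi-layer perceptron that GIN normally uses are all assumptions made for illustration; this is not the paper's implementation.

```python
def gin_layer(features, neighbors, weight, eps=0.0):
    """One simplified GIN-style update (illustrative only).

    features:  node -> list of floats (current embedding h_v)
    neighbors: node -> list of neighboring nodes (here: DAG predecessors)
    weight:    list of rows; a single linear map standing in for GIN's MLP
    """
    updated = {}
    for node, h in features.items():
        # (1 + eps) * h_v
        agg = [(1.0 + eps) * x for x in h]
        # + sum of neighbor embeddings (sum aggregation is what GIN uses)
        for nb in neighbors.get(node, []):
            agg = [a + b for a, b in zip(agg, features[nb])]
        # Linear map followed by ReLU (assumed stand-in for the MLP)
        updated[node] = [max(0.0, sum(w * a for w, a in zip(row, agg)))
                         for row in weight]
    return updated


# Hypothetical three-layer graph: "a" and "b" both feed into "c".
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
neighbors = {"c": ["a", "b"]}
identity = [[1.0, 0.0], [0.0, 1.0]]

embeddings = gin_layer(features, neighbors, identity)
```

With the identity weight, node "c" ends up with its own features plus those of "a" and "b", which is the kind of structural signal a scalar feature input would not carry.
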

Abstract

The rapid surge in latency-sensitive deep learning workloads at the network edge demands stringent resource scheduling mechanisms. Yet effective scheduling is impeded by the intrinsic computational asymmetry of contemporary edge processors, which imposes severe constraints on task placement. Heuristic algorithms have traditionally been employed to manage these tasks, which are naturally modeled as Directed Acyclic Graphs, but they frequently stagnate at local optima because they cannot adapt dynamically to complex, non-linear topological dependencies. Furthermore, existing learning-based solutions often fail to generalize across diverse neural network architectures, as they typically rely on scalar feature inputs that cannot capture the rich structural priors of the computation graph. To bridge this gap, this paper proposes a structure-aware scheduling framework that combines Graph Neural Networks with Deep Reinforcement Learning. By leveraging a Graph Isomorphism Network to extract high-dimensional topological embeddings of task dependencies, our agent learns to map computational layers to heterogeneous cores with the primary objective of minimizing end-to-end inference latency. Extensive experiments on real-world Deep Neural Network workloads, including ResNet and Inception architectures, demonstrate that the proposed approach not only surpasses classic heuristics such as HEFT in makespan reduction but also converges more robustly than structure-agnostic learning baselines. These findings suggest that explicitly encoding structural priors into learning algorithms is a promising way to address the NP-hard scheduling problems inherent in next-generation edge artificial intelligence systems.
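
To make the scheduling objective concrete, the following minimal sketch computes the makespan of a task DAG for a given layer-to-core assignment on heterogeneous cores. It assumes each core runs one task at a time, tasks are dispatched in a topological order, and communication costs between cores are ignored. The function name `makespan`, the core names, and the execution times are hypothetical and are not taken from the paper.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def makespan(deps, exec_time, assignment):
    """Finish time of the last task under a fixed task-to-core assignment.

    deps:       task -> set of predecessor tasks (the DAG)
    exec_time:  task -> {core: execution time on that core}
    assignment: task -> core chosen for that task
    """
    core_free = {}  # time at which each core becomes idle
    finish = {}     # finish time of each scheduled task
    for task in TopologicalSorter(deps).static_order():
        core = assignment[task]
        # A task can start once all predecessors are done and its core is idle.
        ready = max((finish[p] for p in deps.get(task, ())), default=0.0)
        start = max(ready, core_free.get(core, 0.0))
        finish[task] = start + exec_time[task][core]
        core_free[core] = finish[task]
    return max(finish.values())


# Hypothetical Inception-like block: one layer fans out to two branches.
deps = {
    "conv1": set(),
    "branch_a": {"conv1"},
    "branch_b": {"conv1"},
    "concat": {"branch_a", "branch_b"},
}
exec_time = {
    "conv1": {"big": 2.0, "little": 4.0},
    "branch_a": {"big": 3.0, "little": 5.0},
    "branch_b": {"big": 3.0, "little": 5.0},
    "concat": {"big": 1.0, "little": 2.0},
}

all_big = {task: "big" for task in deps}
split = dict(all_big, branch_b="little")

print(makespan(deps, exec_time, all_big))  # 9.0
print(makespan(deps, exec_time, split))    # 8.0
```

In this toy case, moving one branch to the slower core shortens the makespan because the two branches run in parallel. A scheduler, whether a heuristic such as HEFT or a learned agent, searches over assignments like these for the one with the lowest makespan.
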
