ABSTRACT
Background The rapid growth of Internet of Things (IoT) ecosystems has generated large volumes of heterogeneous data, placing significant load on network latency, device energy consumption, and computing resources. Conventional cloud-centric architectures cannot deliver the responsiveness that current IoT applications require, owing to high transmission latency and inefficient resource coordination.
Objective To overcome these constraints, this paper proposes an intelligent task offloading system based on Deep Reinforcement Learning (DRL), implemented with an Advantage Actor-Critic (A2C) framework.
Methods The offloading problem is modelled as a Markov Decision Process (MDP) whose state variables describe task characteristics, the device battery, the channel, and server usage. The A2C agent dynamically learns the best policies for splitting computation across the device, edge, fog, and cloud layers, jointly optimizing latency, energy usage, and resource utilization. The system is evaluated in a realistic simulation environment that includes heterogeneous IoT devices, mobility, and stochastic wireless conditions.
Results Experiments show that the proposed framework reduces average latency by up to 76 percent and device-level energy consumption by 74 percent compared with baseline strategies such as local execution, greedy offloading, and DQN-based learning. The framework also improves the rate of tasks completed within their time constraints and the allocation of workload across computing layers, which increases scalability, robustness, and overall Quality of Service (QoS) in large-scale IoT deployments.
Conclusion The findings indicate the promise of actor-critic DRL for supporting next-generation IoT applications in fields including smart cities, healthcare monitoring, and industrial automation.
Btoush et al. (Thu,) studied this question.
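To make the MDP formulation in the abstract concrete, the following is a minimal sketch and not the authors' implementation. The field names, units, reward weights, and the choice to penalize server utilization are all assumptions made for illustration; the abstract only states which quantities enter the state and which objectives are jointly optimized. The advantage function shown is the standard one-step temporal-difference estimate commonly used with A2C.

```python
from dataclasses import dataclass

# The four computing layers named in the abstract, used here as the action set.
LAYERS = ("device", "edge", "fog", "cloud")


@dataclass(frozen=True)
class OffloadState:
    """Hypothetical state vector; field names and units are assumptions."""
    task_size_mb: float       # task characteristic (assumed descriptor)
    deadline_s: float         # task characteristic (assumed descriptor)
    battery_frac: float       # device battery, 0..1
    channel_rate_mbps: float  # channel condition
    server_util: float        # server usage, 0..1


def reward(latency_s, energy_j, server_util,
           w_latency=1.0, w_energy=1.0, w_util=1.0):
    """Negative weighted cost (assumed form): lower latency, energy,
    and server load yield a higher reward."""
    return -(w_latency * latency_s + w_energy * energy_j + w_util * server_util)


def one_step_advantage(r, value_s, value_next, gamma=0.99, done=False):
    """One-step TD advantage: A = r + gamma * V(s') - V(s).
    The bootstrap term is dropped when the episode has ended."""
    bootstrap = 0.0 if done else gamma * value_next
    return r + bootstrap - value_s
```

In an A2C training loop, the critic would supply `value_s` and `value_next`, and the resulting advantage would scale the actor's log-probability gradient for the chosen layer in `LAYERS`.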