In an Internet of Things (IoT) environment consisting of various devices, the arrival rate of data packets changes dynamically. Failure to process the packets in compliance with the QoS requirements can significantly degrade the reliability and quality of the system. Therefore, the gateway collecting the data needs to quickly establish a new scheduling policy as traffic conditions change. Existing packet scheduling schemes are not effective for IoT because the data transmission pattern is not defined in advance. Q-learning is a type of reinforcement learning that can establish a dynamic scheduling policy according to the state of each queue without any prior knowledge of the network status. In this paper, a novel Q-learning scheme is proposed that updates the Q-table and reward table based on the condition of the queues in the gateway and adjusts the reward value according to the time slot. Computer simulation reveals that the proposed scheme significantly reduces the scheduling time while maintaining high accuracy compared to the existing Q-learning scheme based on Stochastic Learning Automaton (SLA).
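To make the idea of queue-state-driven scheduling concrete, the following is a minimal sketch of standard tabular Q-learning applied to a toy gateway. It is not the scheme proposed in the paper: it omits the reward table and the time-slot-based reward adjustment, which the abstract does not specify. The queue count, capacity, arrival probabilities, reward definition (negative total backlog), and all function names are assumptions chosen purely for illustration.

```python
import random
from collections import defaultdict

# All constants below are hypothetical illustration values, not from the paper.
N_QUEUES = 2             # two traffic classes at the gateway
MAX_LEN = 5              # queue capacity in packets
ARRIVAL_P = (0.6, 0.3)   # per-slot arrival probability for each queue


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard Q-learning update: Q(s,a) += alpha * (r + gamma * max Q(s',.) - Q(s,a))."""
    best_next = max(q_table[next_state])
    td_error = reward + gamma * best_next - q_table[state][action]
    q_table[state][action] += alpha * td_error
    return q_table[state][action]


def choose_queue(q_table, state, epsilon, rng):
    """Epsilon-greedy choice of which queue to serve in this time slot."""
    if rng.random() < epsilon:
        return rng.randrange(N_QUEUES)
    row = q_table[state]
    return row.index(max(row))


def step(queues, action, rng):
    """Serve one packet from the chosen queue, then admit random arrivals."""
    queues = list(queues)
    if queues[action] > 0:
        queues[action] -= 1
    for i, p in enumerate(ARRIVAL_P):
        if rng.random() < p and queues[i] < MAX_LEN:
            queues[i] += 1
    # Assumed reward: penalize the total backlog left after this slot.
    reward = -float(sum(queues))
    return tuple(queues), reward


def train(slots=5000, epsilon=0.1, seed=0):
    """Learn a scheduling policy online; the state is the tuple of queue lengths."""
    rng = random.Random(seed)
    q_table = defaultdict(lambda: [0.0] * N_QUEUES)
    state = (0,) * N_QUEUES
    for _ in range(slots):
        action = choose_queue(q_table, state, epsilon, rng)
        next_state, reward = step(state, action, rng)
        q_update(q_table, state, action, reward, next_state)
        state = next_state
    return q_table
```

After calling `train()`, the learned policy for a given queue state can be read with `choose_queue(q_table, state, 0.0, rng)`, which returns the index of the queue with the highest learned value. Because the state is simply the observed queue lengths, no prior model of the arrival process is needed, which is the property the abstract attributes to Q-learning.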
Kim et al. studied this question.