ABSTRACT Wireless sensor networks (WSNs) in smart grids (SGs) suffer from energy depletion and coverage gaps: nodes near the sink relay more traffic and therefore drain faster, creating hotspots. This paper develops a reinforcement learning (RL)-based mobility optimization framework for WSNs that balances two conflicting objectives: maximizing network lifetime and ensuring high throughput. Four sink strategies (a static sink, multiple sinks, a single mobile sink, and an RL-based mobile sink) are compared under grid and spiral network topologies. A Pareto-optimal reward strategy is proposed to direct the RL agent, and a meta-heuristic search is used alongside it to improve the agent's convergence and exploration in complex environments. The model integrates energy consumption as a key parameter within a reward-driven learning process. Simulation results show that the RL-based mobile sink outperforms the static and deterministic multiple-sink strategies, achieving both a longer network lifetime and higher throughput. Across topologies, grid layouts yield the longer network lifetime, whereas spiral layouts achieve the higher data throughput. The proposed RL-based approach extends the overall network lifetime from 33.5 months to 36.15 months while maintaining high data throughput.
Gomaa et al. studied this question.
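The abstract does not specify the form of the Pareto-optimal reward, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each candidate sink position has already been scored with an estimated lifetime and throughput, keeps the non-dominated candidates, and gives those a weighted, normalized reward. The names `Outcome`, `dominates`, `pareto_front`, and `reward`, the weight of 0.5, and all numbers are hypothetical.

```python
from typing import NamedTuple


class Outcome(NamedTuple):
    """Hypothetical score for one candidate sink position."""
    position: tuple[int, int]  # illustrative grid cell of the sink
    lifetime: float            # estimated network lifetime (months)
    throughput: float          # estimated delivery ratio (0..1)


def dominates(a: Outcome, b: Outcome) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.lifetime >= b.lifetime and a.throughput >= b.throughput
    better = a.lifetime > b.lifetime or a.throughput > b.throughput
    return no_worse and better


def pareto_front(outcomes: list[Outcome]) -> list[Outcome]:
    """Keep only the outcomes that no other outcome dominates."""
    return [o for o in outcomes if not any(dominates(p, o) for p in outcomes)]


def reward(o: Outcome, front: list[Outcome], weight: float = 0.5) -> float:
    """Assumed reward: zero for dominated outcomes, otherwise a weighted sum
    of the two objectives, each normalized by its best value on the front."""
    if o not in front:
        return 0.0
    best_life = max(p.lifetime for p in front)
    best_thr = max(p.throughput for p in front)
    return weight * o.lifetime / best_life + (1 - weight) * o.throughput / best_thr


# Made-up scores for four candidate positions (not results from the paper).
candidates = [
    Outcome((0, 0), 30.0, 0.80),
    Outcome((2, 2), 36.0, 0.85),
    Outcome((4, 1), 34.0, 0.95),
    Outcome((1, 3), 29.0, 0.70),
]
front = pareto_front(candidates)
rewards = {o.position: reward(o, front) for o in candidates}
```

In an RL loop, a reward of this kind would be returned after each sink move, so that the agent is steered toward positions where neither lifetime nor throughput can improve without sacrificing the other. The `weight` parameter is one assumed way to express a preference between the two objectives.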