Wireless sensor networks (WSNs) deployed for seismic monitoring must sustain long-term operation under strict energy constraints, where premature node failure degrades spatial coverage and detection reliability. This paper presents a safety-constrained reinforcement learning framework for transmission scheduling in energy-harvesting seismic WSNs. The proposed approach integrates Proximal Policy Optimisation (PPO) with action masking and a runtime guard-layer safety filter that enforces battery-preservation and load-balancing constraints without retraining. The guard layer intercepts policy actions and substitutes safe alternatives when constraint violations are detected, using a scoring function that combines battery headroom with network-wide load equity. Experiments across three network scales (10, 15, and 30 nodes) with solar energy harvesting demonstrate that the guard-enhanced PPO achieves 99.46% transmission success at 30 nodes while maintaining 66.47% node survival, a 58.3% improvement in survival over the highest-reward baseline (Closest) at the cost of only a 6.2% reduction in cumulative reward. Crucially, the guard-enhanced policy outperforms the unconstrained PPO baseline simultaneously on cumulative reward (+11.4%), transmission success (+0.8 pp), and node survival (+15.4%), demonstrating that hard safety constraints, when properly aligned with the system’s energy model, provide both performance and safety gains rather than a fundamental trade-off. Sensitivity analysis across event rates (p_event = 0.5 and 0.9) confirms that the guard layer’s advantage persists under both moderate and extreme monitoring conditions. Analysis across scales reveals distinct operational regimes: at 10 nodes, heuristic baselines are near-optimal; at 30 nodes, learned policies dominate, and safety filtering becomes critical for sustained operation.
Nazamdin et al. studied this question.
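
The guard-layer idea described in the abstract (intercept the policy's chosen action, check it against battery and load constraints, and substitute the best-scoring safe alternative) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `GuardConfig`, `is_safe`, `score`, and `guard`, the thresholds, the weights, and the exact form of the constraints and scoring function are all assumptions chosen for illustration. Battery levels are assumed to be normalised to [0, 1], and load is assumed to be a per-node count of past transmissions.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class GuardConfig:
    # All values below are hypothetical, not taken from the paper.
    min_battery: float = 0.2   # reserve that must remain after transmitting
    tx_cost: float = 0.05      # assumed energy cost of one transmission
    max_load_gap: float = 2.0  # allowed excess over the network mean load
    w_battery: float = 0.5     # weight on battery headroom
    w_load: float = 0.5        # weight on load equity


def headroom(node: int, battery: Sequence[float], cfg: GuardConfig) -> float:
    """Battery left above the reserve if `node` transmits now."""
    return battery[node] - cfg.tx_cost - cfg.min_battery


def is_safe(node: int, battery: Sequence[float], load: Sequence[int],
            cfg: GuardConfig) -> bool:
    """Assumed constraints: keep the battery reserve and avoid overloading a node."""
    mean_load = sum(load) / len(load)
    battery_ok = headroom(node, battery, cfg) >= 0.0
    load_ok = load[node] <= mean_load + cfg.max_load_gap
    return battery_ok and load_ok


def score(node: int, battery: Sequence[float], load: Sequence[int],
          cfg: GuardConfig) -> float:
    """Assumed scoring: weighted sum of battery headroom and load equity."""
    equity = 1.0 - load[node] / max(max(load), 1)  # 1.0 means least-used node
    return cfg.w_battery * headroom(node, battery, cfg) + cfg.w_load * equity


def guard(action: int, battery: Sequence[float], load: Sequence[int],
          cfg: GuardConfig = GuardConfig()) -> Optional[int]:
    """Pass a safe action through; otherwise substitute the best safe node.

    Returns None when no node satisfies the constraints (an assumed convention
    meaning "skip this transmission").
    """
    if is_safe(action, battery, load, cfg):
        return action
    candidates = [n for n in range(len(battery)) if is_safe(n, battery, load, cfg)]
    if not candidates:
        return None
    return max(candidates, key=lambda n: score(n, battery, load, cfg))
```

Because a filter of this kind runs only at decision time, it can wrap an already-trained policy, which is consistent with the abstract's claim that the constraints are enforced without retraining. Under the assumed defaults, `guard(0, [0.22, 0.9, 0.6], [1, 5, 1])` rejects node 0 (it would drop below the battery reserve) and node 1 (its load is too far above the mean), and substitutes node 2.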