Existing power systems lack efficient feature screening and global real-time collaboration in their multi-scale data processing and dispatching architectures, which hinders safe and stable operation in dynamic environments. This paper addresses adaptive extraction of multi-scale power data and cloud-edge collaborative scheduling based on deep reinforcement learning. We explicitly divide the time scale into milliseconds to seconds, seconds to minutes, and minutes to hours, and the spatial scale into measurement-point, feeder, and zone levels, so that adaptive extraction can balance preserving discriminative information against reducing transmission cost under limited communication bandwidth and edge computing power. Spatiotemporal coupling features are extracted from the raw voltage, current, load, and device-state sequences by a multi-layer convolutional feature-encoding network, and a multi-head attention-based feature screening module dynamically weights the encoded vectors to focus on key state variables. A lightweight policy network, optimized by parameter pruning and sparsification, is deployed at the edge for low-latency local state assessment and action execution, while the cloud is responsible for network-wide topology modeling and global policy optimization. The grid topology is modeled with a graph neural network to preserve topological invariance, and node coupling relationships are represented through neighborhood message passing. A policy gradient algorithm updates policies in continuous high-dimensional action spaces, and the variance of the updates is reduced through value estimation and advantage normalization. A hierarchical parameter synchronization mechanism between the cloud and the edge exchanges compressed feature summaries, parameters, and gating thresholds at periodic or event-driven synchronization points, preserving policy convergence and state consistency.
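The variance-reduction step mentioned above (value estimation followed by advantage normalization) can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `normalized_advantages`, the `eps` constant, and the sample numbers are assumptions chosen for illustration, and the value estimates would in practice come from a learned critic.

```python
import math

def normalized_advantages(returns, values, eps=1e-8):
    """Subtract a value baseline from the returns, then standardize.

    returns: observed (discounted) returns per sampled step
    values:  value estimates for the same steps (assumed to come from a critic)
    eps:     small constant (assumption) that avoids division by zero
    """
    # Advantage = return minus the value baseline.
    adv = [r - v for r, v in zip(returns, values)]
    mean = sum(adv) / len(adv)
    var = sum((a - mean) ** 2 for a in adv) / len(adv)
    std = math.sqrt(var)
    # Standardize to zero mean and (approximately) unit variance.
    return [(a - mean) / (std + eps) for a in adv]

# Hypothetical batch of four steps.
example = normalized_advantages([1.0, 2.0, 3.0, 4.0], [0.5, 1.5, 2.0, 2.0])
```

Standardizing the advantages keeps the scale of the policy gradient roughly constant across batches, which is one common way to stabilize updates in continuous action spaces.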
The decision flow forms a cooperative closed loop of short-term actions at the edge and global instructions from the cloud. In the constructed deep reinforcement learning framework, the state comprises the filtered feature summary, local latency measurements, and the cloud parameter vector, and the actions comprise both discrete feature-selection gates and continuous scheduling instructions. The reward is a weighted sum of scheduling deviation, feature reconstruction error, latency, and resource consumption, with weights calibrated on the validation set so that feature extraction and scheduling decisions are optimized jointly. During feature compression, the dimensionality of the voltage and current signals is reduced from 21 600 to 4800, a compression rate of 77.78%, which significantly reduces data transmission pressure. During edge inference, the average latency falls from 35 to 12 ms and memory usage from 48 to 15 MB, demonstrating high efficiency under limited computing power. When the load disturbance increases from 0.5% to 10%, the robustness index remains above 0.82, demonstrating adaptability to complex operating conditions. The method applies to monitoring and scheduling tasks with the same sampling rate and edge device resources as the experimental platform described in this paper. It does not guarantee meeting strict real-time deadlines when the link latency exceeds 200 ms or the available memory on the edge device is less than 250 MB.
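The reward structure and the reported compression figure can be made concrete with a short sketch. The four reward terms are taken from the text; everything else is an assumption for illustration: the names `RewardWeights`, `reward`, and `compression_rate`, the weight values, and the choice to negate the weighted sum so that lower cost yields higher reward.

```python
from dataclasses import dataclass

@dataclass
class RewardWeights:
    # Hypothetical weights; the paper calibrates its own on a validation set.
    deviation: float = 1.0
    reconstruction: float = 0.5
    latency: float = 0.2
    resource: float = 0.1

def reward(deviation, reconstruction, latency, resource, w):
    """Negative weighted sum of the four cost terms named in the text.

    The negation is an assumption: all four terms are costs, so a smaller
    weighted sum should correspond to a larger reward.
    """
    cost = (w.deviation * deviation
            + w.reconstruction * reconstruction
            + w.latency * latency
            + w.resource * resource)
    return -cost

def compression_rate(original_dim, compressed_dim):
    """Fraction of dimensions removed by feature compression."""
    return 1.0 - compressed_dim / original_dim

# Reported dimensions: 21 600 reduced to 4800.
rate_percent = round(100 * compression_rate(21600, 4800), 2)  # 77.78
```

The compression check reproduces the reported 77.78%, since (21 600 - 4800) / 21 600 = 0.7778.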
Liang et al. (Thu,) studied this question.