In heterogeneous edge computing environments, machine learning is widely applied to pattern recognition tasks such as image classification. Traditional centralized frameworks, however, face two challenges: high data transmission costs and data privacy vulnerabilities. The limited bandwidth of edge devices further aggravates network congestion and latency when transmitting video and image data. Federated learning keeps raw data on the devices, but in practical deployment it must contend with the “straggler effect” caused by data, computational, and communication heterogeneity, which lowers the efficiency of global model training. To address this, this paper proposes Chronos, an adaptive scheduling mechanism based on Long Short-Term Memory (LSTM). By predicting each device's resource capabilities in real time, Chronos dynamically adjusts the training batch size and task frequency of each edge device. The mechanism jointly schedules computational and communication resources to balance the training load across heterogeneous devices, preventing high-performance devices from being bottlenecked by low-performance ones while avoiding model staleness. Experimental results show that Chronos improves accuracy by 0.51% on the MNIST dataset and by 3.76% on the more complex CIFAR-10 dataset (reaching a maximum top-1 accuracy of 61.56%), and achieves a 3.12× to 6.4× training speedup over baseline frameworks (BSP, SSP, FedBuff), while reducing the average synchronization waiting time (ASWT) by 31.94% to 64.64% in heterogeneous environments.
Yanhe Shen (Thu,) studied this question.
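
The idea of adjusting each device's batch size according to its predicted capability can be illustrated with a minimal sketch. The abstract does not give the actual scheduling rule, so the proportional split below is an assumption made for illustration, not the Chronos algorithm itself. The LSTM predictor is not shown: its output is represented by a plain dictionary of predicted throughputs (samples per second), and the names `allocate_batch_sizes` and `estimated_step_times` are hypothetical.

```python
def allocate_batch_sizes(predicted_throughput, global_batch, min_batch=1):
    """Split a global batch across devices in proportion to predicted throughput.

    predicted_throughput: dict mapping device name -> predicted samples/second
        (assumed to come from a capability predictor such as an LSTM; here it
        is simply supplied by the caller).
    global_batch: target number of samples processed per round, all devices.
    min_batch: lower bound so that slow devices still contribute.

    Note: because of rounding and the min_batch floor, the returned sizes may
    not sum exactly to global_batch.
    """
    if not predicted_throughput:
        raise ValueError("at least one device is required")
    if any(t <= 0 for t in predicted_throughput.values()):
        raise ValueError("predicted throughput must be positive")

    total = sum(predicted_throughput.values())
    return {
        device: max(min_batch, round(global_batch * t / total))
        for device, t in predicted_throughput.items()
    }


def estimated_step_times(batch_sizes, predicted_throughput):
    """Estimated seconds per local step: batch size divided by throughput."""
    return {
        device: batch_sizes[device] / predicted_throughput[device]
        for device in batch_sizes
    }


if __name__ == "__main__":
    # Hypothetical predictions for three heterogeneous devices.
    throughput = {"fast": 400.0, "mid": 200.0, "slow": 100.0}
    sizes = allocate_batch_sizes(throughput, global_batch=700)
    print(sizes)
    print(estimated_step_times(sizes, throughput))
```

With a proportional split, faster devices receive larger batches, so the estimated per-step times come out roughly equal and no device waits long at the synchronization point. A full scheduler would also have to adjust task frequency and account for communication cost, which this sketch omits.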