TinyML models deployed on emerging sixth-generation (6G) edge networks must cope with volatile wireless conditions, limited computing power, and tight energy budgets. This paper introduces DRL-TinyEdge, a latency- and energy-aware deep reinforcement learning (DRL) framework for adaptive TinyML at the 6G edge. The proposed on-device DRL controller autonomously selects the execution venue (local, partial, or cloud) and the model configuration (depth, quantization, and frequency) in real time to balance accuracy, latency, and power savings. To keep adaptation safe under changing conditions, the multi-objective reward combines p95 latency, per-inference energy, accuracy preservation, and policy stability. The system is evaluated on two representative workloads, image classification (CIFAR-10) and sensor analytics in an industrial IoT system, running on low-power platforms (ESP32, Jetson Nano) connected to a simulated 6G mmWave testbed. The results show consistent improvements: up to a 28 per cent reduction in p95 latency and a 43 per cent reduction in energy per inference, with accuracy within 1 per cent of the baseline models. Compared with Static-Offload, Heuristic-QoS, and TinyNAS/QAT, DRL-TinyEdge offers better adaptability, stability, and scalability while operating at a CPU < 5 and a decision latency < 10 ms. Code, hyperparameter settings, and measurement programs will be released upon acceptance to enable reproducibility and open benchmarking.
Alaklabi et al. (Sun,) studied this question.
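The abstract names four reward components (p95 latency, per-inference energy, accuracy preservation, and policy stability) but does not state how they are combined. The following is a minimal sketch, assuming a negated weighted sum of normalised penalties; the weights, the budget parameters, the nearest-rank percentile, and the names `RewardWeights`, `p95`, and `reward` are illustrative assumptions, not the authors' formulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical weights; the abstract gives no values.
    latency: float = 0.4
    energy: float = 0.3
    accuracy: float = 0.2
    stability: float = 0.1


def p95(samples):
    """Nearest-rank 95th percentile of a non-empty sequence."""
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def reward(latencies_ms, energy_mj, accuracy, baseline_accuracy,
           prev_action, action, latency_budget_ms, energy_budget_mj,
           w=RewardWeights()):
    """Negated weighted sum of four penalties (assumed form)."""
    latency_term = p95(latencies_ms) / latency_budget_ms
    energy_term = energy_mj / energy_budget_mj
    # Penalise only accuracy lost relative to the baseline model.
    accuracy_term = max(0.0, baseline_accuracy - accuracy)
    # Policy stability: unit penalty whenever the chosen action changes.
    switch_term = 0.0 if action == prev_action else 1.0
    return -(w.latency * latency_term
             + w.energy * energy_term
             + w.accuracy * accuracy_term
             + w.stability * switch_term)


# An action pairs an execution venue with a model configuration:
# (venue, depth, quantization bits, frequency in MHz); values are made up.
stay = ("local", 8, 8, 160)
move = ("cloud", 8, 8, 160)
r_stay = reward([10.0] * 20, 5.0, 0.90, 0.90, stay, stay, 20.0, 10.0)
r_move = reward([10.0] * 20, 5.0, 0.90, 0.90, stay, move, 20.0, 10.0)
```

With identical measurements, `r_move` is lower than `r_stay` because of the switching penalty. Under this assumed form, that penalty is what discourages the controller from oscillating between venues.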