What question did this study set out to answer?

This research aims to enhance energy efficiency in cell-free massive MIMO networks by optimizing access point activation using a reinforcement learning approach.

March 18, 2026 | Open Access

Energy-Efficient Access Point Switch On/Off in Cell-Free Massive MIMO Using Proximal Policy Optimization

Key points

Used a Proximal Policy Optimization (PPO) framework to manage access point (AP) switch on/off.
Evaluated performance across three deployment scenarios of increasing network size at a preserved AP density.
Applied a simulation framework based on per-AP large-scale fading statistics and power parameters, without relying on instantaneous small-scale channel state information.
Achieved energy-efficiency improvements of up to 66% over random activation.
Outperformed the all-on baseline by nearly 50% in energy efficiency.
Maintained robust performance as the network grew in size, reducing the likelihood of highly energy-inefficient operating regimes.
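
For readers unfamiliar with PPO, the standard clipped surrogate objective from the original PPO formulation is reproduced below as background. This summary does not state which PPO variant or hyperparameters the study uses, so the reading of the symbols in this setting is an assumption: the action a_t would be the AP on/off decision, the state s_t the per-AP large-scale fading statistics and power parameters, and the advantage estimate would derive from an energy-efficiency reward.

```latex
% Standard PPO clipped surrogate objective (general form, not specific to this study).
% \epsilon is the clipping range; \hat{A}_t is the advantage estimate at step t.
L^{\mathrm{CLIP}}(\theta)
  = \mathbb{E}_t\!\left[
      \min\!\Big( r_t(\theta)\,\hat{A}_t,\;
                  \operatorname{clip}\big(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\big)\,\hat{A}_t \Big)
    \right],
\qquad
r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)}
```

The clipping keeps each policy update close to the previous policy, which is the usual reason PPO is chosen for stable online learning.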

Abstract

The increasing densification of cell-free massive multiple-input multiple-output (MIMO) networks makes access point (AP) switch on/off (ASO) a key mechanism for improving energy efficiency in future wireless systems. While reinforcement learning (RL) has been explored for ASO, differences in modeling assumptions and evaluation scope leave open questions regarding robustness and scalability. In this work, ASO is investigated from an explicit energy-efficiency perspective using an RL framework based on Proximal Policy Optimization (PPO). The policy learns state-dependent AP activation under partial observability using compact per-AP large-scale fading statistics and power parameters, without requiring instantaneous small-scale channel state information or combinatorial search, enabling practical online implementation. A comprehensive evaluation is conducted under a unified and reproducible simulation framework across three cell-free deployment scenarios of increasing size that preserve AP density while incorporating realistic channel and power consumption models. Performance is assessed through both average and distribution-based metrics. Numerical results show that the PPO-based policy consistently outperforms random activation and the all-on baseline, achieving energy-efficiency improvements of up to 66% and nearly 50%, respectively, while activating a comparable number of APs. Moreover, the learned policy maintains robust performance as the network scales, reducing the likelihood of highly energy-inefficient operating regimes.
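
To make the energy-efficiency comparison concrete, the following is a minimal sketch of how an AP activation mask could be scored against the all-on and random baselines. It is a toy model, not the paper's simulation framework: the function names, the interference-free rate expression, the linear per-AP power model, and every numeric default (`p_tx`, `p_fixed`, `noise`, `bandwidth_hz`) are assumptions chosen for illustration only.

```python
import numpy as np


def energy_efficiency(active, beta, p_tx=0.2, p_fixed=0.8,
                      noise=1e-3, bandwidth_hz=20e6):
    """Toy energy efficiency (bit/Joule) of a binary AP activation mask.

    active: (M,) booleans, True where an AP is switched on.
    beta:   (M, K) large-scale fading gains from M APs to K users.
    All defaults are illustrative assumptions, not values from the paper.
    """
    active = np.asarray(active, dtype=bool)
    beta = np.asarray(beta, dtype=float)
    if not active.any():
        return 0.0  # no AP is on, so no data is delivered
    # Received signal per user: contributions of the active APs only.
    signal = p_tx * beta[active].sum(axis=0)
    # Simplification: interference is ignored, only noise limits the rate.
    rate = bandwidth_hz * np.log2(1.0 + signal / noise)
    # Assumed power model: each active AP draws transmit plus fixed power.
    power = active.sum() * (p_tx + p_fixed)
    return float(rate.sum() / power)


def relative_gain_percent(ee_policy, ee_baseline):
    """Relative energy-efficiency improvement over a baseline, in percent."""
    return 100.0 * (ee_policy - ee_baseline) / ee_baseline


# Hypothetical small deployment with synthetic large-scale fading gains.
rng = np.random.default_rng(0)
num_aps, num_users = 16, 4
beta = rng.exponential(scale=1e-3, size=(num_aps, num_users))

all_on = np.ones(num_aps, dtype=bool)      # all-on baseline
random_mask = rng.random(num_aps) < 0.5    # random activation baseline

print("all-on EE:", energy_efficiency(all_on, beta))
print("random EE:", energy_efficiency(random_mask, beta))
```

A learned policy would replace `random_mask` with a mask chosen from the per-AP statistics, and `relative_gain_percent` expresses its advantage in the same form as the percentages quoted in the abstract.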
