Abstract: This paper proposes a novel hierarchical machine learning architecture for scalable resource management in 6G millimeter-wave networks, targeting the curse of dimensionality that reinforcement learning (RL) faces in ultra-dense deployments. The work integrates K-means clustering with Q-learning in a two-stage optimization framework that adapts dynamically to user mobility and channel variations. Unlike existing monolithic or multi-agent RL approaches, the framework first applies K-means to group users with similar bandwidth demands according to their channel conditions, reducing the state-action space from U users to K clusters (K << U). Dedicated Q-learning agents, one per cluster, then perform joint channel selection, bandwidth allocation, power control, and beamwidth adaptation under a unified multi-objective reward function. Extensive NS-3 simulations with NYUSIM channel models demonstrate substantial performance gains: a mean bandwidth of 150 Mbps, a 99% improvement over conventional 5G methods, while maintaining robust scalability. By intelligently managing interference, the framework raises the effective signal quality and uses bandwidth more efficiently: the Signal-to-Interference-plus-Noise Ratio (SINR) improves from 52 dB to 60 dB as device density increases from 10 to 100 devices/km². The framework offers a practical solution to the scalability-efficiency tradeoff in RL-based wireless resource management and yields considerable improvements in network capacity, spectral efficiency, and adaptive channel allocation for next-generation 6G systems.
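The two-stage idea described in the abstract (cluster users first, then run one tabular Q-learning agent per cluster over a joint action) can be illustrated with a minimal sketch. This is not the authors' implementation: the user features, the action grid `ACTION_SHAPE`, the three-level load state, the `toy_reward` function, and all hyperparameters are hypothetical placeholders chosen only to make the example run, and the K-means routine is plain Lloyd's algorithm written with NumPy.

```python
import numpy as np

def assign(features, centroids):
    # Distance from every user to every centroid, shape (U, K).
    dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(features, k, iters=20, seed=0):
    """Plain Lloyd's K-means; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    centroids = features[rng.choice(len(features), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = assign(features, centroids)
        for j in range(k):
            members = features[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return assign(features, centroids), centroids

# Hypothetical joint action grid: channels x bandwidth levels x power levels x beamwidths.
ACTION_SHAPE = (4, 3, 3, 2)
N_ACTIONS = int(np.prod(ACTION_SHAPE))
N_STATES = 3  # assumed discretised cluster load: low / medium / high

class ClusterAgent:
    """Tabular epsilon-greedy Q-learning agent serving one cluster."""
    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.q = np.zeros((n_states, n_actions))
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = np.random.default_rng(seed)

    def act(self, state):
        if self.rng.random() < self.epsilon:
            return int(self.rng.integers(self.q.shape[1]))  # explore
        return int(self.q[state].argmax())                  # exploit

    def update(self, s, a, r, s_next):
        target = r + self.gamma * self.q[s_next].max()
        self.q[s, a] += self.alpha * (target - self.q[s, a])

def toy_reward(action):
    # Placeholder reward: favours more bandwidth, penalises transmit power.
    _channel, bw, power, _beam = np.unravel_index(action, ACTION_SHAPE)
    return float(bw) - 0.5 * float(power)

# Stage 1: cluster U synthetic users by (bandwidth demand in Mbps, channel gain in dB).
rng = np.random.default_rng(1)
U, K = 200, 4
users = np.column_stack([rng.uniform(10, 200, U), rng.uniform(-100, -60, U)])
scaled = (users - users.mean(axis=0)) / users.std(axis=0)  # equalise feature scales
labels, centroids = kmeans(scaled, K)

# Stage 2: one Q-learning agent per cluster, trained on the toy reward.
agents = [ClusterAgent(N_STATES, N_ACTIONS, seed=j) for j in range(K)]
for agent in agents:
    state = 0
    for _ in range(500):
        action = agent.act(state)
        next_state = int(rng.integers(N_STATES))  # placeholder load dynamics
        agent.update(state, action, toy_reward(action), next_state)
        state = next_state
```

In this sketch the number of Q-tables grows with K rather than with U, which is the scaling property the abstract attributes to the clustering stage; a real reward would combine throughput, interference, and power terms as the paper's multi-objective function does.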
Kulkarni et al. (Mon,) studied this question.