What does this research mean for the field?

A hierarchical machine learning architecture integrating K-means clustering with Q-learning significantly improves bandwidth optimization, signal quality, and scalability in ultra-dense 6G millimeter-wave networks compared to conventional 5G methods. Novelty: methodological. Consensus alignment: neutral.

What question did this study set out to answer?

This research aims to improve resource management in 6G millimeter-wave networks using advanced machine learning techniques to optimize bandwidth allocation.

May 20, 2026 | Open Access

Adaptive resource management in 6G mmWave networks using K-means clustering and Q-learning for efficient bandwidth optimization

Key Points

This research aims to improve resource management in 6G millimeter-wave networks using advanced machine learning techniques to optimize bandwidth allocation.
Developed a hierarchical machine learning architecture integrating K-means clustering with Q-learning.
Performed extensive simulations using NS-3 with NYUSIM channel models for optimization.
Applied a two-stage framework to adapt dynamically to user mobility and channel variations.
Achieved 150 Mbps mean bandwidth optimization, a 99% improvement over conventional 5G methods.
Increased effective signal quality with SINR improving from 52 dB to 60 dB.
Enhanced network capacity and spectral efficiency while managing up to 100 devices/km².

Abstract

This paper proposes a novel hierarchical machine learning architecture for scalable resource management in 6G millimeter-wave networks, focusing on overcoming the curse of dimensionality in reinforcement learning (RL) for ultra-dense deployments. The work integrates K-means clustering with Q-learning and develops a two-stage optimization framework that adapts dynamically to user mobility and channel variations. Unlike existing monolithic or multi-agent RL approaches, the framework first applies K-means to group users with similar bandwidth demands according to their channel conditions, reducing the state-action space from U users to K clusters (K ≪ U). Dedicated Q-learning agents, one per cluster, then perform joint channel selection, bandwidth allocation, power control, and beamwidth adaptation under a unified multi-objective reward function. Extensive NS-3 simulations with NYUSIM channel models demonstrate exceptional performance gains: the framework achieves 150 Mbps mean bandwidth optimization, a 99% improvement over conventional 5G methods, while maintaining robust scalability. By intelligently managing interference, it improves effective signal quality and makes bandwidth utilization more efficient: the Signal-to-Interference-plus-Noise Ratio (SINR) improves from 52 dB to 60 dB while device density increases from 10 to 100 devices/km². The framework provides a practical solution to the scalability-efficiency tradeoff in RL-based wireless resource management and offers considerable enhancements in network capacity, spectral efficiency, and adaptive channel allocation for next-generation 6G systems.
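The two-stage idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the user features, the discretized demand states, the allocation levels, the reward weights, and every name in it (`kmeans`, `ClusterAgent`, `reward`, `train`) are assumptions made for illustration. It covers bandwidth allocation only, and a toy two-term reward stands in for the paper's multi-objective reward. Stage 1 groups users with a plain K-means; stage 2 trains one tabular Q-learning agent per cluster, so learning happens over K agents rather than U users.

```python
import random

K = 3                                   # number of clusters (hypothetical choice)
ALLOC_LEVELS = [0.25, 0.5, 0.75, 1.0]   # actions: fraction of the cluster's channel (assumed)
DEMAND_LEVELS = [0.25, 0.5, 1.0]        # states: discretized aggregate demand (assumed)
W_UNMET, W_WASTE = 1.0, 0.5             # illustrative reward weights

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm on feature tuples; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((p - q) ** 2 for p, q in zip(pt, centroids[c])))
            for pt in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids

def reward(state, action):
    """Toy reward: penalize unmet demand and wasted (over-allocated) bandwidth."""
    demand, alloc = DEMAND_LEVELS[state], ALLOC_LEVELS[action]
    return -(W_UNMET * max(0.0, demand - alloc) + W_WASTE * max(0.0, alloc - demand))

class ClusterAgent:
    """Tabular Q-learning agent with epsilon-greedy exploration; one per cluster."""
    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def act(self, state):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q[state]))   # explore
        row = self.q[state]
        return row.index(max(row))                          # exploit

    def update(self, s, a, r, s_next):
        # Standard Q-learning temporal-difference update.
        target = r + self.gamma * max(self.q[s_next])
        self.q[s][a] += self.alpha * (target - self.q[s][a])

def train(agent, steps=2000, seed=0):
    rng = random.Random(seed)
    state = rng.randrange(len(DEMAND_LEVELS))
    for _ in range(steps):
        action = agent.act(state)
        next_state = rng.randrange(len(DEMAND_LEVELS))  # toy dynamics: demand drifts at random
        agent.update(state, action, reward(state, action), next_state)
        state = next_state

def greedy_policy(agent):
    return [row.index(max(row)) for row in agent.q]

# Stage 1: cluster synthetic users by (normalized demand, normalized channel quality).
rng = random.Random(42)
users = [(rng.random(), rng.random()) for _ in range(60)]
labels, centroids = kmeans(users, K)

# Stage 2: one Q-learning agent per cluster instead of one per user.
agents = [ClusterAgent(len(DEMAND_LEVELS), len(ALLOC_LEVELS), seed=c) for c in range(K)]
for c, agent in enumerate(agents):
    train(agent, seed=c)
policies = [greedy_policy(agent) for agent in agents]

for c, policy in enumerate(policies):
    print(f"cluster {c}: {labels.count(c)} users, allocation per demand state:",
          [ALLOC_LEVELS[a] for a in policy])
```

In this toy setting each agent keeps a Q-table of only 3 states by 4 actions, regardless of how many users fall into its cluster. A real deployment would replace the random demand dynamics with NS-3/NYUSIM-driven channel state and extend the action set to channel, power, and beamwidth choices.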
