Reconfigurable Intelligent Surfaces (RIS) and Non-Orthogonal Multiple Access (NOMA) are considered key enabling technologies for sixth-generation (6G) wireless networks because they can improve spectral efficiency, coverage, and energy efficiency. However, jointly optimizing user clustering and power allocation in RIS-assisted NOMA systems leads to highly complex, non-convex problems, especially in dynamic and large-scale network environments. Recently, Deep Reinforcement Learning (DRL) has emerged as a powerful tool for solving such resource allocation problems without relying on accurate mathematical models. This paper presents an in-depth review of existing research on RIS, NOMA, and DRL-based optimization techniques. The literature is analyzed and classified by system model, optimization objective, and learning method. A comparative analysis illustrates the benefits and limitations of current methods. Finally, open challenges and future research directions are identified, with a focus on the joint optimization of power allocation and user clustering for 6G networks.
Kaur et al. (Sun,) studied this question.