VR cloud gaming promises immersive experiences, yet realizing it is critically challenged by the trade-off between stringent latency requirements and high visual quality under unpredictable network conditions. Existing heuristic adaptive-bitrate and foveation approaches adapt poorly to highly dynamic mobile networks, resulting in a suboptimal trade-off between bandwidth usage and visual quality. While data-driven approaches (e.g., reinforcement learning, RL) have been successful in video streaming, their application to VR cloud gaming poses particular challenges. The stringent demands for high resolution, high frame rate, and ultra-low latency are compounded by the need for fine-grained, per-frame inference to adapt to rapid changes in user gaze and network conditions. This work introduces FovRL, an RL framework that jointly optimizes foveation parameters and bitrate allocation in response to real-time network throughput. It pioneers the application of RL to real-time foveated encoding in immersive VR cloud gaming. Evaluations over real-world networks show that FovRL improves bitrate adaptability and delivers superior perceptual visual quality while maintaining latency comparable to the state of the art.
Tsui et al. (Sun,) studied this question.