Video streaming has emerged as a widely used Internet service, in which adaptive bitrate (ABR) algorithms play a critical role in delivering high quality of experience (QoE). However, existing learning-based ABR methods often suffer from limited generalization in unseen and dynamically changing network conditions. Although some meta-reinforcement learning techniques have been proposed to mitigate this issue, they generally depend on additional online training or fine-tuning. To overcome these limitations, this paper introduces EAStream, an environment-aware ABR algorithm based on meta-reinforcement learning for reliable video streaming services. The method employs a variational autoencoder to extract a latent representation of the current network environment from historical interaction data. This latent variable, along with the current system state, is fed into a policy network that perceives network conditions in real time and adapts bitrate decisions accordingly, without requiring further online training. A comprehensive evaluation is conducted using diverse real-world network traces. Experimental results show that EAStream not only achieves leading performance on in-distribution test sets compared to state-of-the-art ABR algorithms, but also demonstrates superior generalization capability on out-of-distribution test scenarios.
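The inference path described above (encode recent interaction history into a latent environment variable, then condition the bitrate policy on that latent together with the current state) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from EAStream itself: the bitrate ladder, the dimensions, the single linear layers standing in for the encoder and policy networks, and the names `EnvironmentEncoder`, `Policy`, and `choose_bitrate`. The weights are random and untrained, so the selected bitrate is meaningless; only the data flow is shown.

```python
import math
import random

# Hypothetical values chosen for illustration only.
BITRATES_KBPS = [300, 750, 1200, 1850, 2850, 4300]
LATENT_DIM = 4    # size of the latent environment variable
HISTORY_DIM = 8   # e.g. the last 8 throughput samples
STATE_DIM = 3     # e.g. buffer level, last bitrate, chunks remaining


def make_matrix(rows, cols, rng):
    """Small random weight matrix (untrained placeholder)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


class EnvironmentEncoder:
    """Stand-in for a VAE encoder: history -> (mean, log-variance)."""

    def __init__(self, rng):
        self.w_mean = make_matrix(LATENT_DIM, HISTORY_DIM, rng)
        self.w_logvar = make_matrix(LATENT_DIM, HISTORY_DIM, rng)

    def encode(self, history):
        return matvec(self.w_mean, history), matvec(self.w_logvar, history)


def sample_latent(mean, logvar, rng):
    """Reparameterization: z = mean + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, logvar)]


class Policy:
    """Stand-in policy: one linear layer over [latent, state], then softmax."""

    def __init__(self, rng):
        self.w = make_matrix(len(BITRATES_KBPS), LATENT_DIM + STATE_DIM, rng)

    def action_probs(self, latent, state):
        return softmax(matvec(self.w, latent + state))


def choose_bitrate(encoder, policy, history, state, rng):
    """Infer the latent from history, then pick the most probable bitrate."""
    mean, logvar = encoder.encode(history)
    latent = sample_latent(mean, logvar, rng)
    probs = policy.action_probs(latent, state)
    best = max(range(len(probs)), key=probs.__getitem__)
    return BITRATES_KBPS[best], probs
```

In this sketch no parameter is updated at decision time: adapting to a new network amounts to re-encoding the latest history, which mirrors the abstract's claim that no further online training is required.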
Huang et al. studied this question.