Efficient video compression is essential for storing and transmitting digital video content, particularly in bandwidth-limited environments such as mobile communication networks and internet streaming services. Traditional video compression methods face two primary challenges: minimizing latency and maintaining video quality while achieving high compression ratios. This paper presents a novel framework that combines a temporal-context-adaptive entropy model, recurrent auto-encoders (RAE), and recurrent probability models (RPM) to address these issues. The main goal of the framework is to capture temporal correlations between consecutive video frames and to encode the latent representations efficiently, thereby achieving high compression ratios. The RAE uses Convolutional Long Short-Term Memory (ConvLSTM) cells in its encoder and decoder networks, which exploit temporal dependencies to improve compression. In addition, the proposed model uses Feature Transformation (FT) layers to integrate multi-scale motion information, which further improves compression efficiency. The RPM recurrently estimates the parameters of conditional probability mass functions, enabling adaptive arithmetic coding that is informed by the temporal relationships between video frames. This approach reduces the size of the latent representations while preserving essential video information. Implemented in Python, the proposed model achieves a Peak Signal-to-Noise Ratio (PSNR) of 48.42 dB, outperforming existing compression techniques in both efficiency and quality.
Khadir et al. (Thu,) studied this question.
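
To illustrate why a temporally conditioned probability mass function can lower the cost of arithmetic coding, the following minimal sketch compares the ideal code length of a toy latent sequence under a static uniform model and under a simple context-adaptive model. It is not the RPM described above: the hand-written first-order rule in `conditional_pmf`, the three-symbol alphabet, the 0.8 repeat probability, and the `latents` sequence are all assumptions made purely for illustration, standing in for the recurrent network's learned estimates.

```python
import math

ALPHABET = (0, 1, 2)  # hypothetical quantized latent values


def ideal_code_length_bits(symbols, pmfs):
    """Sum of -log2 p(symbol): the bit cost an ideal arithmetic coder approaches."""
    return sum(-math.log2(pmf[s]) for s, pmf in zip(symbols, pmfs))


def uniform_pmf():
    """Static model: every symbol is equally likely."""
    return {s: 1.0 / len(ALPHABET) for s in ALPHABET}


def conditional_pmf(prev):
    """Toy stand-in for a learned conditional PMF (assumption, not the paper's RPM).

    Given the previous symbol, it assigns probability 0.8 to a repeat and
    splits the remaining 0.2 evenly over the other symbols.
    """
    if prev is None:
        return uniform_pmf()  # no temporal context for the first symbol
    return {s: (0.8 if s == prev else 0.1) for s in ALPHABET}


# Hypothetical latent sequence with strong temporal correlation.
latents = [0, 0, 1, 1, 1, 2, 2, 2, 2, 0]

static_pmfs = [uniform_pmf() for _ in latents]
adaptive_pmfs = [conditional_pmf(prev) for prev in [None] + latents[:-1]]

static_bits = ideal_code_length_bits(latents, static_pmfs)
adaptive_bits = ideal_code_length_bits(latents, adaptive_pmfs)

print(f"static model:   {static_bits:.2f} bits")
print(f"adaptive model: {adaptive_bits:.2f} bits")
```

On this toy sequence the context-adaptive model needs fewer bits because most symbols repeat their predecessor. The saving depends entirely on how well the conditional model matches the data, so a poorly matched model could cost more bits than the static one.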