What question did this study set out to answer?

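The study asks whether an implicit neural representation that handles low- and high-frequency components separately can mitigate spectral bias in both the spatial and temporal domains, and so represent video more compactly and accurately.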

February 12, 2026

Spatio-Temporal Spectra-Preserving Neural Representation for Video Modeling

Key Points

Introduced SNeRV+ for spatiotemporal video representation.
Utilized neural tangent kernel analysis for enhanced learning.
Applied a two-level processing approach with separate encoder branches for low- and high-frequency components.
Decomposed frames using a three-dimensional discrete wavelet transform.
SNeRV+ outperforms existing implicit neural representation methods.
Achieved better performance in video regression, interpolation, extrapolation, and compression.
Demonstrated superiority across both quantitative and qualitative metrics.

Abstract

Green learning (GL) promotes sustainability in deep learning by emphasizing energy-efficient solutions and lightweight models. Implicit neural representations (INRs) for videos provide a compact and efficient approach to video representation within this paradigm. This study introduces SNeRV+, a spatiotemporal, spectra-preserving neural representation for a video that employs neural tangent kernel (NTK) analysis to enhance learning. To mitigate spectral bias in both spatial and temporal domains, SNeRV+ employs a two-level processing approach, where separate encoder branches handle low-frequency (LF) and high-frequency (HF) components. A three-dimensional discrete wavelet transform decomposes each frame into its temporal variations, encoding LF and HF components into frame-wise embeddings. LF components, which capture static scenes and steady motion, are decoded with fixed parameters across frames, reducing temporal discrepancies and mitigating spectral bias. HF components, which encode time-varying details, are dynamically reconstructed using temporally adaptive weights that leverage LF-related parameters as prior information. This design enables a more efficient and compact representation of temporal variations. Experimental results demonstrate that SNeRV+ outperforms state-of-the-art INR-based methods in video regression, interpolation, extrapolation, and compression, achieving superior performance across both quantitative and qualitative evaluation metrics.
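The low/high-frequency split described in the abstract can be illustrated with a one-level three-dimensional wavelet transform. The sketch below is not the authors' implementation: the abstract does not name the wavelet, the number of decomposition levels, or the clip layout, so the Haar filter, the single level, the `(T, H, W)` array shape, and all function names (`haar_split`, `haar_merge`, `dwt3d_haar`, `idwt3d_haar`) are assumptions made for illustration. It only shows how a clip separates into one low-frequency subband and seven high-frequency subbands, and that a static clip leaves every temporally high-frequency subband empty.

```python
import numpy as np


def haar_split(x, axis):
    """One-level orthonormal Haar analysis along `axis` (even length assumed)."""
    x = np.moveaxis(x, axis, 0)
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / np.sqrt(2.0)   # local averages: low-frequency part
    high = (even - odd) / np.sqrt(2.0)  # local differences: high-frequency part
    return np.moveaxis(low, 0, axis), np.moveaxis(high, 0, axis)


def haar_merge(low, high, axis):
    """Inverse of haar_split along `axis`."""
    low = np.moveaxis(low, axis, 0)
    high = np.moveaxis(high, axis, 0)
    out = np.empty((2 * low.shape[0],) + low.shape[1:], dtype=low.dtype)
    out[0::2] = (low + high) / np.sqrt(2.0)
    out[1::2] = (low - high) / np.sqrt(2.0)
    return np.moveaxis(out, 0, axis)


def dwt3d_haar(video):
    """Split a (T, H, W) clip into 8 subbands.

    Keys are three letters in (time, height, width) order, e.g. 'LLL' is low
    frequency on every axis and 'HLL' is high frequency in time only.
    """
    bands = {"": np.asarray(video, dtype=np.float64)}
    for axis in range(3):
        split = {}
        for key, arr in bands.items():
            low, high = haar_split(arr, axis)
            split[key + "L"] = low
            split[key + "H"] = high
        bands = split
    return bands


def idwt3d_haar(bands):
    """Rebuild the (T, H, W) clip from the 8 subbands of dwt3d_haar."""
    for axis in (2, 1, 0):
        merged = {}
        for key in {k[:-1] for k in bands}:
            merged[key] = haar_merge(bands[key + "L"], bands[key + "H"], axis)
        bands = merged
    return bands[""]


# Hypothetical toy data: a random clip and a clip that repeats one frame.
rng = np.random.default_rng(0)
video = rng.random((8, 16, 16))
bands = dwt3d_haar(video)
recon = idwt3d_haar(bands)

static_video = np.tile(rng.random((16, 16)), (8, 1, 1))
static_bands = dwt3d_haar(static_video)
temporal_hf_energy = sum(
    float(np.sum(b ** 2)) for k, b in static_bands.items() if k[0] == "H"
)
```

In this toy setting, the `LLL` subband plays the role of the LF component (static content and slow change), while the remaining seven subbands carry the HF detail. How SNeRV+ actually groups, embeds, and decodes these components is described only at a high level in the abstract and is not reproduced here.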

Cite This Study

Kim et al. studied this question.

synapsesocial.com/papers/698d6dae5be6419ac0d52cdc https://doi.org/10.1145/3796711
