Deep learning-based video frame interpolation (DVFI) can generate high frame-rate video sequences with high temporal consistency, and its results are usually visually plausible. Because DVFI can also be used for malicious video manipulation, it can mislead users and defeat near-duplicate video detection. It is therefore important to locate the frames that DVFI techniques have interpolated. This letter addresses the problem with exceptional regions-aware localization (ERaL). In particular, we guide ERaL with an “inverted Z-shaped” network, which better captures the position and intensity of exceptional regions regardless of the specific DVFI method. This is possible because ERaL learns only from original videos, so fake frame-rate videos collapse under it regardless of which DVFI method generated them. A hierarchical feature extraction module is then developed, integrating feature enhancement, a simplified transformer, and an inverted residual feed-forward network, to produce a frame-wise localization of the interpolated frames in a given sequence. The proposed method is evaluated on forged videos manipulated by three state-of-the-art DVFI approaches. Extensive experimental results demonstrate that it effectively localizes the interpolated frames and surpasses existing algorithms.
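To make the notion of frame-wise localization concrete, the following minimal sketch shows one way per-frame outputs could be turned into labels and scored. It is not the ERaL implementation: the function names, the 0.5 threshold, the example scores, and the choice of precision, recall, and F1 as metrics are all assumptions made for illustration.

```python
from typing import List, Tuple


def localize_frames(scores: List[float], threshold: float = 0.5) -> List[int]:
    """Label each frame 1 (interpolated) or 0 (original).

    `scores` are hypothetical per-frame outputs of some detector;
    the threshold value is an assumption, not taken from the paper.
    """
    return [1 if s >= threshold else 0 for s in scores]


def precision_recall_f1(pred: List[int], truth: List[int]) -> Tuple[float, float, float]:
    """Compute frame-level precision, recall, and F1 for binary labels."""
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical clip whose frame rate was doubled by interpolation:
# every second frame (odd index) is synthetic.
truth = [0, 1, 0, 1, 0, 1, 0]
# Made-up detector scores, one per frame.
scores = [0.1, 0.9, 0.2, 0.4, 0.3, 0.8, 0.6]

pred = localize_frames(scores)
precision, recall, f1 = precision_recall_f1(pred, truth)
print(pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In this toy run one interpolated frame is missed and one original frame is falsely flagged, which illustrates why frame-level metrics are more informative than a single clip-level verdict.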
Ding et al. (Mon,) studied this question.