This paper presents a pipeline for detecting fast-moving objects in aerial surveillance videos captured by unmanned aerial vehicles (UAVs). The proposed system addresses the challenges posed by large frame intervals and motion blur by integrating two preprocessing stages: Real-Time Intermediate Flow Estimation (RIFE)-based temporal frame interpolation, which increases the video frame rate from 30 frames per second (FPS) to 120 FPS, and Multi-Input Multi-Output U-Net Plus (MIMO U-NetPlus)-based deblurring, which enhances spatial clarity. Both stages run ahead of the YOLOv8n object detection model and improve its detection accuracy without altering its core architecture. Experimental evaluations on the VisDrone2019-VID dataset show that the proposed method significantly outperforms the baseline YOLOv8n, achieving an mAP@0.5 of 0.968 and an mAP@0.5:0.95 of 0.881. The results confirm that combining temporal interpolation with deblurring restores object continuity and sharpness, leading to substantial improvements in detection performance for drone-based monitoring applications. Although the proposed pipeline reduces throughput considerably (1.23 FPS vs. 108.89 FPS for the baseline), this computational cost is a justified trade-off for the accuracy and robustness required in mission-critical surveillance tasks where baseline methods fail.
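The three-stage structure described above (interpolate, deblur, detect) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `linear_blend` is a toy stand-in for RIFE, the `deblur` and `detect` callables are hypothetical placeholders for MIMO U-NetPlus and YOLOv8n, and frames are represented as plain lists of floats rather than images. Only the 30 FPS and 120 FPS rates and the two throughput figures are taken from the text.

```python
from typing import Callable, List, Sequence

Frame = List[float]  # toy stand-in for an image (assumption, not the paper's format)


def linear_blend(a: Frame, b: Frame, t: float) -> Frame:
    """Placeholder interpolator; a learned model such as RIFE would go here."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]


def upsample(
    frames: Sequence[Frame],
    factor: int,
    blend: Callable[[Frame, Frame, float], Frame] = linear_blend,
) -> List[Frame]:
    """Insert factor - 1 intermediate frames between each consecutive pair."""
    out: List[Frame] = []
    for a, b in zip(frames, frames[1:]):
        out.append(list(a))
        for k in range(1, factor):
            out.append(blend(a, b, k / factor))
    if frames:
        out.append(list(frames[-1]))
    return out


def run_pipeline(
    frames: Sequence[Frame],
    factor: int,
    deblur: Callable[[Frame], Frame],
    detect: Callable[[Frame], object],
) -> List[object]:
    """Interpolate first, then deblur and detect on every resulting frame."""
    return [detect(deblur(f)) for f in upsample(frames, factor)]


def per_frame_time_ms(fps: float) -> float:
    """Average time per frame implied by a throughput figure."""
    return 1000.0 / fps


# Figures quoted in the abstract.
factor = 120 // 30                         # 4x, i.e. 3 new frames per original pair
slowdown = 108.89 / 1.23                   # roughly 88.5x lower throughput
pipeline_ms = per_frame_time_ms(1.23)      # roughly 813 ms per frame
baseline_ms = per_frame_time_ms(108.89)    # roughly 9.2 ms per frame
```

Going from 30 FPS to 120 FPS is a 4x upsampling, so three frames are synthesized between each original pair. The quoted throughputs imply roughly 813 ms per frame for the full pipeline against roughly 9.2 ms for the baseline, assuming both figures are measured per processed frame, which the abstract does not state.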
Kim et al. studied this question.