March 25, 2026 · Open Access

Key information detection of video data based on improved 3D Dense Net algorithm

Key Points

The primary aim is to improve the accuracy and robustness of key information detection in videos using an advanced deep learning algorithm.
Developed an improved 3D Dense Net algorithm.
Introduced P3D module to reduce computational complexity.
Integrated channel-spatial and temporal dual-attention mechanisms.
Employed self-distillation and cross-modal attention for better feature integration.
Achieved 95.4% accuracy on UCF-Crime and 93.1% on Surveillance Fight datasets.
Reported error rates of only 0.18 and 0.19, respectively.
Achieved AUC values of 91.8% and 96.0% on the two datasets.
Improved accuracy by 25% after implementing P3D and attention mechanisms.
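
For readers less familiar with the AUC metric reported above, the following is a minimal sketch of how an AUC value can be computed from detection scores. It uses the rank-based definition: the fraction of (positive, negative) pairs in which the positive clip receives the higher score. The function name `roc_auc` and the toy labels and scores are illustrative assumptions; they are not data or code from the study.

```python
def roc_auc(labels, scores):
    """Rank-based AUC: share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")

    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Hypothetical toy data: 1 = clip contains key information, 0 = it does not.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
print(f"AUC = {roc_auc(toy_labels, toy_scores):.3f}")
```

Because AUC depends only on how positives are ranked against negatives, it is not tied to one decision threshold the way accuracy is.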

Abstract

Efficiently and accurately extracting key information from videos has become a core challenge in computer vision. Traditional methods rely on handcrafted features and shallow models, which makes it difficult to capture complex spatiotemporal dynamics, while existing deep learning schemes still fall short in detection accuracy and precision. To improve the accuracy and robustness of key information detection in videos, an improved 3D Dense Net is proposed. The method introduces the P3D module, which decomposes spatiotemporal convolution to reduce computational complexity; integrates channel-spatial and temporal dual-attention mechanisms to enhance feature expression; and combines a self-distillation structure with a cross-modal attention mechanism to integrate visual and auditory information effectively. The proposed method reached accuracies of 95.4% and 93.1% on the UCF-Crime and Surveillance Fight datasets, respectively, significantly higher than those of traditional models. It also had the lowest errors, at only 0.18 and 0.19, and the highest AUC values, at 91.8% and 96.0%. Moreover, introducing P3D and the attention mechanisms improved the accuracy of the proposed method by 25%. The method improves the accuracy of key information detection in videos and provides a new solution for multi-modal video understanding in complex scenes.
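
To illustrate why decomposing a spatiotemporal convolution reduces cost, the sketch below counts the weights of a full 3×3×3 convolution against a pseudo-3D style pair: a 1×3×3 spatial convolution followed by a 3×1×1 temporal convolution. The channel width of 64, the serial ordering of the two convolutions, and all function names are assumptions chosen for illustration; the abstract does not state the configuration used in the paper.

```python
def conv3d_weights(c_in, c_out, k_t, k_h, k_w):
    """Weight count (bias excluded) of a 3D convolution with a k_t x k_h x k_w kernel."""
    return c_in * c_out * k_t * k_h * k_w


def full_3d_weights(c_in, c_out, k=3):
    """A single k x k x k spatiotemporal convolution."""
    return conv3d_weights(c_in, c_out, k, k, k)


def p3d_weights(c_in, c_out, k=3, c_mid=None):
    """A 1 x k x k spatial convolution followed by a k x 1 x 1 temporal one.

    c_mid is the channel width between the two convolutions; it defaults
    to c_out, which is an assumption made for this sketch.
    """
    c_mid = c_out if c_mid is None else c_mid
    spatial = conv3d_weights(c_in, c_mid, 1, k, k)
    temporal = conv3d_weights(c_mid, c_out, k, 1, 1)
    return spatial + temporal


# Hypothetical layer with 64 input and 64 output channels.
full = full_3d_weights(64, 64)
decomposed = p3d_weights(64, 64)
print(f"full 3D: {full} weights")
print(f"decomposed: {decomposed} weights ({decomposed / full:.1%} of full)")
```

With equal channel widths, the decomposed pair needs 9 + 3 = 12 weights per channel pair instead of 27, so under these assumptions the saving is the same at any width.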
