In complex industrial settings, detection remains difficult because of uneven lighting, occlusion, and large variations in target scale. This paper proposes an efficient convolutional neural network model that integrates dynamic attention with multi-scale feature enhancement. The model uses ResNet-50 as its backbone and incorporates a Dynamic Attention Module (DAM) that adaptively combines channel and spatial attention to sharpen the focus on critical regions. A Multi-Scale Feature Fusion Module (MFFM) establishes cross-level feature interaction pathways, improving detection of small and deformable objects. In addition, depthwise separable convolutions and channel pruning are introduced to make the model lightweight. Experimental results on NEU-DET, DAGM 2007, and a self-built assembly line dataset show that the proposed method achieves mAP values of 83.6%, 95.3%, and 86.9%, respectively, while maintaining a real-time detection speed of 41 FPS. The method outperforms mainstream approaches such as YOLOv5s and EfficientDet, supporting its accuracy, robustness, and real-time capability in complex industrial environments.
Rongzhen Zhang (Sun,) studied this question.
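
To illustrate why depthwise separable convolutions help with lightweighting, the following sketch compares the weight count of a standard convolution with that of its depthwise separable counterpart. The layer size (256 input channels, 256 output channels, 3 x 3 kernel) and the function names are assumptions chosen for illustration; they are not taken from the paper, and biases are ignored.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise


# Hypothetical layer size for illustration only (not from the paper).
c_in, c_out, k = 256, 256, 3

standard = conv_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)
ratio = separable / standard  # equals 1/c_out + 1/k**2

print(f"standard:  {standard} weights")
print(f"separable: {separable} weights")
print(f"ratio:     {ratio:.3f}")
```

For this assumed layer the separable form needs roughly 11.5% of the weights of the standard convolution, which is the kind of saving that, together with channel pruning, can make real-time inference feasible. The actual reduction in the proposed model depends on which layers are replaced, which the abstract does not specify.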