In indoor environments, the elderly, children, and people with disabilities are vulnerable to abnormal events such as violent injuries and falls, which threaten their health and safety and call for real-time automated monitoring. However, existing detection methods struggle in complex scenes because of target occlusion, heavy background interference, ambiguous features, posture deformation, and large variations in target scale. To address these challenges, this study proposes SCGS-YOLO, a real-time detection algorithm built on YOLOv8s with three modifications. First, the backbone is replaced with ShuffleNet V2 to improve inference efficiency in resource-constrained environments. Second, CBAM is integrated into the backbone and SimAM into the feature fusion module, so that dual-stream attention strengthens key features, suppresses background interference, and alleviates ambiguity caused by occlusion. Third, GIoU is used as the bounding-box localization loss to facilitate model convergence and improve localization precision for complex human postures. Experiments on the self-built VD-2025 and FD-2025 datasets show that SCGS-YOLO achieves 98.20% precision, 96.10% recall, and 98.70% mAP@0.5 for violence detection, and 97.10% precision, 93.80% recall, and 96.70% mAP@0.5 for fall detection. Compared with the original YOLOv8s and several representative models, it effectively balances detection precision and real-time response for indoor abnormal behaviors among these groups.
Qin et al. studied this question.
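The abstract names GIoU as the localization loss but does not give its form. As a minimal sketch, assuming axis-aligned boxes in `(x1, y1, x2, y2)` format and the standard definition GIoU = IoU − (|C| − |A ∪ B|) / |C|, where C is the smallest box enclosing both A and B, the loss can be written as follows. The function names `giou` and `giou_loss` are illustrative and are not taken from the authors' code.

```python
def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes given as (x1, y1, x2, y2).

    Assumes x1 < x2 and y1 < y2 for both boxes (non-degenerate boxes).
    Returns a value in (-1, 1].
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle; width/height clamp to 0 when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box C enclosing both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose = enclose_w * enclose_h

    # Penalize the part of C that is covered by neither box.
    return iou - (enclose - union) / enclose


def giou_loss(pred_box, target_box):
    """Localization loss 1 - GIoU; 0 for a perfect match, up to 2 otherwise."""
    return 1.0 - giou(pred_box, target_box)


if __name__ == "__main__":
    # Disjoint boxes: plain IoU is 0, but GIoU still reflects the gap.
    print(giou_loss((0, 0, 1, 1), (2, 0, 3, 1)))
```

Unlike plain IoU, which is zero for every pair of non-overlapping boxes, this loss still changes as a disjoint prediction moves toward the target, which is one plausible reading of the convergence benefit the abstract mentions.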