To address complex backgrounds, character scale variation (in particular, small characters that appear enlarged because of bone cracks), and low contrast in oracle bone inscription detection, this paper proposes a detection model that integrates frequency-domain attention with multi-scale optimization. The model extracts frequency-domain features with a wavelet transform and uses an enhanced shuffle attention module to adaptively fuse frequency-domain and spatial-domain features, suppressing crack-texture interference at character boundaries. The C3 module is improved by incorporating a feature pyramid structure, which strengthens multi-scale feature representation for scale-distorted characters. A dedicated large-object detection head, built with large-receptive-field convolutions and context-aware mechanisms, optimizes detection of “pseudo-large characters” produced by crack-induced morphological expansion. Experiments on the OBIMD dataset show that the model achieves 51.9% mAP@0.5, a 4.9% improvement over the baseline YOLOv8s, with significant advantages in detecting characters whose scale is altered by cracks.
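As a rough illustration of the wavelet-based frequency-domain feature extraction mentioned above, the following minimal sketch computes a single-level 2D Haar decomposition of a grayscale patch. The abstract does not state which wavelet family, decomposition level, or implementation the model uses, so the Haar basis, the function name `haar_dwt2`, and the sub-band labels are assumptions made only for this example.

```python
def haar_dwt2(image):
    """Single-level 2D Haar decomposition of a grayscale patch.

    `image` is a list of equal-length rows with even height and width.
    Returns four half-resolution sub-bands (ll, lh, hl, hh).
    Hypothetical helper for illustration; not the paper's implementation.
    """
    if not image or not image[0]:
        raise ValueError("image must be non-empty")
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2 or any(len(r) != cols for r in image):
        raise ValueError("image must be rectangular with even height and width")

    ll, lh, hl, hh = [], [], [], []
    for r in range(0, rows, 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for c in range(0, cols, 2):
            # The four pixels of one 2x2 block:
            #   a b
            #   d e
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            # Orthonormal Haar: each coefficient is a signed sum divided by 2.
            ll_row.append((a + b + d + e) / 2)  # local average (low-pass both ways)
            lh_row.append((a - b + d - e) / 2)  # difference between left and right columns
            hl_row.append((a + b - d - e) / 2)  # difference between top and bottom rows
            hh_row.append((a - b - d + e) / 2)  # diagonal difference
        ll.append(ll_row)
        lh.append(lh_row)
        hl.append(hl_row)
        hh.append(hh_row)
    return ll, lh, hl, hh


if __name__ == "__main__":
    # Vertical stripes: intensity changes only from left to right.
    stripes = [[0, 1, 0, 1] for _ in range(4)]
    ll, lh, hl, hh = haar_dwt2(stripes)
    print("ll:", ll)
    print("lh:", lh)
    print("hl:", hl)
    print("hh:", hh)
```

In this sketch, thin high-contrast structures such as crack edges show up mainly in the three detail sub-bands, while the `ll` band keeps a smoothed copy of the patch. A detector could feed such sub-bands to an attention module alongside spatial features, though how the proposed model actually does so is not specified here.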
Liu et al. (Tue,) studied this question.