Detecting small objects remains a difficult problem in computer vision. Important features are lost and edge information is disrupted during image processing, and detectors perform inconsistently across varying conditions. Although modern deep learning methods, especially YOLO-based detectors, deliver fast predictions, they often fail to detect small objects accurately in high-resolution images. Moreover, current explainability methods such as Grad-CAM are computationally expensive to apply to every prediction. To address these issues, this paper presents a Generalised and Explainable Multi-Scale YOLO-SAHI framework that combines slicing-aided inference, adaptive feature enhancement, and selective explainability. By combining YOLO-based detection with the SAHI approach, the system uses overlap-based slicing to localise tiny objects more precisely and to reduce losses at slice boundaries. It also employs Contrast Limited Adaptive Histogram Equalisation (CLAHE) to improve visibility in low-contrast and poorly lit images. A key contribution of this study is an uncertainty-triggered explainability mechanism, in which Grad-CAM is applied only to low-confidence predictions; this reduces computational cost while preserving interpretability. The proposed model is evaluated across several domains, including industrial fault detection, agricultural disease detection, and medical image analysis, to assess how well it generalises. Expected outcomes include improved mean Average Precision (mAP) for small object detection, better performance in challenging visual conditions, and effective interpretability with reduced overhead. By offering a scalable, efficient, and interpretable framework for recognising small objects in real-world settings, this study addresses significant gaps in computational efficiency, accuracy, and generalisability.
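The two mechanisms at the core of the framework, overlap-based slicing and confidence-gated explanation, can be illustrated with a minimal sketch. The abstract does not specify an implementation, so everything here is an assumption made for illustration: the function names (`slice_windows`, `select_for_explanation`), the default slice size of 512 pixels, the overlap ratio of 0.2, the confidence threshold of 0.5, and the `score` key are all hypothetical, and the code does not reproduce the SAHI library's API or run Grad-CAM itself.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


def _starts(length: int, window: int, stride: int) -> List[int]:
    """Start offsets along one axis so that the windows cover [0, length)."""
    if length <= window:
        return [0]
    starts = list(range(0, length - window + 1, stride))
    # Add a final window flush with the border if the stride left a gap.
    if starts[-1] + window < length:
        starts.append(length - window)
    return starts


def slice_windows(width: int, height: int,
                  slice_size: int = 512,
                  overlap_ratio: float = 0.2) -> List[Box]:
    """Tile an image into overlapping square windows (row-major order).

    Neighbouring windows share roughly `overlap_ratio` of their extent, so an
    object cut by one window's border lies fully inside an adjacent window.
    """
    if not 0.0 <= overlap_ratio < 1.0:
        raise ValueError("overlap_ratio must be in [0, 1)")
    stride = max(1, int(slice_size * (1.0 - overlap_ratio)))
    return [
        (x, y, min(x + slice_size, width), min(y + slice_size, height))
        for y in _starts(height, slice_size, stride)
        for x in _starts(width, slice_size, stride)
    ]


def select_for_explanation(detections: List[Dict],
                           conf_threshold: float = 0.5) -> List[Dict]:
    """Keep only detections whose confidence falls below the threshold.

    Only these would be passed to an explainer such as Grad-CAM; confident
    detections skip the explanation step entirely.
    """
    return [d for d in detections if d["score"] < conf_threshold]


if __name__ == "__main__":
    windows = slice_windows(1024, 1024)
    print(len(windows), "windows; first:", windows[0], "last:", windows[-1])
    uncertain = select_for_explanation([{"score": 0.9}, {"score": 0.3}])
    print("detections sent to the explainer:", uncertain)
```

In this sketch the saving from selective explainability depends on the share of detections that fall below the chosen threshold, which in turn depends on the data and the detector.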
UPPALA VIJAY KUMAR studied this question.