• Developed a CNN-Transformer dual-branch network for multi-modal tobacco defect detection.
• Designed an adaptive fusion mechanism to enhance cross-modal feature consistency.
• Built a dynamic detection framework to improve small lesion localization.
• Introduced a multi-scale loss function to optimize learning for small targets.
Zhang et al. (Thu,) studied this question.