Skin cancer is more common and can be fatal if not diagnosed and treated promptly. Automated skin lesion classification based on dermoscopic images has attracted significant attention, especially with the rapid rise of deep learning-based methods. Yet, it still suffers from limitations, including insufficient multi-scale representation, a lack of global context modelling, and less effective feature recalibration, which undermine classification reliability and robustness. Furthermore, the lack of transparency associated with many deep learning models erodes trust and acceptance among healthcare providers. To tackle these issues, we propose a new deep learning framework that combines multi-scale convolutional feature extraction with lightweight Transformer encoders, while incorporating squeeze-and-excitation (SE) blocks to improve channel-wise attention. By exploiting spatial granularity, contextual richness, and adaptive feature recalibration across latent class-specific patterns in dermoscopic images, the proposed model classifies images from the HAM10000 dataset into seven skin lesion categories. It uses a multi-stage architecture to embed both local and global patterns, and Grad-CAM as a post-hoc explainability method to promote interpretability. We conduct experimental evaluations on the publicly available HAM10000 dataset, and the results indicate that the proposed strategy achieves competitive performance with the state-of-the-art methods, reaching 94.8% overall accuracy, 91.9% macro-average F1-score, and 0.957 AUC. Class-wise performance analysis and ROC curves demonstrate robustness across a range of lesion categories, while ablation experiments verify the individual contributions of each architectural component. In conclusion, our framework outlines a possible computational approach that combines interpretation with noise resilience to refine automated classification of skin lesions. However, further prospective, multi-institutional validation will be necessary before it can have implications for future clinical decision support systems.
Murali et al. (Thu,) studied this question.