Early and reliable diagnosis of skin cancer from dermoscopic images remains challenging because of class imbalance, subtle inter-class variation, ambiguous lesion boundaries, and inconsistent illumination, all of which can degrade the robustness of conventional convolutional neural networks (CNNs). To address these limitations, this study proposes an automated smart healthcare framework for dermoscopic skin cancer diagnosis built on an Enhanced Vision Transformer (E-ViT), which improves global-context modeling through self-attention while strengthening fine-grained lesion representation learning. Unlike standard ViT configurations, the proposed architecture integrates multi-scale patch embedding and attention refinement to better capture the border irregularities and color–texture heterogeneity that are critical for melanoma discrimination. Furthermore, to remove the need for manual hyperparameter tuning and promote stable convergence on imbalanced datasets, we introduce an Improved Grey Wolf Optimization (IGWO) strategy with Dimension Learning–Based Hunting (DLH) as the sole hyperparameter optimization engine. DLH-IGWO adaptively balances exploration and exploitation by updating search directions dimension by dimension, enabling efficient discovery of near-optimal training settings (e.g., learning rate, batch size, patch size, transformer depth, attention dropout, and weight decay) under constrained computational budgets. Experimental results on the HAM10000 and ISIC-2019 datasets show that the proposed DLH-IGWO–E-ViT consistently outperforms strong CNN and ViT baselines, achieving 98.37% accuracy, 97.78% F1-score, and 99.12% AUC on HAM10000, and 97.21% accuracy with 95.93% macro-F1 on ISIC-2019. The model also reduces the melanoma false-negative rate to 3.16%. All gains are statistically significant (p < 0.001), and the model remains computationally feasible for clinical deployment.
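The DLH-IGWO search described in the abstract can be illustrated with a minimal sketch. It assumes the commonly described improved grey wolf scheme: each wolf builds one candidate from the three best wolves (classic GWO) and a second candidate through dimension-wise learning from neighbours, then keeps whichever improves its fitness. The function names `dlh_igwo` and `toy_val_loss`, the search bounds, and the quadratic objective are hypothetical stand-ins for training and validating E-ViT; this is not the authors' implementation.

```python
import math
import random


def dlh_igwo(objective, bounds, n_wolves=12, n_iters=60, seed=0):
    """Minimise `objective` over box `bounds` with a simplified DLH-IGWO loop."""
    rng = random.Random(seed)
    dim = len(bounds)

    def clip(x):
        # Keep every coordinate inside its [low, high] range.
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    fitness = [objective(w) for w in wolves]

    for t in range(n_iters):
        a = 2.0 * (1.0 - t / n_iters)  # shrinks from 2 toward 0: explore, then exploit
        order = sorted(range(n_wolves), key=lambda i: fitness[i])
        leaders = [wolves[i][:] for i in order[:3]]  # alpha, beta, delta

        new_wolves, new_fitness = [], []
        for i, x in enumerate(wolves):
            # Candidate 1: classic GWO move toward the three leaders.
            x_gwo = []
            for d in range(dim):
                est = 0.0
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    est += leader[d] - A * abs(C * leader[d] - x[d])
                x_gwo.append(est / 3.0)
            x_gwo = clip(x_gwo)

            # Candidate 2: DLH. Neighbours lie within the distance between
            # the wolf and its GWO candidate (the wolf itself always qualifies).
            radius = math.dist(x, x_gwo)
            neighbours = [w for w in wolves if math.dist(x, w) <= radius]
            x_dlh = []
            for d in range(dim):
                x_n = rng.choice(neighbours)  # random neighbour
                x_r = rng.choice(wolves)      # random wolf from the whole pack
                x_dlh.append(x[d] + rng.random() * (x_n[d] - x_r[d]))
            x_dlh = clip(x_dlh)

            # Greedy selection: accept the better candidate only if it improves.
            f_gwo, f_dlh = objective(x_gwo), objective(x_dlh)
            cand, f_cand = (x_gwo, f_gwo) if f_gwo <= f_dlh else (x_dlh, f_dlh)
            if f_cand < fitness[i]:
                new_wolves.append(cand)
                new_fitness.append(f_cand)
            else:
                new_wolves.append(x)
                new_fitness.append(fitness[i])
        wolves, fitness = new_wolves, new_fitness

    best = min(range(n_wolves), key=lambda i: fitness[i])
    return wolves[best], fitness[best]


def toy_val_loss(x):
    # Hypothetical surrogate for validation loss; x = [log10(learning rate), dropout].
    log_lr, dropout = x
    return (log_lr + 3.5) ** 2 + 10.0 * (dropout - 0.1) ** 2


# Hypothetical search ranges: log10(lr) in [-5, -2], attention dropout in [0, 0.5].
BOUNDS = [(-5.0, -2.0), (0.0, 0.5)]
best_x, best_f = dlh_igwo(toy_val_loss, BOUNDS)
print("best log10(lr), dropout:", best_x, "loss:", best_f)
```

In a real pipeline each call to the objective would train and validate a model, so the pack size and iteration count would be set by the available compute budget. Discrete settings such as batch size, patch size, and transformer depth would need rounding or an index encoding, which this sketch leaves out.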
Abugabah et al. studied this question.