February 13, 2025

Hybrid CNN-ViT Architecture for Early Cancer Diagnosis: Advancing Imaging Data Analysis with Explainable AI

Md Rokibul Hasan (Southeast Missouri State University), Mohammad Balayet Hossain Sakil (Trine University), Md Amit Hasan (University of Connecticut)

Abstract

Early and accurate diagnosis of cancer is crucial for improving patient outcomes. This study proposes a hybrid deep learning architecture that integrates Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) to exploit their complementary strengths in imaging data analysis. The model was evaluated on the HAM10000 and Melanoma Skin Cancer datasets, where the authors report state-of-the-art performance across multiple metrics. On the HAM10000 dataset, the hybrid CNN-ViT model achieved an accuracy of 94.23%, a precision of 93.89%, a recall of 93.56%, an F1-score of 93.72%, and a ROC-AUC of 95.18%. On the Melanoma dataset, it achieved an accuracy of 93.78%, a precision of 93.45%, a recall of 93.12%, an F1-score of 93.28%, and a ROC-AUC of 94.56%. These results outperform traditional CNNs, such as ResNet-50, and standalone ViTs, indicating that the hybrid model’s design captures both local spatial features and global contextual relationships effectively.
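As a quick sanity check on the reported figures, the F1-score can be recomputed as the harmonic mean of precision and recall. The sketch below is illustrative only: the helper name `f1_score` and the `reported` dictionary are not from the paper, and it assumes the reported F1 was derived from the aggregate precision and recall given in the abstract (the abstract does not state the averaging scheme, and a per-class macro-averaged F1 would not in general satisfy this identity).

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (result is in the same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures copied from the abstract, in percent.
reported = {
    "HAM10000": {"precision": 93.89, "recall": 93.56, "f1": 93.72},
    "Melanoma": {"precision": 93.45, "recall": 93.12, "f1": 93.28},
}

for name, m in reported.items():
    computed = f1_score(m["precision"], m["recall"])
    print(f"{name}: computed F1 = {computed:.2f}%, reported F1 = {m['f1']:.2f}%")
```

Under that assumption, the recomputed values agree with the reported F1-scores to two decimal places on both datasets.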
