Lung cancer remains a major cause of cancer-related mortality worldwide, largely because it is often diagnosed at a late stage: early lesions are subtle on chest CT scans, and manual interpretation is labor-intensive and error-prone. This work presents an efficient Hybrid ViT-Mini + CNN framework that combines 3D convolutional local feature extraction with transformer-based self-attention for global contextual modeling across CT slices, supporting both lung nodule segmentation and malignancy classification. Evaluated on the LIDC-IDRI dataset using a single NVIDIA T4 GPU with mixed-precision training, a composite Dice + cross-entropy loss, and OneCycle scheduling, the proposed model achieves an 86.8% Dice score, 88.9% sensitivity, 89.3% classification accuracy, and 0.93 AUC. Key contributions include volumetric 3D self-attention for improved interpretability of low-contrast nodules, lightweight hybrid fusion for clinical deployability, and a unified dual-task pipeline that advances computer-aided diagnosis systems for early lung cancer screening.
Saraswathi et al. studied this question.
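The composite Dice + cross-entropy loss named in the abstract can be illustrated with a minimal sketch. The abstract does not give the weighting between the two terms or the exact Dice formulation, so the `dice_weight` parameter, the soft Dice variant, and all function names below are assumptions made for illustration only. The code operates on flattened voxel probabilities for a binary nodule mask and is not the authors' implementation.

```python
import math


def soft_dice_loss(probs, targets, eps=1e-6):
    """Return 1 - soft Dice coefficient for flattened voxel probabilities.

    probs:   predicted foreground probabilities in [0, 1]
    targets: ground-truth labels (0 or 1), same length as probs
    """
    intersection = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    # eps keeps the ratio defined when both masks are empty.
    return 1.0 - (2.0 * intersection + eps) / (denom + eps)


def binary_cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross-entropy, with probabilities clamped away from 0 and 1."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(probs)


def composite_loss(probs, targets, dice_weight=0.5):
    """Weighted sum of Dice loss and cross-entropy.

    dice_weight is a hypothetical hyperparameter; the abstract does not
    state how the two terms are balanced.
    """
    dice = soft_dice_loss(probs, targets)
    bce = binary_cross_entropy(probs, targets)
    return dice_weight * dice + (1.0 - dice_weight) * bce


if __name__ == "__main__":
    # Toy example: four voxels, two of which belong to a nodule.
    targets = [1, 0, 1, 0]
    good = [0.9, 0.1, 0.8, 0.2]
    poor = [0.2, 0.8, 0.3, 0.7]
    print("good prediction loss:", composite_loss(good, targets))
    print("poor prediction loss:", composite_loss(poor, targets))
```

Combining the two terms is a common choice for segmentation with small foreground regions: the Dice term is insensitive to the large number of background voxels, while the cross-entropy term provides smoother per-voxel gradients.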