Lung cancer remains one of the leading causes of cancer-related mortality worldwide, and accurate histopathological classification is essential for timely diagnosis and treatment planning. This study presents a Contrastive Language-Image Pretraining (CLIP)-based framework for multiclass lung histopathology classification that distinguishes among benign lung tissue, lung adenocarcinoma, and lung squamous cell carcinoma. The approach combines a pretrained CLIP ViT-B/32 backbone, domain-specific prompt engineering, multimodal image-text pairing, and similarity-based classification within a shared embedding space. To improve convergence and robustness during fine-tuning, the training pipeline incorporates data augmentation, Focal Loss, AdamW optimization, OneCycle learning rate scheduling, mixed-precision training, gradient clipping, and early stopping. The dataset is organized into separate training, validation, and testing splits, with the reported training and validation partitions containing 3,500 and 500 images per class, respectively. Training on a Tesla T4 GPU showed steady improvement across epochs, with the best validation accuracy reaching 95.20%, accompanied by a macro AUC of 0.9870 and a micro AUC of 0.9877, before early stopping was triggered at epoch 23. These findings indicate that pairing CLIP with pathology-specific text prompts yields a strong framework for automated lung cancer histopathology classification, with potential for use in future digital pathology systems.
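The similarity-based classification and Focal Loss mentioned in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the prompt wording, the three-dimensional toy embeddings, the `logit_scale` value, and the focusing parameter `gamma` are all assumptions chosen for illustration, and a real pipeline would obtain the embeddings from the CLIP image and text encoders.

```python
import math

# Hypothetical prompts, one per class; the paper's actual prompt wording is not given here.
CLASS_PROMPTS = [
    "a histopathology image of benign lung tissue",
    "a histopathology image of lung adenocarcinoma",
    "a histopathology image of lung squamous cell carcinoma",
]


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(image_emb, text_embs, logit_scale=100.0):
    """Score an image embedding against one text embedding per class.

    logit_scale is an assumed temperature applied to the cosine similarities.
    Returns the predicted class index and the class probabilities.
    """
    logits = [logit_scale * cosine_similarity(image_emb, t) for t in text_embs]
    probs = softmax(logits)
    pred = max(range(len(probs)), key=probs.__getitem__)
    return pred, probs


def focal_loss(probs, target, gamma=2.0):
    """Focal loss for one sample: -(1 - p_t)^gamma * log(p_t).

    gamma is an assumed value; gamma = 0 reduces to cross-entropy.
    """
    p_t = max(probs[target], 1e-12)  # clamp to avoid log(0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


if __name__ == "__main__":
    # Toy embeddings standing in for encoder outputs (illustrative only).
    text_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    image_emb = [0.2, 0.9, 0.1]
    pred, probs = classify(image_emb, text_embs)
    print(CLASS_PROMPTS[pred], focal_loss(probs, target=pred))
```

Because the loss scales each sample by `(1 - p_t) ** gamma`, confidently correct samples contribute little to the gradient, which is the usual motivation for choosing Focal Loss over plain cross-entropy.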
Munawar et al. (Wed,) studied this question.