BACKGROUND: Accurate lung cancer subtyping from CT images is essential for treatment planning. However, manual interpretation suffers from inter-observer variability across morphologically similar subtypes. METHODS: To address this limitation, we propose a dual-branch cross-attention fusion network that integrates ConvNeXt-Small and Swin Transformer-Small. Together, the two branches capture both local textures and global structural representations. A learnable cross-attention module then fuses the two streams into a unified 512-dimensional descriptor. We train the network on a four-class dataset. RESULTS: The network achieves 98.46% accuracy and a 98.45% F1-score on the test set, outperforming state-of-the-art baselines. CONCLUSIONS: The proposed framework shows clinical potential for non-invasive subtyping, offering a robust tool for personalised treatment planning while reducing diagnostic subjectivity.
Liu et al. studied this question.
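The fusion step described under METHODS can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify how the cross-attention module is wired, so the bidirectional single-head attention, the mean pooling, the attention width of 256, and the helper names (`cross_attention`, `fuse`, `init_params`) are all assumptions made for illustration. Random arrays stand in for the ConvNeXt-Small and Swin Transformer-Small feature maps, assumed here to be 49 tokens of width 768 per branch. Only the 512-dimensional descriptor and the four output classes come from the text.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, context, w_q, w_k, w_v):
    """Single-head scaled dot-product attention: `queries` attend to `context`."""
    q = queries @ w_q                          # (n_q, d_attn)
    k = context @ w_k                          # (n_c, d_attn)
    v = context @ w_v                          # (n_c, d_attn)
    scores = q @ k.T / np.sqrt(q.shape[-1])    # (n_q, n_c)
    weights = softmax(scores, axis=-1)         # each row sums to 1
    return weights @ v, weights


def init_params(rng, d_in=768, d_attn=256, d_out=512, n_classes=4):
    # Hypothetical parameter layout; in a real network these would be learned.
    def w(*shape):
        return rng.normal(0.0, shape[0] ** -0.5, size=shape)

    return {
        "c2s": (w(d_in, d_attn), w(d_in, d_attn), w(d_in, d_attn)),
        "s2c": (w(d_in, d_attn), w(d_in, d_attn), w(d_in, d_attn)),
        "w_out": w(2 * d_attn, d_out),
        "w_cls": w(d_out, n_classes),
    }


def fuse(cnn_tokens, swin_tokens, params):
    # Assumed design: each branch queries the other, then both are pooled.
    c2s, _ = cross_attention(cnn_tokens, swin_tokens, *params["c2s"])
    s2c, _ = cross_attention(swin_tokens, cnn_tokens, *params["s2c"])
    pooled = np.concatenate([c2s.mean(axis=0), s2c.mean(axis=0)])
    descriptor = pooled @ params["w_out"]      # unified 512-d descriptor
    probs = softmax(descriptor @ params["w_cls"])  # four-class probabilities
    return descriptor, probs


rng = np.random.default_rng(0)
params = init_params(rng)
cnn_tokens = rng.normal(size=(49, 768))    # stand-in for ConvNeXt-Small features
swin_tokens = rng.normal(size=(49, 768))   # stand-in for Swin-Small features
descriptor, probs = fuse(cnn_tokens, swin_tokens, params)
print(descriptor.shape, probs.shape)       # (512,) (4,)
```

In this sketch, letting each stream query the other is one plausible way for local-texture tokens to be reweighted by global context and vice versa. A trained model would replace the random weights and stand-in tokens with learned parameters and real backbone outputs.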