Diagnosis and prognosis of lung cancer via PET/CT imaging have long been major clinical concerns. However, existing multimodal approaches often focus on feature aggregation rather than cross-modal interaction, and therefore fail to capture the structural-metabolic correlations and multi-scale synergy essential for characterizing complex lesions. This study proposes TriFuse-Net, a tri-branch PET/CT fusion pyramid network (FPN) enhanced by lesion-guided structural-metabolic attention (LSMA), to improve both diagnostic and prognostic prediction. The model is composed of two identical unimodal branches (one for PET, one for CT) and one pyramid branch with interacting channel and spatial attention. The pyramid structure enables bidirectional multi-scale feature extraction and fusion, capturing both local details and global semantic information of lesions. Comprehensive experiments supported the model's superiority across three clinical tasks. TriFuse-Net achieved a C-index of 0.747 for progression-free survival (PFS) prediction, showing improvements of 14.7% and 11.0% over ResNet-CT and ResNet-PET, respectively. Additionally, the clinical-integrated model (TriFuse-Net-Cli) achieved AUCs of 0.947 for differentiating lung cancer from tuberculosis and 0.937 for identifying lymph-node metastasis. Ablation studies further confirmed the essential contributions of both the FPN and LSMA. In summary, the proposed framework demonstrates that integrating multi-scale structural-metabolic relationships significantly enhances diagnosis and prognosis in lung cancer.
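For readers unfamiliar with the C-index reported for PFS prediction, the following is a minimal sketch of Harrell's concordance index, a common definition of that metric. The abstract does not state which C-index variant or software was used, so this is an assumption for illustration only; the function name `concordance_index`, its arguments, and the toy data are hypothetical, and pairs with tied follow-up times are simply skipped here.

```python
from itertools import combinations
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risks: Sequence[float],
) -> float:
    """Harrell's C-index (simplified sketch, not the paper's implementation).

    times  : follow-up time per patient (e.g. months to progression or censoring)
    events : 1 if progression was observed, 0 if the patient was censored
    risks  : model-predicted risk score (higher means earlier expected progression)
    """
    if not (len(times) == len(events) == len(risks)):
        raise ValueError("times, events and risks must have the same length")

    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # assumption: tied times are skipped in this sketch
        # Let `a` be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0  # higher risk progressed first: concordant
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy example with made-up values (not data from the study).
toy_times = [5, 8, 12, 20]
toy_events = [1, 0, 1, 1]
toy_risks = [0.9, 0.4, 0.1, 0.6]
toy_c_index = concordance_index(toy_times, toy_events, toy_risks)
```

Under this definition a value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk, which is the scale on which a C-index such as 0.747 is read.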
Liu et al. (Thu,) studied this question.