An integrated multi-omics machine learning model improved the prediction of COPD exacerbations (mean AUC 0.61±0.03) compared to single-omics transcriptomic or proteomic models.
Cohort (n=3,325)
Does multi-omics integration improve the prediction of COPD exacerbations compared to single-omics models in current and former smokers?
Integrating transcriptomic and proteomic data using a machine learning framework modestly improves the prediction of COPD exacerbations compared to single-omics approaches.
Effect estimate: AUC 0.61±0.03
Abstract
Introduction: Exacerbations are major drivers of morbidity and mortality in chronic obstructive pulmonary disease (COPD). Predicting which patients are at greatest risk remains challenging, as clinical and spirometric measures do not fully capture the molecular complexity of the disease. Multi-omics integration offers an opportunity to improve prediction and reveal biological pathways underlying exacerbation susceptibility by capturing cross-layer interactions that single-omics analyses often overlook.
Methods: We applied MOGONET (Multi-Omics Graph cOnvolutional NETwork), a supervised machine learning framework, to predict prospective exacerbation risk among current and former smokers in the COPDGene Phase 2 study. Exacerbation frequency was expressed as an annualized rate (events/year) and dichotomized at a data-driven threshold derived from the observed distribution, classifying subjects as frequent (≥ 1 event/year) or infrequent (< 1 event/year) exacerbators. We integrated whole-blood transcriptomic profiles (19,249 genes) with plasma proteomic data from SomaScan (4,690 proteins), yielding a final analytic cohort of 3,325 participants. Data were randomly partitioned into training (70%) and testing (30%) subsets, with repeated stratified 5-fold cross-validation to maintain balanced representation of exacerbation groups across folds. MOGONET jointly modeled within-omics relationships and cross-omics interactions for final prediction. For comparison, single-omics models used the same graph convolutional network architecture applied to individual omics layers without cross-layer integration. Model performance, averaged across cross-validation folds, was assessed using the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, and specificity. Model settings were optimized to maximize predictive performance. Feature importance scores were averaged across folds to identify robust predictors.
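The labeling and evaluation steps described in the Methods can be illustrated with a minimal sketch. This is not the authors' code and does not implement MOGONET. It only shows, in standard-library Python, how an annualized rate might be dichotomized at ≥ 1 event/year, how samples could be assigned to stratified folds, and how AUC, sensitivity, and specificity are computed. All function names and the `followup_years` input are hypothetical. The `summarize` helper assumes that "±" denotes a standard deviation across folds, which the abstract does not state.

```python
import random
from statistics import mean, stdev


def is_frequent(n_events, followup_years, threshold=1.0):
    """Return 1 (frequent exacerbator) if the annualized rate is >= threshold."""
    rate = n_events / followup_years  # events per year
    return int(rate >= threshold)


def stratified_folds(labels, k=5, seed=0):
    """Assign each sample to one of k folds, keeping class proportions similar.

    Calling this with different seeds gives the repeats of a repeated
    stratified k-fold scheme.
    """
    rng = random.Random(seed)
    fold_of = [None] * len(labels)
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            fold_of[i] = pos % k  # deal class members round-robin into folds
    return fold_of


def auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties = 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sens_spec(labels, preds):
    """Sensitivity (true positive rate) and specificity (true negative rate)."""
    tp = sum(y == 1 and p == 1 for y, p in zip(labels, preds))
    fn = sum(y == 1 and p == 0 for y, p in zip(labels, preds))
    tn = sum(y == 0 and p == 0 for y, p in zip(labels, preds))
    fp = sum(y == 0 and p == 1 for y, p in zip(labels, preds))
    return tp / (tp + fn), tn / (tn + fp)


def summarize(per_fold_values):
    """Mean and sample SD across folds (assumed meaning of 'mean±SD')."""
    return mean(per_fold_values), stdev(per_fold_values)
```

For example, `auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])` returns 0.75, because three of the four positive-negative pairs are ranked correctly. An AUC of 0.5 corresponds to chance-level ranking, which is the baseline against which the reported values should be read.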
Results: The integrated multi-omics model achieved a mean AUC of 0.61±0.03 for predicting COPD exacerbations, outperforming single-omics models based on transcriptomic (0.55±0.03) or proteomic (0.52±0.03) data alone. Mean classification accuracy was 83±3%, with balanced sensitivity and specificity across folds. Feature importance analysis identified LIN7C (epithelial junction regulation), SOD3 (extracellular superoxide dismutase), and IL-6 protein as the top predictors, implicating pathways related to epithelial repair, immune modulation, and neurovascular signaling that may underlie exacerbation susceptibility.
Conclusion: Integrating transcriptomic and proteomic data improved exacerbation prediction and revealed biologically informative molecular signatures. Future work will replicate these findings in independent cohorts, expand integration to additional omics layers, and use bivariate SHAP (SHapley Additive exPlanations) analysis to enhance model interpretability and uncover interactions among key biomarkers. Together, these efforts may deepen mechanistic understanding and advance precision strategies for monitoring, prevention, and treatment of COPD exacerbations.
Funding: This work was supported by NHLBI R01 HL167072, R01 HL124233, R01 HL166992, R01 HL171213, and K01 HL166705. The COPDGene study (NCT00608764) is supported by NIH contract 75N92023D00011 and by the COPD Foundation through contributions made to an Industry Advisory Committee that has included AstraZeneca, Bayer Pharmaceuticals, Boehringer Ingelheim, Genentech, GlaxoSmithKline, Novartis, Pfizer, and Sunovion. Molecular data from the Trans-Omics in Precision Medicine (TOPMed) program were supported by the National Heart, Lung, and Blood Institute (NHLBI).
Madavaprasad et al. (Fri,) conducted a cohort study in chronic obstructive pulmonary disease (COPD) (n=3,325). An integrated multi-omics model (MOGONET) was compared with single-omics models (transcriptomic or proteomic alone) for predicting prospective COPD exacerbation risk (≥ 1 event/year), reaching a mean AUC of 0.61±0.03.