Abstract

Background: Recent advances in antibody-drug conjugates have demonstrated clinical benefit in patients with HER2-positive breast cancer as well as in those with HER2-low and HER2-ultralow tumors. However, prior studies have shown that only a fraction of patients who are HER2-positive by IHC respond to HER2-targeted therapy. Orthogonal methods of determining HER2 activity, such as gene expression, have the potential to improve patient selection. We used gene expression profiling to build a multi-gene machine learning model that predicts HER2 activity, which may help identify the patients most likely to benefit from HER2-targeted therapies, particularly those with HER2-low tumors.

Methods: Our cohort comprised 477 patients with breast cancer who underwent RNA-based gene expression profiling of 1,517 genes using FoundationOne®RNA (a laboratory developed test by Foundation Medicine, Inc.). HER2 IHC status was abstracted from pathology reports (IHC 0: 182; 1+: 137; 2+: 128; 3+: 30). Differentially expressed genes (DEGs) between the HER2 IHC 0 (label: no activity) and 2+/3+ (label: high activity) groups were identified based on an FDR p-value < 0.01 and log2(fold change) > 1. A random 25% of IHC 0 and 2+ samples and 50% of 3+ samples were held out prior to modeling. Using a 70:30 (training:test) split of the remaining samples, lasso regression was applied to the expression of the DEGs (quantified as transcripts per million, TPM) to train the model. Held-out samples were used to assess the proportion of samples predicted to have high HER2 activity across all IHC categories. This procedure was repeated for 50 iterations, and summary metrics were calculated across all iterations.

Results: A total of 44 DEGs were used to train the model; ERBB2, TTYH1 and GRB7 were among the top predictors. When the model was run on held-out samples across 50 iterations, IHC 3+ samples had the highest proportion predicted as "high activity" (median proportion 87%, IQR 87%-93%), with decreasing proportions in 2+ (58%, IQR 47%-63%), 1+ (41%, IQR 35%-44%) and 0 (22%, IQR 17%-28%). Notably, HER2 3+ samples with no predicted activity had a significantly lower median ERBB2 expression (4,825 TPM; IQR: 4,502-5,167) than high-activity samples (34,146 TPM; IQR: 21,568-44,604) (P < 0.0001) and were more frequently ERBB2 amplification-negative (OR = 14.2, P < 0.0001). Additional performance assessment in an independent validation cohort will be presented.

Conclusion: A multi-gene expression signature could identify signaling pathways that may better inform HER2 dependency. The machine learning-based model developed here predicted high HER2 activity in a progressively larger proportion of cases as HER2 IHC staining increased, with potential utility for identifying patients who may benefit from HER2-targeted therapies. Exploration in a clinical cohort is warranted.

Citation Format: S. D. Sisoudiya, J. S. Ross, R. B. Keller-Evans, R. S. Huang, A. B. Schrock, G. Frampton, E. M. Ebot, E. S. Sokol, S. Sivakumar, A. M. Kahn, M. Lustberg. Machine learning based inference of real-time HER2 activity from gene expression profiling in breast cancer to inform HER2 targeted therapy selection [abstract]. In: Proceedings of the San Antonio Breast Cancer Symposium 2025; 2025 Dec 9-12; San Antonio, TX. Philadelphia (PA): AACR; Clin Cancer Res 2026;32(4 Suppl):Abstract nr PS4-03-19.
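The resampling scheme described in the Methods (per-category hold-out, a 70:30 split of the remainder, and 50 repetitions summarized by median and IQR) can be sketched as follows. This is a minimal illustration rather than the authors' pipeline. The function names, the `fit` callback standing in for the lasso training step, the truncating rounding, and the choice to score all IHC 1+ samples without using them for training are assumptions; the abstract does not specify these details.

```python
import random
import statistics

# Hold-out fractions per IHC category, as stated in the Methods.
HOLD_OUT_FRACTION = {"0": 0.25, "2+": 0.25, "3+": 0.50}


def split_once(samples_by_ihc, rng):
    """Return (train, test, held_out) lists of (sample_id, ihc) pairs."""
    held_out, remaining = [], []
    for ihc, ids in samples_by_ihc.items():
        ids = list(ids)
        rng.shuffle(ids)
        frac = HOLD_OUT_FRACTION.get(ihc)
        if frac is None:
            # Assumption: categories with no stated fraction (IHC 1+) carry
            # no training label, so all of their samples are only scored.
            n_held = len(ids)
        else:
            n_held = int(len(ids) * frac)  # assumption: truncate, not round
        held_out += [(s, ihc) for s in ids[:n_held]]
        remaining += [(s, ihc) for s in ids[n_held:]]
    rng.shuffle(remaining)
    n_train = int(len(remaining) * 0.7)  # 70:30 training:test split
    return remaining[:n_train], remaining[n_train:], held_out


def high_activity_proportions(held_out, predict_high):
    """Fraction of held-out samples called 'high activity' per IHC category."""
    calls = {}
    for sample_id, ihc in held_out:
        calls.setdefault(ihc, []).append(bool(predict_high(sample_id)))
    return {ihc: sum(v) / len(v) for ihc, v in calls.items()}


def summarize(per_iteration):
    """Median and quartiles (Q1, Q3) of each proportion across iterations."""
    summary = {}
    for ihc in per_iteration[0]:
        values = [p[ihc] for p in per_iteration]
        q1, _, q3 = statistics.quantiles(values, n=4)
        summary[ihc] = (statistics.median(values), q1, q3)
    return summary


def run(samples_by_ihc, fit, n_iterations=50, seed=0):
    """Repeat split, fit, and scoring; `fit` is a hypothetical callback that
    trains on (train, test) and returns a sample_id -> bool predictor."""
    rng = random.Random(seed)
    per_iteration = []
    for _ in range(n_iterations):
        train, test, held_out = split_once(samples_by_ihc, rng)
        predict_high = fit(train, test)
        per_iteration.append(high_activity_proportions(held_out, predict_high))
    return summarize(per_iteration)
```

Passing a `fit` function that wraps an L1-penalized classifier trained on the DEG expression values would reproduce the overall shape of the analysis; each entry of the returned dictionary is a `(median, Q1, Q3)` tuple for one IHC category.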
S. D. Sisoudiya
J. S. Ross
R. B. Keller-Evans
Clinical Cancer Research
Yale Cancer Center
Drug Discovery Laboratory (Norway)
Amyotrophic Lateral Sclerosis Therapy Development Institute
www.synapsesocial.com/papers/6996a957ecb39a600b3f0501 — DOI: https://doi.org/10.1158/1557-3265.sabcs25-ps4-03-19