What question did this study set out to answer?

The aim is to validate a gene pair-based machine learning model for predicting overall survival in metastatic breast cancer patients.

April 5, 2026

Abstract 67: Validation of a gene pair-based machine learning model for treatment prioritization in breast cancer.

Key Result

A gene pair-based machine learning model for breast cancer predicted 5-year overall survival within 8 percentage points of observed outcomes (70.0% predicted vs 62.9% observed). On average, treatment category explained 72% of the variance in predicted overall survival.

Key Points

Analyzed 30 metastatic breast cancer patients who received next-generation sequencing-directed therapy.
Generated logit scores, representing the probability of death at 5 years, across 8 treatment categories for each patient.
Used a decoder model in a virtual clinical trial to compute inverse logits, which represent predicted overall survival (POS).
Analyzed mean inverse logits per patient with repeated-measures ANOVA, Friedman tests, and pairwise Holm t-tests.
Predicted overall survival rate was 70.0%, compared with an observed rate of 62.9%.
A one-sample z-test found a significant difference between predicted and observed rates (Z=6.82, p<0.001).
On average, treatment category explained 72% of the variance in predicted overall survival (mean η²=0.72).
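
The logit-to-probability step in the points above can be illustrated with a minimal Python sketch. It assumes the standard inverse-logit (logistic) function and a plain arithmetic mean. The abstract does not publish the model's logits or state the exact convention linking the death logit to POS, so the input values and the helper names `inverse_logit` and `mean_inverse_logit` are hypothetical.

```python
import math

def inverse_logit(x: float) -> float:
    """Standard logistic function: maps a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def mean_inverse_logit(logits: list[float]) -> float:
    """Average the inverse logits of several scores (hypothetical helper)."""
    if not logits:
        raise ValueError("logits must not be empty")
    return sum(inverse_logit(x) for x in logits) / len(logits)

# Hypothetical logits for one patient and one treatment category;
# these are illustrative values, not data from the study.
example_logits = [0.5, 1.0, 1.2]
mean_probability = mean_inverse_logit(example_logits)
print(f"mean inverse logit: {mean_probability:.3f}")
```

A logit of 0 maps to a probability of 0.5, and logits of equal magnitude and opposite sign map to complementary probabilities, which is why the sign convention matters when reading a score as death probability or as survival probability.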

Structured PICO

Does a gene pair-based machine learning model accurately predict 5-year overall survival in metastatic breast cancer patients receiving NGS-directed therapy?

Population

30 metastatic breast cancer patients who received next-generation sequencing (NGS)-directed therapy with FoundationOne Companion Diagnostic (CDx) profiling

Intervention

Gene pair-based machine learning model for treatment prioritization based on 5-year predicted overall survival

Comparator

Actual observed outcomes

Outcome

5-year predicted overall survival (POS)

A gene pair-based machine learning model demonstrated strong internal consistency and predicted 5-year overall survival within 8% of observed outcomes in metastatic breast cancer patients.

Main Result

Predicted vs observed overall survival rate: 70.0% vs 62.9%
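
The gap between the reported predicted rate (70.0%) and observed rate (62.9%) can be expressed in two ways, and the short Python sketch below computes both. Only the two rates come from the abstract; the variable names are illustrative. The "within 8%" figure is consistent with the absolute difference in percentage points rather than with the relative difference.

```python
predicted_os = 70.0  # predicted 5-year overall survival, percent (from the abstract)
observed_os = 62.9   # observed overall survival, percent (from the abstract)

# Absolute gap, in percentage points
absolute_gap = predicted_os - observed_os

# Relative gap, as a percentage of the observed rate
relative_gap = absolute_gap / observed_os * 100.0

print(f"absolute gap: {absolute_gap:.1f} percentage points")
print(f"relative gap: {relative_gap:.1f}% of the observed rate")
```

The absolute gap is about 7.1 percentage points, while the relative gap is about 11.3% of the observed rate.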

Abstract

Introduction: Drug resistance in breast cancer (BC) can arise from synergistic genetic alterations. We previously developed a machine learning (ML) model that ranks treatment categories by 5-year predicted overall survival (POS) based on altered gene pairs. Here, we perform the first retrospective validation of this model using an independent BC patient cohort that received next-generation sequencing (NGS)-directed therapy.

Methods: Thirty metastatic BC patients who received NGS-directed therapy with FoundationOne Companion Diagnostic (CDx) profiling were analyzed (PMID: 34572791). For each patient, logit scores (representing the probability of death at 5 years) were generated across 8 treatment categories combined with altered gene pair combinations from the CDx gene set. Inverse logits, representing POS, were computed using a virtual clinical trial in which a decoder model received patient data and output probability distributions for each gene pair. Synthetic patient populations were then randomly generated from these probabilities and used to calculate POS rates for each treatment category per patient. Mean inverse logits were analyzed per patient using repeated-measures ANOVA, Friedman tests and pairwise Holm t-tests. The Shapiro-Wilk test and η² values assessed model normality and strength of association.

Results: Across all patients, the POS rate was 70.0%, compared to the actual rate of 62.9%. A one-sample z-test indicated a significant difference (Z=6.82, p<0.001), although the model’s predictions were directionally aligned with actual outcomes. Shapiro-Wilk testing indicated that 79.2% of treatment categories across all patients had W>0.9, suggesting an approximately normal distribution of logits. Repeated-measures ANOVA confirmed significant treatment-dependent differences in logits for all 30 patients (p<0.001, average η²=0.72). This indicates that 72% of the variance in POS is explained by the treatment category, after accounting for differences in gene pair alterations across patients. Radiation and tyrosine kinase inhibitors were the treatment categories that consistently ranked highest across patients, while PI3K inhibitors and DNA damage agents consistently ranked lowest. Pairwise Holm t-tests indicated that metabolic agents and receptor tyrosine kinase inhibitors consistently showed no significant difference in POS among patients who received either treatment in real life (p>0.05).

Conclusion: Our findings demonstrate strong internal consistency of an ML-based approach to predicting survival using gene pair and treatment data. Promising external validity is shown by POS within 8% of observed outcomes. Further calibration and validation of this model with subtype-stratified cohorts is warranted to improve validity, enhance clinical utility and maintain predictive stability.

Citation Format: Rishi Nair, Nicholas R. Mistry, Roy Khalife, Anthony M. Magliocco. Validation of a gene pair-based machine learning model for treatment prioritization in breast cancer [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2026; Part 1 (Regular Abstracts); 2026 Apr 17-22; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2026;86(7 Suppl):Abstract nr 67.
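
The abstract reports pairwise Holm t-tests but does not describe the implementation. As a rough illustration of the Holm step-down adjustment itself, the following Python sketch adjusts a list of raw p-values. The function name `holm_adjust` and the input p-values are hypothetical and are not taken from the study.

```python
def holm_adjust(p_values: list[float]) -> list[float]:
    """Return Holm step-down adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by (m - k + 1)
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: adjusted values never decrease along the sorted order
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

# Hypothetical raw p-values from three pairwise comparisons
raw_p_values = [0.01, 0.04, 0.03]
adjusted_p_values = holm_adjust(raw_p_values)
print(adjusted_p_values)
```

A comparison is then declared significant when its adjusted p-value falls below the chosen threshold, for example 0.05.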
