Abstract

Introduction. Patient-derived xenograft (PDX) models preserve patient-specific tumor biology and clinical heterogeneity. However, selecting PDX models in the context of clinical and preclinical datasets is challenging because selection criteria vary and sample characteristics differ across systems and datasets. Here, we apply the RNA1-DA foundation model, trained on over 180,000 RNA-seq cancer samples from the Data4Cure Oncology Sample Universe, to a large cohort of more than 1,500 low-passage PDX models from Champions Oncology’s TumorGraft bank, enabling alignment of samples across clinical and preclinical domains and translational PDX model selection.

Methods. RNA1-DA was applied to 1,549 low-passage PDX RNA-seq samples from Champions Oncology’s TumorGraft bank to derive domain-adapted sample embeddings. The joint embedding of Champions samples with 130,313 public clinical and preclinical cancer RNA-seq samples was visualized with UMAP, and sample integration was quantified by K-nearest-neighbor (KNN) disease classification. Molecular subtypes were transferred from clinical to Champions samples by KNN and evaluated for concordance with biomarker status, driver gene mutations, and drug response measured by tumor growth inhibition (TGI).

Results. RNA1-DA-based Champions PDX sample embeddings were well integrated with clinical and preclinical RNA-seq samples in the Oncology Sample Universe, achieving 75% disease classification accuracy. Model-assigned cancer subtypes showed concordant molecular profiles across tumor and PDX samples, with breast cancer subtypes reproducing the expected ER/PR/HER2 marker status (p = 2e-22) and showing RNA marker expression comparable to that of clinical tumors. Furthermore, 1964/2270 (87%) of subtype-mutation and subtype-copy number alteration associations in TCGA samples (q < 0.01; across 14 cancer types) were recapitulated in Champions PDX samples, with strong correlation of variant differential prevalence across subtypes (Pearson r = 0.78), supporting accurate subtype transfer and model translatability. Additionally, a number of PDX-transferred subtypes exhibited significant (p < 0.05) associations with drug-induced in vivo response that align with known clinical drug response associations. Altogether, our results support the predictive and translational value of foundation model integration of PDX datasets for clinical efficacy signal discovery, patient stratification, and biomarker validation.

Citation Format: Edward O'Brien, Alex Moreau, Gervaise Henry, Gilad Silberberg, Tammer Farid, Roy Ronen, Janusz Dutkowski. Large-scale foundation model-based PDX model selection and cancer subtype assignment [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2026; Part 2 (Late-Breaking, Clinical Trial, and Invited Abstracts); 2026 Apr 17-22; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2026;86(8 Suppl): Abstract nr LB435.
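
Illustrative sketch. The KNN subtype transfer described in Methods can be illustrated with a minimal example: each PDX sample receives the majority label among its k nearest clinical samples in embedding space. The abstract does not state the distance metric, the value of k, or the embedding dimensionality, so the Euclidean distance, k = 3, the 2-D toy coordinates, the subtype names, and the function name `knn_transfer` below are all assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def knn_transfer(ref_embeddings, ref_labels, query_embeddings, k=3):
    """Give each query sample the majority label of its k nearest reference samples."""
    if k < 1 or k > len(ref_embeddings):
        raise ValueError("k must be between 1 and the number of reference samples")
    predicted = []
    for query in query_embeddings:
        # Euclidean distance from this query to every reference embedding (assumed metric)
        neighbors = sorted(
            (math.dist(query, ref), label)
            for ref, label in zip(ref_embeddings, ref_labels)
        )
        # Majority vote among the k closest reference samples
        votes = Counter(label for _, label in neighbors[:k])
        predicted.append(votes.most_common(1)[0][0])
    return predicted


# Hypothetical toy embeddings: clinical samples are the reference, PDX samples the query
clinical_embeddings = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
clinical_subtypes = ["Luminal", "Luminal", "Luminal", "Basal", "Basal", "Basal"]
pdx_embeddings = [(0.1, 0.1), (5.1, 5.0)]

pdx_subtypes = knn_transfer(clinical_embeddings, clinical_subtypes, pdx_embeddings, k=3)
print(pdx_subtypes)  # ['Luminal', 'Basal']
```

The same voting step, applied to disease labels instead of subtype labels and scored against known diagnoses, would give a classification accuracy of the kind reported in Results.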