Abstract

Introduction: Transfer learning considers distinct but related tasks defined over heterogeneous domains, such as patient or organoid data, and improves generalization and predictive performance through knowledge transfer between tasks. It can be especially advantageous in applications where training data is limited (e.g., small patient cohorts), where joint learning across domains can enable inference in otherwise underpowered datasets.

Methods: We present a novel Bayesian transfer learning framework that supports multi-task and multi-modal learning across scales, from bulk to single-cell resolution. Our approach is generative: it learns a latent-space representation within each domain, simultaneously across multiple domains, using a feature-wise prior (e.g., over genes, drugs, or cellular programs) to model complex non-linear relationships. Our model can be pre-trained on an unlimited number of patient cohorts or new approach methodology (NAM) datasets from diverse bulk or single-cell assay platforms to make predictions in previously unseen samples.

Results: We apply our method to predict drug response and identify gene signatures for therapy stratification in acute myeloid leukemia (AML). We benchmark our model’s performance in a battery of experiments and compare it to five existing approaches. By integrating disjoint large-scale patient cohorts, we enable robust statistical inference in an otherwise underpowered dataset (N=29). Our model successfully transferred information even from cohorts whose molecularly characterized samples lacked matched drug response data, improving predictions in other cohorts with statistical significance and meaningful effect sizes. Our approach was especially effective in modeling multi-kinase inhibitors, where our feature-wise priors captured multi-target interactions.

Conclusion: Our novel approach enables joint learning across an unlimited number of patient cohorts and other multi-omic and functional data domains and scales. In AML, we achieve significant improvements in drug response prediction accuracy and gene signature identification, and we enable robust statistical inference in very small patient cohorts. Our framework is scalable, interpretable, and adaptable across target phenotypes, offering a robust solution for a wide range of heterogeneous problems.

Citation Format: Dharani Thirumalaisamy, Evan F. Lind, Elie Traer, Jeffrey W. Tyner, Mehmet Gönen, Olga Nikolova. Large-scale patient cohorts integration via a novel Bayesian transfer learning framework identifies robust drug response signatures in AML [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2026; Part 1 (Regular Abstracts); 2026 Apr 17-22; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2026;86(7 Suppl):Abstract nr 6908.