We developed a machine learning pipeline to help predict the in vivo efficacy of T cell receptors (TCRs). We examined the series of 11 class II MHC-restricted TCRs in a murine cancer model reported by Wolf et al., 2024, Sci. Immunol., https://doi.org/10.1126/sciimmunol.adp6529, which are reactive to an immunodominant antigen, mL9, presented by the MHC molecule I-Ek. When engineered into T cells, some of the TCRs induced remissions and reduced tumor volumes, while others were ineffective or had intermediate effects. To train our model, we used structural predictions from AlphaFold3 and data from Gaussian accelerated molecular dynamics (GaMD) simulations, an enhanced sampling technique. We then extracted 18 features capturing pairwise contact patterns, correlation networks, and hydrogen bonds (H-bonds) between all amino acids in the TCR α and β chains, the MHC α and β chains, and the peptide antigen. Using Shapley additive explanations (SHAP) analysis, we trimmed the original 18 features to the six that contributed most (MHCα-peptide contact, TCRα-peptide correlation, TCRβ-peptide correlation, MHCα-peptide correlation, MHCβ-peptide correlation, and TCRα-MHC H-bonds), which were then used to train our model. The model was trained using eXtreme gradient boosting classifiers (XGBClassifiers), with hyperparameters tuned via successive halving. It was subsequently validated via 11-fold cross-validation, in which each of the 11 TCRs was held out once as its own fold. Our model outputs class probabilities, estimating the likelihood that each TCR is effective, intermediate, or ineffective. To assess generalizability, we predicted the efficacy of five new TCRs that are currently being tested experimentally. This work aims to provide a computational framework for prioritizing TCRs for adoptive T cell therapy, enabling in silico pre-screening of novel TCR sequences prior to costly experimental validation.
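The validation scheme described above, in which each of the 11 TCRs is held out once and the model returns a probability for each of the three efficacy classes, can be illustrated with a minimal sketch. This is not the pipeline itself: a nearest-centroid classifier with a softmax over distances stands in for the XGBClassifier so that the example needs only the Python standard library, and the feature values, the 4/4/3 class split, and all function names are hypothetical placeholders rather than data or code from the study.

```python
import math
import random

CLASSES = ["effective", "intermediate", "ineffective"]
N_FEATURES = 6  # matches the six SHAP-selected features named in the abstract


def make_synthetic_tcrs(seed=0):
    """Return 11 (features, label) pairs of purely synthetic placeholder data."""
    rng = random.Random(seed)
    counts = {"effective": 4, "intermediate": 4, "ineffective": 3}  # hypothetical split
    centers = {"effective": 1.0, "intermediate": 0.0, "ineffective": -1.0}
    data = []
    for label in CLASSES:
        for _ in range(counts[label]):
            features = [rng.gauss(centers[label], 0.3) for _ in range(N_FEATURES)]
            data.append((features, label))
    return data


def fit_centroids(train):
    """Stand-in 'training': mean feature vector per class present in the fold."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in sums[c]] for c in sums}


def predict_proba(centroids, features):
    """Softmax over negative squared distances; classes absent from training get 0."""
    scores = {
        c: -sum((a - b) ** 2 for a, b in zip(features, mu))
        for c, mu in centroids.items()
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: exps.get(c, 0.0) / total for c in CLASSES}


def leave_one_tcr_out(data):
    """Hold out each TCR once, fit on the other ten, and score the held-out TCR."""
    results = []
    for i, (features, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        results.append((label, predict_proba(fit_centroids(train), features)))
    return results


if __name__ == "__main__":
    for true_label, proba in leave_one_tcr_out(make_synthetic_tcrs()):
        summary = ", ".join(f"{c}={p:.2f}" for c, p in proba.items())
        print(f"true={true_label:<12} predicted: {summary}")
```

With only 11 samples, every fold trains on ten TCRs and tests on one, so each per-TCR probability vector comes from a model that never saw that TCR. The same loop structure would apply if the stand-in classifier were replaced by a gradient boosting model.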
Lee et al. (Sun,) studied this question.