What question did this study set out to answer?

This research aims to enhance personalized cancer vaccine development by improving epitope selection through knowledge mining.

April 5, 2026

Abstract 6698: EpitopeMiner: Scalable knowledge mining for evidence-driven personalized cancer vaccine design

Key Points

This research aims to enhance personalized cancer vaccine development by improving epitope selection through knowledge mining.
Predicted 25,966 tumor-specific epitopes from whole-genome sequencing of tumor-PBMC pairs.
Implemented a screening module to identify exact and partial matches from an in-house database.
Utilized OpenAI’s LLM with a RAG database to retrieve relevant literature and provide evidence-driven responses.
Found 7 exact matches among the predicted epitopes; 17.6% of them had at least 6 partial matches.
Achieved a processing time of 0.97 seconds per epitope.
Outperformed ChatGPT and Gemini in relevant information retrieval, achieving 100% evidence coverage.
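
The screening step listed above (exact and partial matching against a reference database) can be illustrated with a small k-mer index. This is a minimal sketch, not EpitopeMiner's implementation: the abstract does not say how partial matches are computed, so the sketch assumes a partial match means sharing a contiguous stretch of at least 7 amino acids. The function names (`kmers`, `build_index`, `screen`) and the toy reference set are hypothetical.

```python
from collections import defaultdict

MIN_PARTIAL = 7  # threshold from the abstract: partial matches of >= 7 amino acids


def kmers(seq, k=MIN_PARTIAL):
    """Return the set of all length-k substrings of a peptide sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(reference, k=MIN_PARTIAL):
    """Map every k-mer to the set of reference epitopes that contain it."""
    index = defaultdict(set)
    for ref in reference:
        for kmer in kmers(ref, k):
            index[kmer].add(ref)
    return index


def screen(query, reference, index, k=MIN_PARTIAL):
    """Report whether the query has an exact hit and list its partial hits.

    Assumption: a partial hit is any other reference epitope that shares
    at least one contiguous k-mer with the query.
    """
    partial = set()
    for kmer in kmers(query, k):
        partial |= index.get(kmer, set())
    partial.discard(query)  # an identical sequence counts as exact, not partial
    return {"exact": query in reference, "partial": sorted(partial)}


# Toy reference set: two of the benchmark peptides plus a dummy sequence.
reference = {"ITDFGRAKL", "TDFGRAKLL", "AAAAAAAAA"}
index = build_index(reference)
result = screen("KITDFGRAK", reference, index)
print(result)
```

Indexing the reference once keeps each query lookup proportional to the number of k-mers in the query rather than to the size of the database, which is one plausible way to reach sub-second screening per epitope against millions of reference sequences.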

Abstract

Background: Personalized cancer vaccines hold great promise by eliciting tumor-specific immune responses [1-3]. A key challenge is identifying the right targets: immunogenic protein sequences, or epitopes, presented on tumor cells. While computational pipelines can predict epitope candidates from tumor sequencing, experimental validation is costly and slow. Leveraging literature and database knowledge could bridge this gap by enabling evidence-driven selection of high-confidence targets, but this approach is constrained by information fragmented across journals and immunology databases [4-5]. We introduce EpitopeMiner, which integrates sequence-based candidate screening with evidence-driven knowledge retrieval for epitope prioritization.

Methods: A total of 25,966 tumor-specific epitopes were predicted from whole-genome sequencing of tumor-PBMC pairs from nine patients (including lung cancer, sarcoma, NKTL, and DLBCL) using a standard workflow: HLA typing (OptiType), variant calling (Strelka with wANNOVAR), MHC binding prediction (NetMHCpan), and filtering for RNA-supported, protein-altering variants. EpitopeMiner combines OpenAI’s Large Language Model (LLM) with an in-house Retrieval Augmented Generation (RAG) database comprising (a) 78,461 full-text research articles from PMC, PLOS One, and Europe PMC and (b) ∼2.6 million unique epitopes from the IEDB, dbPepNeo, SystemMHC, TANTIGEN, and caAtlas databases. EpitopeMiner includes: (i) a screening module that processes an epitope list, detecting exact matches or partial matches of ≥ 7 amino acids in the in-house database, and (ii) a reporting module that analyzes each top-ranked hit, defined by highest sequence similarity and evidence density, to generate an LLM response covering 28 immunology keywords with citations.

Results: Among the 25,966 epitopes predicted from the nine patients, EpitopeMiner found 7 exact matches, and 17.6% had ≥ 6 partial matches; mean processing time per epitope was 0.97 seconds. In benchmarking with 3 lung cancer driver-gene epitopes (KITDFGRAK, ITDFGRAKL, TDFGRAKLL), EpitopeMiner outperformed ChatGPT and Gemini, returning the largest amount of relevant immunological information. Summarized as (total responses, % with evidence) for the three epitopes respectively, EpitopeMiner returned (18, 100%), (8, 100%), and (2, 100%), compared with ChatGPT’s (10, 70%), (4, 50%), (6, 33%) and Gemini’s (7, 0%), (1, 0%), (1, 0%). In addition, EpitopeMiner retrieved ≥10 partial matches for each epitope, whereas ChatGPT retrieved a total of 3 and Gemini none.

Conclusion: We built EpitopeMiner, a computational framework for sustainable literature and database curation. In a 9-patient dataset, EpitopeMiner retrieved experimentally and clinically validated epitope evidence at a scale and speed infeasible with manual analysis. EpitopeMiner outperformed general-purpose LLMs with cited responses, achieving 100% evidence coverage on benchmarks, reducing hallucinations and improving reliability.

Citation Format: Agamjyot Singh Chadha, Isaac Jiasheng Cheong, Marcia Zhang, Wei Kit Tan, Wei Lin Tang, Jing Quan Lim, Solomonraj Wilson, Choon Kiat Ong, Bernett Lee, Chwee Ming Lim, Olaf Rotzschke, Mai Chan Lau. EpitopeMiner: Scalable knowledge mining for evidence-driven personalized cancer vaccine design [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2026; Part 1 (Regular Abstracts); 2026 Apr 17-22; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2026;86(7 Suppl):Abstract nr 6698.
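
The benchmark figures in the Results are given per epitope as (total responses, % with evidence). A short tally makes the comparison easier to read across all three epitopes. The pooled totals below are not reported in the abstract; they are derived here from the published pairs, and the evidence-backed counts are approximate because the source percentages are rounded. The names `benchmark` and `summarize` are illustrative only.

```python
# Pairs copied from the abstract: (total responses, % with evidence),
# one pair per benchmark epitope (KITDFGRAK, ITDFGRAKL, TDFGRAKLL).
benchmark = {
    "EpitopeMiner": [(18, 100), (8, 100), (2, 100)],
    "ChatGPT": [(10, 70), (4, 50), (6, 33)],
    "Gemini": [(7, 0), (1, 0), (1, 0)],
}


def summarize(rows):
    """Return (total responses, approximate evidence-backed responses)."""
    total = sum(n for n, _ in rows)
    # Rounded to whole responses because the reported percentages are rounded.
    backed = sum(round(n * pct / 100) for n, pct in rows)
    return total, backed


for tool, rows in benchmark.items():
    total, backed = summarize(rows)
    print(f"{tool}: {backed} of {total} responses carried evidence")
```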
