Abstract

Introduction: Genomic profiling of circulating tumor DNA (ctDNA) through liquid biopsies has become an important diagnostic method in clinical oncology. However, detection of variants arising from clonal hematopoiesis (CH) is a major confounder that limits the clinical utility of liquid biopsies. Strategies that reduce biological noise from CH in plasma next-generation sequencing (NGS) include deep sequencing of matched white blood cell (WBC) DNA and/or tumor tissue sequencing. While these methods effectively distinguish most CH variants, the need for additional biospecimens and sequencing raises costs and limits feasibility.

Methods: Using a training cohort of 426 variants identified by ctDNA NGS in 225 patients with stage I-IV solid tumors, we developed plasmaCHORD, a machine learning model that integrates fragmentomic, variant-level, and patient-level features to distinguish tumor from CH origin for mutations detected by fixed gene panel hybrid capture NGS. Model performance was assessed against the reference origin of each plasma variant, determined from matched WBC and tumor NGS. After locking the model parameters, we applied plasmaCHORD to an independent validation cohort of 1,412 plasma variants detected in 114 patients with metastatic cancers, as well as to cell-free DNA (cfDNA) NGS from patients enrolled in a prospective liquid biopsy-informed clinical trial (NCT05585684).

Results: plasmaCHORD predicted tumor versus CH origin in the training set with high accuracy (cross-validated AUC = 0.94), outperforming individual features such as variant allele frequency and canonical CH gene status. Model performance remained robust when restricted to variants supported by 3-5 mutant reads (AUC = 0.84). The model was locked for evaluation with a score cutoff of 0.5 for distinguishing tumor-origin from CH-origin variants. In the independent validation cohort, the locked model maintained similar overall performance (AUC = 0.9), with a sensitivity of 82%, specificity of 80.3%, and accuracy of 80.2%.
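As an illustration of how sensitivity, specificity, and accuracy follow from a fixed score cutoff, the sketch below evaluates a set of per-variant scores against reference labels. It is not the plasmaCHORD implementation: the function name `evaluate_at_cutoff`, the assumption that a score at or above 0.5 denotes tumor origin, the choice of tumor origin as the positive class, and the toy scores are all hypothetical.

```python
from dataclasses import dataclass

CUTOFF = 0.5  # cutoff value stated in the abstract


@dataclass
class Metrics:
    sensitivity: float
    specificity: float
    accuracy: float


def evaluate_at_cutoff(scores, is_tumor, cutoff=CUTOFF):
    """Compute cutoff-based metrics for a binary variant-origin call.

    Assumptions (not stated in the abstract): a score >= cutoff is called
    tumor-origin, and tumor origin is treated as the positive class.
    """
    tp = fp = tn = fn = 0
    for score, truth in zip(scores, is_tumor):
        called_tumor = score >= cutoff
        if called_tumor and truth:
            tp += 1
        elif called_tumor and not truth:
            fp += 1
        elif not called_tumor and truth:
            fn += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one tumor and one CH reference variant")
    total = tp + fp + tn + fn
    return Metrics(
        sensitivity=tp / (tp + fn),  # tumor variants called tumor
        specificity=tn / (tn + fp),  # CH variants called CH
        accuracy=(tp + tn) / total,
    )


# Toy, made-up scores and reference origins (True = tumor, False = CH).
toy_scores = [0.91, 0.72, 0.35, 0.10, 0.55, 0.48]
toy_truth = [True, True, True, False, False, False]
m = evaluate_at_cutoff(toy_scores, toy_truth)
print(m)
```

In practice the reference labels would come from matched WBC and tumor sequencing, as described in the Methods; the toy values here carry no relation to the reported results.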
Our approach was highly reliable in classifying variant origin in clinically actionable genes not canonically associated with CH, including AKT1, ATM, BRCA1, BRCA2, and EGFR, and in adjudicating the cellular origin of TP53 mutations, which are encountered in both solid and hematologic malignancies. Performance was consistent across cancer types, sequencing platforms, mutation classes, and a wide range of allele fractions. When applied to clinically challenging cases in a precision oncology clinical trial, plasmaCHORD accurately determined variant origin, preventing mismatches with genotype-targeted therapies.

Conclusions: plasmaCHORD, a multi-feature machine learning classifier, can significantly enhance the ability to identify bona fide tumor variants in routine plasma-only NGS, addressing a critical need in implementing liquid biopsy-guided therapy by minimizing misinterpretation caused by CH.

Citation Format: Daniel J. Rabizadeh, Jenna VanLiere Canzoniero, Ilias Ziakas, Jaime Wehr, Archana Balan, Amna Jamali, Blair V. Landon, Susan Combs Scott, Gavin Pereira, Vincent K. Lam, Christine L. Hann, Christine M. Lovly, Jessica Tao, Patrick M. Forde, Joseph C. Murray, Mark Sausen, Gerrit A. Meijer, Geraldine Vink, Remond J. A. Fijneman, Victor E. Velculescu, Jillian Ayn Phallen, Robert Scharpf, Valsamo Anagnostou. PlasmaCHORD- A machine learning method for identifying clonal hematopoiesis variants in liquid biopsies [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2026; Part 1 (Regular Abstracts); 2026 Apr 17-22; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2026;86(7 Suppl):Abstract nr 95.