Abstract
Clear-cell renal-cell carcinoma (ccRCC) accounts for 70-80% of kidney cancers, with increasing incidence driven by improved imaging, aging populations, and rising obesity rates. No routine diagnostic test currently predicts which small renal masses will progress to large, aggressive tumors, leaving a fundamental clinical question unresolved: do lethal tumors arise with inherently aggressive molecular characteristics, or do they gradually evolve from indolent precursors? To investigate size-specific evolutionary trajectories, we integrated cancer effect size quantification, mutational signature analysis, phylogenetic reconstruction, and machine learning classification. We calculated cancer effect sizes stratified by tumor size (≤3 cm vs >3 cm) using single-nucleotide variant data from TCGA-KIRC (n=339) combined with five additional studies (total n=656). We also used multi-region tumor sequencing data from 20 patients to investigate the evolutionary history of these tumors, performing Bayesian phylogenetic reconstruction to generate chronograms. To characterize ancestral tumor states, we developed a novel binomial sampling simulation, parameterized by variant allele frequencies, that generates mid-branch sequence states representing likely ancestral tumor configurations. These states were then classified as "small" or "large" by a neural network trained on mutational matrices incorporating recurrent cancer effect sizes and de novo mutational signature weights. Our neural network classified ccRCC size with 86.48% accuracy and an F1 score of 0.86. Among the patient subset with large tumors, half showed small-like ancestral midpoints while half showed large-like midpoints; in contrast, small tumors were overwhelmingly classified as having small ancestral states, supporting model reliability. We also found that VHL mutations clustering around the binding pocket exhibited the highest cancer effects, consistent with observations from the literature.
Interestingly, although feature importance analysis identified approximately 1000 variants contributing discriminatory power, only one VHL mutation (a truncation mutation) was among the 100 most informative variants despite VHL's overall importance in ccRCC; this mutation may be especially influential on evolutionary trajectory. These results suggest that ccRCC tumors follow heterogeneous evolutionary paths: some large tumors pass through small-like feature states while others do not. This framework demonstrates the utility of evolutionary embeddings for machine learning classifiers and offers a generalizable approach for investigating ancestral tumor characteristics.
Citation Format: Nic Fisk, Christopher Cross, Brian M. Shuch, Jeffrey Peter Townsend. Phylogenetic and machine learning analyses of simulated ancestral sequences reveals heterogeneous ccRCC evolution [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2026; Part 1 (Regular Abstracts); 2026 Apr 17-22; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2026;86(7 Suppl):Abstract nr 707.
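The abstract does not specify how the binomial sampling simulation works, so the following is only a minimal sketch of one way such a step might look, not the authors' implementation. It assumes that each variant on a branch is independently called present at the branch midpoint with probability equal to its variant allele frequency (a single Binomial(1, VAF) draw), and that many replicate states are generated per branch. The function names, the variant labels, and the VAF values are all hypothetical and chosen for illustration.

```python
import random


def sample_midbranch_states(vafs, n_replicates=1000, seed=0):
    """Simulate mid-branch presence/absence states from VAFs.

    ASSUMPTION (not from the abstract): each variant is present at the
    branch midpoint with probability equal to its VAF, independently of
    the other variants.
    """
    for name, vaf in vafs.items():
        if not 0.0 <= vaf <= 1.0:
            raise ValueError(f"VAF for {name} must lie in [0, 1], got {vaf}")

    rng = random.Random(seed)  # seeded for reproducibility
    variants = sorted(vafs)
    states = []
    for _ in range(n_replicates):
        # One Bernoulli (Binomial with n=1) draw per variant:
        # 1 means the variant is present in this simulated state.
        state = {v: int(rng.random() < vafs[v]) for v in variants}
        states.append(state)
    return states


def presence_frequency(states):
    """Fraction of simulated states in which each variant is present."""
    n = len(states)
    return {v: sum(s[v] for s in states) / n for v in states[0]}


# Hypothetical VAFs for variants on one branch (illustrative values only).
example_vafs = {
    "variant_A": 0.45,
    "variant_B": 0.10,
    "variant_C": 0.0,
    "variant_D": 1.0,
}

states = sample_midbranch_states(example_vafs, n_replicates=2000, seed=42)
freqs = presence_frequency(states)
```

Each simulated state is a binary vector over variants. In a pipeline like the one described, such vectors would presumably be converted into the same feature representation used to train the classifier before being labeled "small" or "large"; that conversion is not shown here because the abstract does not describe it.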
Nic Fisk
Christopher N. Cross
Brian M. Shuch
Cancer Research
Yale University
University of Rhode Island
University of New Haven
www.synapsesocial.com/papers/69d0b028659487ece0fa63e8 — DOI: https://doi.org/10.1158/1538-7445.am2026-707