Extrachromosomal circular DNAs (eccDNAs) are closed circular DNA molecules that are widespread in eukaryotic cells and have emerging roles in gene regulation and tumor progression. Experimental assays for detecting them remain costly and incomplete, underscoring the need for computational approaches. To address this need, a deep learning framework termed DeepECC has been developed to handle the heterogeneity of eccDNAs and their complex biogenesis. Using a two-stage training strategy, DeepECC models the local sequence context flanking both the start and end breakpoints, thereby capturing mechanistically informative features that are often overlooked when analyses focus solely on eccDNA body sequences. Applied to datasets from multiple species (human, mouse, and chicken), DeepECC robustly captures conserved breakpoint features, with a marked preference for GC-rich and transcriptionally active regions. Genome-wide scanning shows that human cancer eccDNAs are non-uniformly distributed and enriched in enhancers, expression quantitative trait loci, and CTCF binding sites, suggesting regulatory functions in tumor progression. Motif analysis further implicates ribosomal activity, translational regulation, and the DNA damage response. In addition, the genome-wide eccDNA predictions are integrated into the UCSC Genome Browser, enabling convenient querying and visualization of cancer-related eccDNAs associated with specific genes or genomic regions and facilitating functional interpretation in experimental research. Collectively, DeepECC provides a generalizable framework for systematic eccDNA discovery and offers insights into the functional significance of eccDNAs in cancer.
Wang et al. studied this question.
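To make the idea of breakpoint-flanking context concrete, the following minimal Python sketch extracts a window around each of the two breakpoints of a candidate eccDNA and reports its GC fraction. It is an illustration only and not DeepECC's implementation: the function names `breakpoint_windows` and `gc_fraction`, the 0-based half-open coordinate convention, and the `flank` size are all assumptions introduced here.

```python
from typing import Dict


def gc_fraction(seq: str) -> float:
    """Fraction of G or C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def breakpoint_windows(chrom_seq: str, start: int, end: int,
                       flank: int = 4) -> Dict[str, str]:
    """Return the sequence within `flank` bases on either side of each breakpoint.

    Assumption: the eccDNA occupies the 0-based, half-open interval
    [start, end) on chrom_seq. Windows are clipped at the sequence ends.
    """
    if not 0 <= start < end <= len(chrom_seq):
        raise ValueError("expected 0 <= start < end <= len(chrom_seq)")

    def window(pos: int) -> str:
        lo = max(0, pos - flank)
        hi = min(len(chrom_seq), pos + flank)
        return chrom_seq[lo:hi]

    return {"start": window(start), "end": window(end)}


# Toy sequence: an AT-rich region, a GC-rich candidate eccDNA, an AT-rich region.
chrom = "ATATATAT" + "GCGCGCGC" + "ATATATAT"
windows = breakpoint_windows(chrom, start=8, end=16, flank=4)
for name, seq in windows.items():
    print(name, seq, gc_fraction(seq))
```

In a learning setting, windows like these (rather than the full eccDNA body) would be encoded and passed to a model; the window size and encoding are design choices that the abstract does not specify.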