G-quadruplexes (G4s) are noncanonical DNA secondary structures formed by runs of guanines (stems) connected by other nucleotides (loops). These structures are enriched at regulatory regions such as promoters, CpG islands, untranslated regions (UTRs), enhancers, and replication origins, where they play key roles in transcription and replication. Although prior studies have shown that G4s have higher mutation rates than canonical DNA, little is known about the substitution patterns and selection acting specifically on G4 stems and loops. In this study, we used Telomere-to-Telomere (T2T) genome assemblies from human and two non-human great apes (chimpanzee and Bornean orangutan) to analyze substitution spectra and selective constraints within G4s, focusing on differences between stems and loops. We observed that fixed nucleotide substitutions leading to the gain or loss of G4 structures are more frequently located at stems, whereas those in G4s conserved across species are more often found at loops. In contrast, single-nucleotide polymorphisms occurred at similar frequencies at stems and loops. To evaluate selection, we used two approaches: we computed the ratio of substitution to polymorphism frequencies at stems vs. loops, and we performed phylogenetic modeling using PhyloFit. Both approaches consistently indicated that stems of shared G4s experience stronger purifying selection than loops, particularly at promoters, enhancers, and UTRs. Our results provide novel insights into the sequence variation and selection of G4s, informing our understanding of their contributions to genome evolution and function.
Zhang et al. (Sat,) studied this question.
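
The first of the two approaches, comparing the ratio of substitution to polymorphism frequencies between stems and loops, can be illustrated with a minimal Python sketch. The abstract does not give the exact statistic, so the class `SiteCounts`, the functions `sub_to_poly_ratio` and `stem_vs_loop`, and all counts below are hypothetical placeholders rather than the authors' implementation or data. Under these assumptions, a stem-to-loop value below 1 would be consistent with stronger purifying selection on stems, because fewer variants reach fixation there relative to the polymorphism observed.

```python
from dataclasses import dataclass


@dataclass
class SiteCounts:
    """Counts for one site class (hypothetical container, not from the paper)."""
    substitutions: int  # fixed differences between species
    polymorphisms: int  # variants segregating within a species
    sites: int          # total number of sites in this class


def sub_to_poly_ratio(counts: SiteCounts) -> float:
    """Substitution frequency divided by polymorphism frequency."""
    if counts.sites <= 0:
        raise ValueError("site count must be positive")
    if counts.polymorphisms == 0:
        raise ValueError("ratio is undefined without polymorphisms")
    sub_freq = counts.substitutions / counts.sites
    poly_freq = counts.polymorphisms / counts.sites
    return sub_freq / poly_freq


def stem_vs_loop(stems: SiteCounts, loops: SiteCounts) -> float:
    """Ratio of the stem statistic to the loop statistic."""
    return sub_to_poly_ratio(stems) / sub_to_poly_ratio(loops)


# Made-up counts, for illustration only.
stems = SiteCounts(substitutions=120, polymorphisms=400, sites=10_000)
loops = SiteCounts(substitutions=300, polymorphisms=500, sites=12_000)

relative = stem_vs_loop(stems, loops)
print(f"stem/loop substitution-to-polymorphism ratio: {relative:.2f}")
```

Normalizing by the number of sites in each class keeps the two frequencies comparable when stems and loops differ in total length. A real analysis would also need a significance test and stratification by functional region (promoters, enhancers, UTRs), which this sketch omits.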