Background: Chloroplast genomes (plastomes) are widely used to resolve phylogenetic relationships and to locate informative variation for marker development, particularly in large, taxonomically complex plant groups such as Poaceae. Methods: We assembled a comparative dataset of 175 complete RefSeq plastomes representing broad phylogenetic and ecological diversity within grasses. Whole-plastome sequences were aligned (356,558 columns) and filtered to columns with ≥50% occupancy (138,148 sites). We then estimated genome-wide nucleotide diversity (π) and Shannon entropy, measured gene-level variability across core coding loci, inferred phylogenomic structure using maximum likelihood, and tested core protein-coding genes for episodic diversifying selection using BUSTED with synonymous rate variation and FDR correction. Results: The filtered alignment contained 84,164 variable sites, indicating extensive phylogenetic signal, and showed moderate genome-wide divergence (mean π = 0.0507) that was distributed heterogeneously across the plastome. Sliding-window profiles identified discrete hypervariable regions, with prominent peaks at ~13.6–15.9 kb and ~105.6–107.2 kb in the reference coordinate system. Diversity was strongly compartmentalized: π and entropy were elevated in the large and small single-copy regions (LSC and SSC) but markedly reduced across both inverted repeats (IRb and IRa), consistent with repeat-mediated homogenization and strong functional constraint on IR-enriched loci. Gene-level mapping showed that the highest coding-sequence variability was concentrated in photosystem II genes, especially psbI, psbD, and psbK, whereas genes involved in gene expression, such as matK and rps16, were comparatively conserved. Whole-plastome phylogenomics recovered a strongly supported backbone consistent with accepted Poaceae radiations, resolving early-diverging lineages and the major BOP and PACMAD clades with coherent subfamily-level groupings.
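The diversity statistics named above (occupancy filtering, per-site π, Shannon entropy, and sliding-window averaging) can be illustrated with a minimal sketch. It assumes that occupancy means the fraction of sequences carrying an unambiguous base (A, C, G, or T) in a column, that per-site π is the probability that two sequences drawn without replacement differ, and that entropy is measured in bits; the abstract does not state the exact definitions used, and the function names and toy alignment below are hypothetical.

```python
from collections import Counter
from math import log2

BASES = "ACGT"

def column_counts(alignment):
    """Yield base counts per alignment column, ignoring gaps and ambiguity codes."""
    for column in zip(*alignment):
        yield Counter(c.upper() for c in column if c.upper() in BASES)

def occupancy_filter(alignment, min_occupancy=0.5):
    """Keep columns in which at least min_occupancy of sequences have A/C/G/T."""
    n_seqs = len(alignment)
    return [c for c in column_counts(alignment)
            if sum(c.values()) / n_seqs >= min_occupancy]

def site_pi(counts):
    """Per-site nucleotide diversity: chance that two sampled sequences differ."""
    n = sum(counts.values())
    if n < 2:
        return 0.0
    identical_pairs = sum(k * (k - 1) for k in counts.values())
    return 1.0 - identical_pairs / (n * (n - 1))

def site_entropy(counts):
    """Shannon entropy of the base frequencies in one column, in bits."""
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return sum((k / n) * log2(n / k) for k in counts.values())

def sliding_mean(values, window, step):
    """Mean of values in windows of fixed width, advanced by step columns."""
    return [sum(values[s:s + window]) / window
            for s in range(0, len(values) - window + 1, step)]

# Hypothetical toy alignment (4 sequences, 6 columns); not data from the study.
toy_alignment = [
    "ACGT-A",
    "ACGA-A",
    "ACTT-C",
    "AC-TTC",
]

kept = occupancy_filter(toy_alignment, min_occupancy=0.5)
pi_per_site = [site_pi(c) for c in kept]
entropy_per_site = [site_entropy(c) for c in kept]

print("columns kept:", len(kept))
print("mean pi:", sum(pi_per_site) / len(pi_per_site))
print("windowed pi:", sliding_mean(pi_per_site, window=3, step=1))
```

In this toy case the fifth column is dropped because only one of four sequences carries a base there. A real analysis would read the alignment from a FASTA file and use window sizes of hundreds of columns rather than three.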
Codon-based tests detected significant evidence of gene-wide episodic diversifying selection in 13 plastid genes after FDR correction, most notably rpoC2 and rpoA, with additional signals in ATP synthase, PSI/PSII, cytochrome b₆f, NDH, and envelope-associated genes. Conclusions: Dense plastome sampling across Poaceae reveals a compartmentalized evolutionary landscape in which the IRs form a conserved, low-variation backbone, whereas the LSC and SSC contain discrete hypervariable hotspots of high value for marker development. Episodic selection signals in transcriptional and photosynthetic genes suggest lineage-specific adaptive episodes superimposed on pervasive purifying selection. Together, these results provide a phylogenomic framework, a genome-scale diversity atlas, and a prioritized set of candidate loci for barcoding and evolutionary hypothesis testing in grasses.
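The multiple-testing step behind the per-gene selection results can be sketched as follows. The abstract says only "FDR correction"; the Benjamini-Hochberg procedure used here is an assumption, as is the 0.05 threshold, and the gene labels and p-values are hypothetical placeholders rather than results from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each adjusted
    # value is capped by the adjusted values of all larger p-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-gene p-values from a likelihood-ratio test (placeholders).
example_p = {"geneA": 0.001, "geneB": 0.008, "geneC": 0.039,
             "geneD": 0.041, "geneE": 0.6}

q_values = dict(zip(example_p, benjamini_hochberg(list(example_p.values()))))
significant = [gene for gene, q in q_values.items() if q <= 0.05]

print(q_values)
print("significant at q <= 0.05:", significant)
```

Here geneC and geneD have raw p-values below 0.05 but adjusted values just above it, which shows why the number of genes reported after correction can be smaller than the number with nominally significant tests.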
Yasin Kaymaz studied this question.