The growing volume of genomic data is driving changes in the selection of phylogenetic markers and analysis strategies. Databases enable the extraction of established markers, such as those used in single-locus and multilocus sequence typing (MLST), but these markers are often limited by the number of informative sites or by their availability, owing to incomplete source data sets or reductive evolution in bacteria such as the Mollicutes. Genome-wide analyses such as average nucleotide identity (ANI) often overcome these problems but depend on the alignment percentage. Complementary analyses help validate results and address the limitations of primary approaches. However, it remains unclear how genome-wide compositional signals and reduced core gene sets affect phylogenomic resolution across a large and taxonomically diverse dataset of complete Mollicutes genomes. We therefore applied an advanced MLST approach based on single-copy orthologs (SCOs), alongside codon usage analysis. The reliability and impact of these approaches were first assessed in the Acholeplasmatales by comparing 16S rRNA gene, ANI, SCO, and codon usage analyses. Codon usage analysis revealed lineage-associated compositional signatures across the 52 strains that were broadly consistent with current genus and subgroup assignments, whereas ANI and 16S rRNA gene analyses identified species at ≥96.5% and ≥97% identity, respectively. Of these approaches, the SCO analysis matched the current taxonomy most closely, which supported extending the approach to the Mollicutes. Applied to 807 Mollicutes strains, the analysis revealed 16 shared SCOs. Concatenation of this core set significantly enhanced phylogenomic resolution, providing a robust framework for reconstructing evolutionary relationships within the Mollicutes.
Ilic et al. studied this question.
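The SCO step described above (finding orthologs present exactly once in every strain, then concatenating them) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the input layout (a mapping from orthogroup to strain and gene pairs, plus pre-aligned sequences per orthogroup), and the toy data are all hypothetical and chosen only to show the logic.

```python
from collections import Counter


def shared_single_copy_orthologs(orthogroups, strains):
    """Return IDs of orthogroups found exactly once in every strain.

    orthogroups: dict of orthogroup ID -> list of (strain, gene_id) pairs
                 (hypothetical input layout).
    strains:     iterable of all strain names in the dataset.
    """
    strains = set(strains)
    shared = []
    for og_id, members in orthogroups.items():
        counts = Counter(strain for strain, _ in members)
        # Keep the group only if no strain is missing and none has a paralog.
        if set(counts) == strains and all(n == 1 for n in counts.values()):
            shared.append(og_id)
    return sorted(shared)


def concatenate_alignments(alignments, og_ids, strains):
    """Join per-orthogroup aligned sequences into one row per strain.

    alignments: dict of orthogroup ID -> dict of strain -> aligned sequence.
    """
    supermatrix = {strain: "" for strain in strains}
    for og_id in og_ids:
        aln = alignments[og_id]
        # All sequences within one alignment must have the same length.
        if len({len(seq) for seq in aln.values()}) != 1:
            raise ValueError(f"unequal alignment lengths in {og_id}")
        for strain in strains:
            supermatrix[strain] += aln[strain]
    return supermatrix


# Toy data (invented for illustration only).
strains = ["A", "B", "C"]
orthogroups = {
    "OG1": [("A", "a1"), ("B", "b1"), ("C", "c1")],
    "OG2": [("A", "a2"), ("A", "a3"), ("B", "b2"), ("C", "c2")],  # paralog in A
    "OG3": [("A", "a4"), ("B", "b3")],                            # missing in C
    "OG4": [("A", "a5"), ("B", "b4"), ("C", "c3")],
}
alignments = {
    "OG1": {"A": "ATG-", "B": "ATGC", "C": "ATGA"},
    "OG4": {"A": "GG", "B": "GA", "C": "G-"},
}

core = shared_single_copy_orthologs(orthogroups, strains)
supermatrix = concatenate_alignments(alignments, core, strains)
```

Because every strain must carry each gene exactly once, the shared set shrinks as strains are added, particularly for genome-reduced lineages. Concatenating the surviving genes pools their informative sites into a single alignment for tree inference.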