Natural oligosaccharides and aminoglycosides are important sources of new drug candidates, especially in the development of antibiotics. Historically, discovering novel saccharides has been time-consuming and costly. The rapid expansion of high-throughput data, including genomic and mass spectrometry data sets, has greatly increased opportunities for natural saccharide discovery. Yet, because saccharide biosynthesis pathways are complex, no existing method can predict their structures with high precision. To address this, we introduce Seq2Saccharide, a tool designed to automate saccharide natural product discovery by integrating genomic and mass spectrometry data. To enhance accuracy, Seq2Saccharide predicts hundreds or thousands of putative structures for each gene cluster. The correct structure is then identified from these predictions using a mass spectral search. Benchmarks against saccharides in the MIBiG database show that Seq2Saccharide outperforms existing methods in predicting saccharide structures. Furthermore, mass spectrometry analysis indicates that the variable search module can correct mispredictions from genome mining. By searching genomic and mass spectrometry data of microbial strains, Seq2Saccharide correctly identified the biosynthetic gene cluster for the oligosaccharide trestatin B.
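The "predict many candidates, then select by mass spectral search" idea can be illustrated with a minimal sketch. This is not Seq2Saccharide's algorithm: it only filters candidates by precursor mass within a ppm tolerance, whereas the abstract describes a mass spectral search whose scoring is not specified here. The `Candidate` class, the `match_candidates` function, the candidate names, the masses, and the 10 ppm tolerance are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass

PROTON_MASS = 1.007276  # Da; assumes singly protonated [M+H]+ ions


@dataclass(frozen=True)
class Candidate:
    """A putative structure predicted for one gene cluster (hypothetical type)."""
    name: str
    monoisotopic_mass: float  # neutral monoisotopic mass in Da


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def match_candidates(candidates, observed_mz, tol_ppm=10.0):
    """Keep candidates whose [M+H]+ m/z lies within tol_ppm of the observed
    precursor m/z, sorted by absolute mass error (smallest first)."""
    hits = []
    for cand in candidates:
        theoretical_mz = cand.monoisotopic_mass + PROTON_MASS
        err = ppm_error(observed_mz, theoretical_mz)
        if abs(err) <= tol_ppm:
            hits.append((cand, err))
    return sorted(hits, key=lambda hit: abs(hit[1]))


# Made-up candidate masses, not real saccharide data.
candidates = [
    Candidate("cand_A", 500.000),
    Candidate("cand_B", 500.004),
    Candidate("cand_C", 510.000),
]
hits = match_candidates(candidates, observed_mz=501.007276)
for cand, err in hits:
    print(f"{cand.name}: {err:+.2f} ppm")
```

In practice a precursor-mass filter like this would only narrow the candidate list; distinguishing isomeric structures would require comparing fragmentation spectra.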
Donghui Yan
Bahar Behsaz
Yanjing Li
Journal of the American Chemical Society
University of Michigan
Carnegie Mellon University
Oregon State University
Yan et al. studied this question.
www.synapsesocial.com/papers/68d46cd731b076d99fa695e2 — DOI: https://doi.org/10.1021/jacs.5c08251