Presentation given at the Ontologies4Chem Workshop 2025. Abstract: Bacteria are capable of synthesizing natural products through enzymatic pathways that are encoded in loci known as Biosynthetic Gene Clusters (BGCs). To elucidate the chemical structure of the molecules produced by these BGCs, we developed CHAMOIS, a data-driven, machine-learning method for predicting chemical features of a BGC molecule from its protein domain composition. CHAMOIS uses the ChemOnt ontology as a label space, although we also attempted to use ChEBI for the same purpose. In this talk, we present a practical use of chemical ontologies for machine learning and discuss some limitations of existing chemical ontologies.
Martin Larralde studied this question.