Abstract

Summary
Annotation of HR-MS/MS spectra is a complex task that can be tackled either by expert interpretation or by machine learning models that rely on large spectral/structural databases for training. Frequently, users want to find novel compounds of a particular substance class they are already familiar with. This requires classifying each detected compound as 'relevant' (i.e., belonging to the compound class of interest) or not (i.e., 'other'). For such applications, the Python-based AnnoMe software is presented, which allows users to classify their experimental HR-MS/MS spectra according to their aims. By leveraging a user-curated dataset of 'relevant' and 'other' reference HR-MS/MS spectra alongside structure-informed embeddings (MS2DeepScore), the package enables rapid and accurate prediction of 'relevant' compounds with custom-trained classification models and a majority vote, facilitating exploration of the complex chemical space inherent to LC-HRMS/MS data. The software is demonstrated by predicting putative prenylated flavonoids for prioritization in natural product discovery.

Availability and Implementation
Code, documentation, and datasets are available at https://github.com/chrboku/AnnoMe and https://zenodo.org/records/16322488.

Supplementary information
Supplementary data are available at Bioinformatics Advances online.
Christoph et al. studied this question.
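The classification scheme described in the Summary (embeddings of reference spectra labelled 'relevant' or 'other', several custom-trained models, and a majority vote) can be illustrated with a minimal sketch. This is not the AnnoMe API. The embeddings below are synthetic stand-ins for MS2DeepScore embeddings, and a nearest-centroid classifier trained on bootstrap resamples is an assumption chosen for brevity, since the abstract does not name the classifiers used. All function names are hypothetical.

```python
import math
import random
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def train_centroid_model(embeddings, labels, rng):
    """Fit one nearest-centroid model on a per-class bootstrap resample."""
    model = {}
    for label in sorted(set(labels)):
        members = [e for e, l in zip(embeddings, labels) if l == label]
        resample = [rng.choice(members) for _ in members]
        model[label] = centroid(resample)
    return model


def predict(model, embedding):
    """Return the label whose centroid is most similar to the embedding."""
    return max(model, key=lambda label: cosine(model[label], embedding))


def majority_vote(models, embedding):
    """Combine the predictions of all models by simple majority."""
    votes = Counter(predict(m, embedding) for m in models)
    return votes.most_common(1)[0][0]


def noisy(center, rng, scale=0.1):
    """Synthetic embedding: a class centre plus Gaussian noise (assumption)."""
    return [c + rng.gauss(0, scale) for c in center]


rng = random.Random(0)

# Toy reference set standing in for user-curated 'relevant'/'other' spectra.
embeddings = [noisy([1, 0, 0, 0], rng) for _ in range(20)]
embeddings += [noisy([0, 1, 0, 0], rng) for _ in range(20)]
labels = ["relevant"] * 20 + ["other"] * 20

# An odd number of models avoids ties in a two-class vote.
models = [train_centroid_model(embeddings, labels, rng) for _ in range(5)]

query = [0.9, 0.1, 0.0, 0.0]  # embedding of a hypothetical experimental spectrum
print(majority_vote(models, query))
```

In practice the toy vectors would be replaced by embeddings computed from real spectra, and the simple centroid models by whichever classifiers the user trains; only the voting step carries over unchanged.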