Abstract
Validating the success of a chemical synthesis requires confirming the desired product using several analytical techniques. While spectroscopic data collection is increasingly automated, interpreting the results remains a major bottleneck that often requires expert input. With advances in laboratory automation and high‐throughput synthesis, this challenge is expected to intensify. We introduce the MultiModalSpectralTransformer (MMST), a machine learning method that predicts chemical structures directly from diverse spectral data (NMR, IR, and MS). Trained on 4 million simulated compounds, MMST achieves top‐1 and top‐3 accuracies of 72% and 80%, respectively. To address out‐of‐distribution challenges, we implemented an active learning improvement cycle that generates molecules in similar chemical spaces, enabling the model to adapt to chemical structures beyond its original training data. We demonstrate MMST's capabilities through comprehensive benchmarking across diverse molecular weight ranges and chemical spaces. Notably, despite being trained solely on simulated data, MMST performs well on experimental spectra. This research represents a significant advancement in automated structure elucidation, offering a powerful and adaptable tool that bridges the gap between simulated and real‐world data.
Lemurell et al. (Wed,) studied this question.