Drug discovery is a complex and costly process, often taking over a decade from target identification to FDA approval, with many candidates failing along the way. AI foundation models, applied to vast datasets of small molecules, proteins, and transcriptomic (or, more broadly, omic) data, are transforming biomedical research by accelerating target identification, drug design, and testing. A promising and ambitious goal is to leverage these models to construct a virtual cell capable of simulating health and disease.

Two key challenges must be addressed to achieve this goal:

1. Comprehensive molecular representation: molecular graphs, images, and text are all essential for accurate modeling, yet previous work has typically focused on a single representation.
2. Integration of diverse data modalities: predicting complex biological interactions (e.g., antibody-protein binding) requires combining RNA, protein, and small-molecule data.

This talk presents two complementary approaches to address these challenges:

1. Multi-view Molecular Embedding with Late Fusion (MMELON): pre-trained on datasets of up to 200M molecules, with the individual views aggregated into combined representations [1].
2. Molecular Aligned Multi-Modal Architecture and Language (MAMMAL): trained on over 2B data points, integrating small molecules, proteins, and single-cell RNA-seq data [2].

Both approaches achieve state-of-the-art results in multi-modal drug discovery.
Michal Rosen-Zvi (Thu,) studied this question.