ABSTRACT Molecular foundation models hold promise to provide accurate predictions for a large and diverse set of downstream tasks in biomedical research. High‐quality molecular representations are key, yet foundation model development has typically focused on a single representation, or molecular view, which may be stronger or weaker on a given task. We develop Multi‐view Molecular Embedding with Late Fusion (MMELON), an approach that integrates pre‐trained graph, image and text foundation models and may be readily extended to additional views and models. The multi‐view model performs robustly and is validated on over 120 tasks, including molecular solubility, ADME properties, and activity against G protein‐coupled receptors (GPCRs). The array of GPCR models is used in a virtual screen for ligands that bind to Alzheimer's disease‐related GPCRs. We identify a number of such targets and employ the multi‐view model to select strong binders from a compound screen. Predictions are validated through structure‐based modeling and identification of key binding motifs.
Suryanarayanan et al. studied this question.
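To make the idea of late fusion concrete, the following is a minimal sketch and not the MMELON implementation. It assumes each view (graph, image, text) has already been encoded into a fixed-length vector by its own pre-trained model, and it combines the vectors by scaling each with a softmax weight and concatenating. The function names `softmax` and `late_fusion`, the per-view scores, and the toy embeddings are all hypothetical; the abstract states only that the views are fused late and does not specify the aggregator.

```python
import math
from typing import Dict, List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(view_embeddings: Dict[str, Sequence[float]],
                view_scores: Dict[str, float]) -> List[float]:
    """Scale each view's embedding by a softmax weight, then concatenate.

    Hypothetical aggregator chosen for illustration only; the source
    does not describe how the views are weighted or combined.
    """
    names = sorted(view_embeddings)  # fixed order keeps the output layout stable
    weights = softmax([view_scores[name] for name in names])
    fused: List[float] = []
    for name, weight in zip(names, weights):
        fused.extend(weight * x for x in view_embeddings[name])
    return fused


# Toy vectors standing in for the outputs of pre-trained graph, image
# and text encoders (assumed values, not real embeddings).
embeddings = {
    "graph": [1.0, 2.0],
    "image": [0.5, 0.5, 0.5],
    "text": [3.0],
}
scores = {"graph": 0.0, "image": 0.0, "text": 0.0}  # equal scores give equal weights
fused = late_fusion(embeddings, scores)
```

Because each view is encoded independently and combined only at the end, adding a further view in this sketch amounts to adding one more entry to `embeddings` and `scores`; a downstream task head would then be trained on `fused`.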