• GLASS2 expands GPCR–ligand data by 72%, reaching >1M curated interactions.
• LLM-assisted text mining adds PubMed evidence to integrated major drug databases.
• Structures unified by InChIKey and affinities normalized to nM.
• Task-ready ML datasets with clear actives/inactives and Ki/Kd/IC50/EC50 regression.
• Open web portal enables structure search and links to GPCR structural resources.

G protein-coupled receptors (GPCRs) represent one of the most important drug target families, yet comprehensive and standardized GPCR-ligand interaction data remain fragmented across multiple resources. Here, we present GLASS2, a substantially expanded and methodologically enhanced database of GPCR-ligand associations. By integrating data from major pharmacological databases with a large language model (LLM)-powered text-mining pipeline applied to PubMed literature, GLASS2 delivers a 72% increase in total associations over its predecessor, encompassing 1,715 receptors and 450,736 ligands across more than one million unique interactions. Critically, GLASS2 provides task-ready classification and regression datasets with clearly defined positive and negative samples, directly supporting AI-driven drug discovery applications. The database is freely accessible through an intuitive web portal at https://zhanggroup.org/GLASS/, offering seamless navigation, structure-based search, and integration with GPCR structural resources.
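As an illustration of what unifying structures by InChIKey and normalizing affinities to nM can look like in practice, the following is a minimal Python sketch. It is not the GLASS2 pipeline. The record fields (`receptor`, `inchikey`, `type`, `value`, `unit`), the function names, the placeholder identifiers, and the choice of the median to merge replicate measurements are all assumptions made for illustration. The only fixed facts used are the standard molar unit conversions and the identity p = 9 − log10(value in nM).

```python
import math
from collections import defaultdict
from statistics import median

# Standard conversion factors from molar units to nanomolar.
TO_NM = {"M": 1e9, "mM": 1e6, "uM": 1e3, "nM": 1.0, "pM": 1e-3}


def to_nanomolar(value, unit):
    """Convert an affinity value expressed in `unit` to nM."""
    return value * TO_NM[unit]


def to_p_scale(value_nm):
    """Convert an nM affinity to the negative log molar scale (e.g. pKi)."""
    return 9.0 - math.log10(value_nm)


def merge_by_inchikey(records):
    """Group measurements by (receptor, InChIKey, affinity type).

    Replicates are merged with the median of their nM values. The median
    is an assumption of this sketch, not a documented GLASS2 rule.
    """
    grouped = defaultdict(list)
    for rec in records:
        key = (rec["receptor"], rec["inchikey"], rec["type"])
        grouped[key].append(to_nanomolar(rec["value"], rec["unit"]))
    return {key: median(values) for key, values in grouped.items()}


# Hypothetical records: identifiers are placeholders, not real entries.
records = [
    {"receptor": "RECEPTOR_1", "inchikey": "PLACEHOLDER-KEY-A",
     "type": "Ki", "value": 0.5, "unit": "uM"},
    {"receptor": "RECEPTOR_1", "inchikey": "PLACEHOLDER-KEY-A",
     "type": "Ki", "value": 700, "unit": "nM"},
    {"receptor": "RECEPTOR_1", "inchikey": "PLACEHOLDER-KEY-B",
     "type": "IC50", "value": 2.5, "unit": "uM"},
]

merged = merge_by_inchikey(records)
```

Keying on the InChIKey rather than on a name or SMILES string means that the same compound reported by different sources collapses to a single entry, provided the keys were generated consistently upstream.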
Xu et al. (Sun,) studied this question.