What question did this study set out to answer?

The study aims to develop a general-purpose framework that uses large language models to generate 3D molecules for multi-target drug design.

April 13, 2026 | Open Access

LaMGen: LLM-based 3D molecular generation for multi-target drug design

Key points

Introduces LaMGen, a general-purpose multi-target drug design framework powered by large language models (LLMs).
Builds on MTD2025, a dataset of over 600,000 quantum-accurate molecular conformations and 700,000 multi-target associations.
Integrates ESM-C protein embeddings, rotation-aware ligand tokens, and a TriCoupleAttention module to capture multi-level target–ligand interactions.
Outperforms diffusion-based models across multiple properties on independent benchmarks.
Generates molecules in an average of 0.44 seconds each while preserving high conformational plausibility.
Reproduces known active molecules in retrospective analyses and generates structurally novel candidates with conserved core scaffolds and superior binding affinities.

Abstract

Multi-target drugs hold great promise for treating complex diseases, yet existing methodologies rely predominantly on ligand-based approaches, which lack sufficient biological context and are often confined to specific target pairs, resulting in limited generalizability. Here, we introduce LaMGen, a general-purpose multi-target drug design framework powered by large language models (LLMs). Built on MTD2025, a dataset comprising over 600,000 quantum-accurate molecular conformations and 700,000 multi-target associations, LaMGen directly yields energy-favorable conformations with quantum-level accuracy. The framework integrates ESM-C protein embeddings, rotation-aware ligand tokens, and a TriCoupleAttention module to capture multi-level target–ligand interactions. Across independent benchmarks, LaMGen outperforms diffusion-based models across multiple properties, generating molecules in an average of 0.44 s while preserving high conformational plausibility. Retrospective analyses demonstrate that LaMGen can not only reproduce molecules identical to known actives, but also consistently produce structurally novel candidates with conserved core scaffolds and superior binding affinities.

Designing effective multi-target therapeutics remains a major challenge, as existing ligand- or protein-centric methods struggle to generate biologically contextualized, spatially valid 3D molecules, particularly for triple-target systems. This study introduces LaMGen, an LLM-powered framework that leverages large-scale protein-ligand data and rotation-aware molecular encoding to rapidly produce chemically plausible multi-target candidates, achieving strong zero-shot generalization, superior molecular quality, and robust performance across dual- and triple-target design tasks.
