With the advancement of deep learning, predicting drug–target binding affinity (DTA) has become a crucial task in computational drug discovery. In this study, we propose a novel DTA prediction framework that integrates multimodal feature fusion and structural modeling to comprehensively capture the complex interactions between drugs and targets. For drug representation, we extract semantic features from SMILES sequences using ChemBERTa and incorporate topological information via graph neural networks. For protein representation, we utilize the pretrained ESM-2 model to encode high-level sequence semantics and employ geometric vector perceptron and graph transformer modules to model 3D structural dependencies, thereby enabling joint modeling of protein topology and spatial geometry. We further adopt a multi-head attention mechanism and a gated feature fusion module to dynamically integrate multimodal features, thereby enhancing the model’s representational capacity. Extensive experiments on four benchmark datasets (Davis, KIBA, PDBbind, and BindingDB) demonstrate that our model significantly outperforms state-of-the-art methods in terms of MSE, CI, and $r_m^2$. In particular, the model exhibits strong generalization and ranking performance on structurally complex datasets such as PDBbind and BindingDB. Our approach offers a more accurate and interpretable solution for modeling drug–target interactions, with promising potential for real-world drug discovery applications. The complete implementation of our framework is publicly available at https://github.com/xu1nan/MUSDTA.

Predicting drug-target binding affinity (DTA) is critical for accelerating computational drug discovery. We present MUSDTA, a deep-learning framework that integrates multimodal biochemical information to achieve a comprehensive representation of drug-target interactions. For drug molecules, semantic features are extracted from SMILES sequences using ChemBERTa, while topological information is captured via graph neural networks. For target proteins, high-level sequence semantics are encoded by the pretrained ESM-2 model, and 3D structural dependencies are modeled with a geometric-vector-perceptron network and graph transformers. A multi-head attention mechanism and a gated fusion module dynamically combine the four modalities (drug sequence, drug graph, protein sequence, protein structure) into a unified representation, which is fed into a multilayer perceptron to predict binding affinity. Extensive experiments on Davis, KIBA, PDBbind, and BindingDB benchmarks show that MUSDTA significantly outperforms state-of-the-art methods in MSE, CI, and $r_m^2$. The model exhibits strong generalization, especially on structurally complex datasets, and provides interpretable insights through ablation and pocket-affinity analyses. MUSDTA offers a more accurate, structure-aware solution for modeling drug-target interactions, with promising potential for real-world drug discovery pipelines.

• Multimodal Integration: Combines 1D sequences, 2D molecular graphs, and 3D protein structures for drug-target modeling.
• SOTA Performance: Outperforms 7+ baselines across four benchmarks.
• Structure-Aware: Binding pocket analysis shows Top-3 pockets enhance robustness, enabling generalization to unseen protein folds.
• Open-Source Access: Freely available code/data at https://github.com/xu1nan/MUSDTA
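The gated fusion step mentioned above can be illustrated with a minimal sketch. It assumes one common form of gating, in which an elementwise sigmoid gate computed from the concatenated inputs blends two feature vectors; the abstract does not specify the exact module, so this is not necessarily the MUSDTA implementation. The names `gated_fusion`, `seq_feat`, and `struct_feat` are hypothetical, and the weights are random stand-ins for trained parameters.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(a, b, weights, bias):
    """Blend two equal-length feature vectors with an elementwise gate.

    gate_i  = sigmoid(weights[i] . concat(a, b) + bias[i])
    fused_i = gate_i * a_i + (1 - gate_i) * b_i
    This is an assumed, generic gating form, not the paper's confirmed design.
    """
    concat = list(a) + list(b)
    gate = [
        sigmoid(sum(w * x for w, x in zip(row, concat)) + b_i)
        for row, b_i in zip(weights, bias)
    ]
    fused = [g * x + (1.0 - g) * y for g, x, y in zip(gate, a, b)]
    return fused, gate


random.seed(0)
d = 4  # toy feature size; real encoders produce much larger vectors
seq_feat = [random.gauss(0, 1) for _ in range(d)]     # stand-in for sequence features
struct_feat = [random.gauss(0, 1) for _ in range(d)]  # stand-in for graph/structure features
weights = [[random.gauss(0, 0.1) for _ in range(2 * d)] for _ in range(d)]  # untrained
bias = [0.0] * d

fused, gate = gated_fusion(seq_feat, struct_feat, weights, bias)
print("gate: ", [round(g, 3) for g in gate])
print("fused:", [round(f, 3) for f in fused])
```

Because each gate value lies strictly between 0 and 1, every fused component is a convex combination of the two inputs, which is what lets such a module weight one modality over another per feature.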
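For readers unfamiliar with the three reported metrics, the sketch below computes MSE, the concordance index (CI), and $r_m^2$ on toy data. It assumes the commonly cited definitions: CI as the fraction of correctly ordered pairs with ties in prediction counted as 0.5, and $r_m^2 = r^2\,(1 - \sqrt{|r^2 - r_0^2|})$, where $r_0^2$ comes from a regression through the origin. The abstract does not state which variant the authors used, and the affinity values here are invented for illustration.

```python
import math


def mse(y, p):
    """Mean squared error between observed y and predicted p."""
    return sum((a - b) ** 2 for a, b in zip(y, p)) / len(y)


def concordance_index(y, p):
    """Fraction of pairs with y[i] > y[j] whose predictions keep that order."""
    concordant, comparable = 0.0, 0
    for i in range(len(y)):
        for j in range(len(y)):
            if y[i] > y[j]:
                comparable += 1
                if p[i] > p[j]:
                    concordant += 1.0
                elif p[i] == p[j]:
                    concordant += 0.5  # tied predictions count as half
    return concordant / comparable


def rm2(y, p):
    """r_m^2 = r^2 * (1 - sqrt(|r^2 - r0^2|)) (assumed definition)."""
    n = len(y)
    y_mean, p_mean = sum(y) / n, sum(p) / n
    cov = sum((a - y_mean) * (b - p_mean) for a, b in zip(y, p))
    var_y = sum((a - y_mean) ** 2 for a in y)
    var_p = sum((b - p_mean) ** 2 for b in p)
    r2 = cov ** 2 / (var_y * var_p)  # squared Pearson correlation
    k = sum(a * b for a, b in zip(y, p)) / sum(b * b for b in p)  # slope through origin
    r0_2 = 1.0 - sum((a - k * b) ** 2 for a, b in zip(y, p)) / var_y
    return r2 * (1.0 - math.sqrt(abs(r2 - r0_2)))


# Toy affinities, invented for illustration only.
y_true = [5.0, 6.2, 7.1, 8.4]
y_pred = [5.3, 6.0, 7.4, 8.1]
print("MSE:", mse(y_true, y_pred))
print("CI: ", concordance_index(y_true, y_pred))
print("rm2:", rm2(y_true, y_pred))
```

Lower MSE is better, while CI and $r_m^2$ are better when closer to 1; CI depends only on the ranking of predictions, which is why it is often read as a ranking metric.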
Xu et al. (Wed,) studied this question.
synapsesocial.com/papers/69dc892e3afacbeac03eaed0 — DOI: https://doi.org/10.1016/j.compbiolchem.2026.109053
Yinan Xu, X. H. Xiao, Weizhong Lin (Jingdezhen Ceramic Institute)
Computational Biology and Chemistry