Accurate identification of compound-protein interactions (CPIs) is critical for drug discovery. In recent years, neural network-based CPI prediction methods have demonstrated remarkable performance. However, most existing approaches learn interaction patterns from known data without fully exploiting both the intra- and inter-molecular interaction information within compound-protein pairs. This limitation constrains how well models can learn representations of compounds and proteins and hinders further improvements in predictive accuracy. In this paper, we propose DualBAN, a novel CPI prediction model that integrates intra- and inter-molecular interaction information from both compounds and proteins. Specifically, DualBAN uses pretrained biological large language models to obtain sequence features, and extracts atom-level features of compounds and residue-level representations of proteins. To capture intra- and inter-molecular interactions comprehensively, DualBAN fuses the atom and residue representations with a bilinear attention network, combines the sequence representations through cross-attention, and uses both components jointly for CPI prediction. Extensive experiments demonstrate that DualBAN significantly outperforms state-of-the-art methods on CPI prediction tasks and maintains robust performance under cross-domain and cold-start settings.
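The two fusion steps named above can be illustrated with a minimal sketch. It is not the DualBAN implementation: the abstract gives no equations, so the code below assumes a generic bilinear attention over atom-residue pairs (with a ReLU after each projection and a softmax over all pairs) and a plain scaled dot-product cross-attention without learned projections. All function names, dimensions, mean pooling, and the final concatenation are hypothetical choices made for illustration, and random matrices stand in for the pretrained language-model embeddings and the atom and residue features.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(xs):
    """Numerically stable softmax over a flat list of scores."""
    peak = max(xs)
    exps = [math.exp(x - peak) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def bilinear_attention(atoms, residues, U, V, q):
    """Fuse atom (n x d) and residue (m x d) features into one k-dim vector.

    Assumed form: score(i, j) = sum_c q[c] * relu(atoms @ U)[i][c] * relu(residues @ V)[j][c].
    """
    a = [[max(0.0, x) for x in row] for row in matmul(atoms, U)]     # n x k
    r = [[max(0.0, x) for x in row] for row in matmul(residues, V)]  # m x k
    k, m = len(q), len(r)
    # One score per (atom, residue) pair, flattened in row-major order.
    scores = [sum(q[c] * ai[c] * rj[c] for c in range(k)) for ai in a for rj in r]
    weights = softmax(scores)  # normalised over all pairs
    fused = [0.0] * k
    for idx, w in enumerate(weights):
        i, j = divmod(idx, m)
        for c in range(k):
            fused[c] += w * a[i][c] * r[j][c]
    attn_map = [weights[i * m:(i + 1) * m] for i in range(len(a))]   # n x m
    return fused, attn_map


def cross_attention(queries, keys_values):
    """Scaled dot-product attention: each query row attends over keys_values rows."""
    d = len(queries[0])
    out = []
    for qv in queries:
        scores = [sum(x * y for x, y in zip(qv, kv)) / math.sqrt(d) for kv in keys_values]
        w = softmax(scores)
        out.append([sum(wi * kv[c] for wi, kv in zip(w, keys_values)) for c in range(d)])
    return out


def random_matrix(rows, cols, rng):
    """Random stand-in for learned weights or extracted features."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
atoms = random_matrix(5, 8, rng)       # 5 atoms, 8-dim features (hypothetical sizes)
residues = random_matrix(7, 8, rng)    # 7 residues, 8-dim features
U = random_matrix(8, 4, rng)
V = random_matrix(8, 4, rng)
q = [rng.gauss(0.0, 1.0) for _ in range(4)]
fused, attn_map = bilinear_attention(atoms, residues, U, V, q)

compound_seq = random_matrix(3, 6, rng)  # stand-in for compound sequence embeddings
protein_seq = random_matrix(9, 6, rng)   # stand-in for protein sequence embeddings
attended = cross_attention(compound_seq, protein_seq)
pooled = [sum(col) / len(col) for col in zip(*attended)]  # mean-pool over compound tokens

joint = fused + pooled  # concatenated vector that a prediction head would consume
```

In this sketch, `attn_map` holds one weight per atom-residue pair, while `joint` concatenates the pairwise fusion with the pooled sequence-level output; how DualBAN actually combines the two components is not specified in the abstract.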
He et al. (Thu,) studied this question.