What does this research mean for the field?

ConfBiXtCPI is a framework for compound-protein interaction (CPI) prediction that combines state-of-the-art accuracy on multiple benchmarks with interpretability and statistically rigorous uncertainty quantification.

What question did this study set out to answer?

The study asks whether a deep learning model for compound-protein interaction prediction can be made both interpretable and statistically reliable, using conformal prediction to quantify uncertainty.

March 8, 2026

Trustworthy Compound-Protein Interaction Prediction with Interpretable and Conformalized Cross-Attention Transformers.

Key Points

Developed the ConfBiXtCPI framework, which combines a bidirectional cross-attention transformer with Mondrian conformal prediction.
Trained and validated the model on highly imbalanced datasets across multiple benchmarks.
Added a conformal selection procedure that controls the false discovery rate at a user-specified risk threshold.
Reported state-of-the-art accuracy in compound-protein interaction prediction across multiple benchmarks.
Provided mechanistic interpretability through attention maps that localize to biophysically relevant binding sites.
Guaranteed valid coverage for both majority and minority classes, with uncertainty estimates that support efficient active learning.
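
Mondrian conformal prediction calibrates a separate threshold for each class, which is what lets coverage hold for the minority class as well as the majority class. The following is a minimal sketch of that idea in plain Python, not the authors' implementation. The nonconformity score (one minus the predicted probability of a label), the function names `mondrian_thresholds` and `prediction_set`, and the toy calibration values are all assumptions made for illustration.

```python
import math


def mondrian_thresholds(cal_scores, cal_labels, alpha):
    """Compute one nonconformity threshold per class from a calibration set.

    cal_scores[i] is the nonconformity score of calibration example i at its
    true label (assumed here: 1 - predicted probability of that label).
    """
    thresholds = {}
    for label in set(cal_labels):
        # Mondrian step: only examples of this class calibrate this class.
        scores = sorted(s for s, y in zip(cal_scores, cal_labels) if y == label)
        n = len(scores)
        # Finite-sample corrected rank: the ceil((n + 1) * (1 - alpha))-th smallest score.
        k = math.ceil((n + 1) * (1 - alpha))
        # Too few calibration points for this class: accept every score.
        thresholds[label] = scores[k - 1] if k <= n else float("inf")
    return thresholds


def prediction_set(probs, thresholds):
    """Return every label whose score 1 - prob is within its class threshold."""
    return {label for label, p in probs.items() if 1.0 - p <= thresholds[label]}


# Toy calibration data (hypothetical): 9 non-interacting pairs (label 0)
# and 4 interacting pairs (label 1), mimicking class imbalance.
cal_scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45,
              0.10, 0.30, 0.50, 0.70]
cal_labels = [0] * 9 + [1] * 4

thresholds = mondrian_thresholds(cal_scores, cal_labels, alpha=0.2)

# Hypothetical model outputs for two new compound-protein pairs.
ambiguous = prediction_set({0: 0.65, 1: 0.35}, thresholds)
confident = prediction_set({0: 0.90, 1: 0.10}, thresholds)
```

With these toy numbers the minority class receives a looser threshold (0.7) than the majority class (0.4). The first pair therefore gets the set {0, 1}, which flags it as uncertain, while the second gets the single label {0}.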

Abstract

Deep learning has accelerated drug discovery by enabling large-scale virtual screening, but current models often act as "black boxes" and provide no formal guarantees about prediction reliability. This limitation is particularly critical for compound-protein interaction (CPI) prediction, where data sets are highly imbalanced and erroneous predictions can lead to costly failures. Here we introduce ConfBiXtCPI, an integrated framework that unifies accurate prediction, interpretability, and statistically rigorous uncertainty quantification. At its core is a bidirectional cross-attention transformer that captures molecular recognition patterns from sequence-level inputs, achieving state-of-the-art accuracy across multiple benchmarks. To address class imbalance and uncertainty, we incorporate Mondrian conformal prediction, which guarantees valid coverage for both majority and minority classes. Building on this, a conformal selection procedure enables principled control of the false discovery rate, allowing users to specify risk thresholds while maintaining discovery power. Beyond accuracy, ConfBiXtCPI provides mechanistic interpretability through attention maps that localize to biophysically relevant binding sites, and its uncertainty estimates support efficient active learning strategies. Together, these advances establish ConfBiXtCPI as a trustworthy and practical tool for guiding experimental validation and accelerating therapeutic discovery.
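
The false discovery rate control described in the abstract can be illustrated with conformal p-values followed by the Benjamini-Hochberg step-up rule. This is a simplified sketch under stated assumptions, not the paper's procedure: it calibrates only on known non-interacting pairs, uses the model's predicted interaction probability as the score, and the function names and toy numbers are hypothetical.

```python
def conformal_p_values(cal_inactive_scores, test_scores):
    """Conformal p-value for the null hypothesis 'this pair does not interact'.

    cal_inactive_scores: predicted interaction probabilities for calibration
    pairs known to be non-interacting (an assumption of this sketch).
    A small p-value means the test score is unusually high for an inactive pair.
    """
    n = len(cal_inactive_scores)
    return [
        (1 + sum(1 for s in cal_inactive_scores if s >= t)) / (n + 1)
        for t in test_scores
    ]


def benjamini_hochberg(p_values, q):
    """Return indices selected by the Benjamini-Hochberg rule at target FDR q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda j: p_values[j])
    k_star = 0
    for rank, j in enumerate(order, start=1):
        # Keep the largest rank whose p-value is below its step-up threshold.
        if p_values[j] <= q * rank / m:
            k_star = rank
    return sorted(order[:k_star])


# Toy scores (hypothetical): 19 calibration inactives and 5 test candidates.
cal_inactive_scores = [0.02, 0.05, 0.08, 0.10, 0.12, 0.15, 0.18, 0.20, 0.22,
                       0.25, 0.28, 0.30, 0.33, 0.35, 0.40, 0.45, 0.50, 0.60,
                       0.70]
test_scores = [0.99, 0.97, 0.65, 0.30, 0.10]

p_values = conformal_p_values(cal_inactive_scores, test_scores)
selected = benjamini_hochberg(p_values, q=0.2)
```

Here `q` plays the role of the user-specified risk threshold: raising it admits more candidates at the cost of a higher expected fraction of false discoveries. With the toy numbers above, the first three candidates are selected at `q=0.2`.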
