Predicting protein-ligand binding pockets is crucial for understanding biological processes and for drug discovery and design. Existing methods predominantly convert proteins into 3D voxels and process them with stacked convolutions, which struggle to capture long-range semantic information within proteins. Furthermore, they lack global modeling and adaptive filtering of cross-layer features, which limits precise characterization of fine pocket details. To tackle these issues, we propose a novel U-shaped network architecture that integrates spatial gating mechanisms and local feature enhancement for accurate protein-ligand binding pocket prediction. Specifically, we improve the traditional U-shaped network encoder by integrating a Mamba module and a Local Feature Enhancement (LFE) module to achieve efficient global modeling and adaptive enhancement of local features. Additionally, we introduce a novel Spatial Enhanced Mamba Gate (SEMG) module at the skip connections to filter redundant information and enhance multiscale feature fusion. Experiments on extensive protein-ligand datasets demonstrate that our approach surpasses existing methods in both prediction performance and interpretability.
Yang et al. studied this question.
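The idea of filtering encoder features at a skip connection before fusing them into the decoder can be illustrated with a minimal sketch. This is not the SEMG module described above: it swaps in a plain elementwise sigmoid gate, operates on flat lists of feature values rather than 3D voxel tensors, and the function name `gated_skip` and its `weight` and `bias` parameters are hypothetical choices made only for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_skip(encoder_feat, decoder_feat, weight=1.0, bias=0.0):
    """Fuse encoder features into decoder features through a sigmoid gate.

    Illustrative stand-in for a gated skip connection (assumption, not SEMG):
        gate_i  = sigmoid(weight * (enc_i + dec_i) + bias)
        fused_i = dec_i + gate_i * enc_i
    A gate near 0 suppresses the encoder feature; a gate near 1 passes it.
    """
    if len(encoder_feat) != len(decoder_feat):
        raise ValueError("encoder and decoder features must have equal length")

    fused = []
    for enc, dec in zip(encoder_feat, decoder_feat):
        gate = sigmoid(weight * (enc + dec) + bias)
        fused.append(dec + gate * enc)
    return fused


if __name__ == "__main__":
    encoder = [2.0, -1.0, 0.3]
    decoder = [0.5, 0.5, 0.5]
    # A strongly negative bias closes the gate, so the output stays close
    # to the decoder features; a strongly positive bias opens it.
    print(gated_skip(encoder, decoder, bias=-50.0))
    print(gated_skip(encoder, decoder, bias=50.0))
```

In a real network the gate would be computed by learned layers over full feature maps; the scalar `weight` and `bias` here only stand in for those learned parameters.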