Current structure-based molecular generation faces a fundamental dilemma: computational approaches predominantly model ligands as static, whereas real-world molecular interactions are inherently dynamic. Inspired by the conformational changes that ligands undergo during semi-flexible docking, we propose a reinforcement learning (RL)–steered diffusion framework for semi-flexible molecular generation in protein pockets. By formulating the denoising process as a Markov decision process, we let RL dynamically adjust molecular structures through iterative exploration. We also incorporate multiple molecular properties as conditions that constrain the denoising policy to drug-like regions, and we perform self-supervised rigid training on both target-free and target-specific molecules. In addition, we propose a fast sampling strategy that accelerates sampling 20-fold, improving the efficiency of both training and sampling. Experiments demonstrate that our method outperforms state-of-the-art methods, achieving a Vina score of −7.23 kcal/mol and an 11.53% success rate. On unseen real-world protein targets, the generated molecules preserve canonical interaction patterns while discovering previously unknown binding chemotypes.
Zhang et al. studied this question.
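To make the idea of treating denoising as a Markov decision process more concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes a generic policy-gradient (REINFORCE) formulation in which the state is the current noisy sample, the action is the next, less noisy sample, and a reward is given only for the final sample. The names `rollout`, `toy_reward`, and `reinforce_step`, the scalar parameter `theta`, the linear "denoiser", and the quadratic reward (a stand-in for a score such as docking affinity) are all hypothetical and chosen only for illustration.

```python
import numpy as np

def rollout(theta, num_steps, sigma, rng, dim=3):
    """Run one denoising episode x_T -> x_0 under a toy Gaussian policy.

    Returns the final sample x_0 and the derivative of the trajectory
    log-probability with respect to theta.
    """
    x = rng.standard_normal(dim)            # initial state: pure noise
    grad_logp = 0.0
    for _ in range(num_steps):
        mean = theta * x                    # hypothetical linear "denoiser"
        # Action: sample the next, less noisy point from the policy.
        x_prev = mean + sigma * rng.standard_normal(dim)
        # d/dtheta of log N(x_prev; theta * x, sigma^2 I).
        grad_logp += float(np.sum((x_prev - mean) * x)) / sigma**2
        x = x_prev                          # transition to the next state
    return x, grad_logp

def toy_reward(x0):
    """Assumed terminal reward (higher is better); not a real docking score."""
    return -float(np.sum(x0**2))

def reinforce_step(theta, rng, batch=256, num_steps=5, sigma=0.1, lr=0.005):
    """One REINFORCE update using the batch-mean reward as a baseline."""
    samples = [rollout(theta, num_steps, sigma, rng) for _ in range(batch)]
    rewards = np.array([toy_reward(x0) for x0, _ in samples])
    grads = np.array([g for _, g in samples])
    advantage = rewards - rewards.mean()    # baseline reduces variance
    # Gradient ascent on the estimated expected reward.
    return theta + lr * float(np.mean(advantage * grads))

rng = np.random.default_rng(0)
theta = 0.9
for _ in range(30):
    theta = reinforce_step(theta, rng)
```

In this toy setting the reward favors final samples near the origin, so the updates shrink `theta`. A real system would replace the scalar policy with a conditional denoising network and the quadratic reward with task-specific scores; those components are not specified here.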