Single-step retrosynthesis prediction aims to identify the reactants needed to synthesize a target molecule and is crucial for pathway planning. Although prediction accuracy has improved, existing methods still perform markedly worse on low-resource reaction classes than on high-resource ones, which limits their overall effectiveness. Here, we introduce two novel strategies, Retrosynthetic Mutual Distillation (Retro-MD) and Retrosynthetic Self-Distillation (Retro-SD), that use distillation learning to bridge this performance gap. Retro-MD combines dual sampling temperatures with cross-model knowledge transfer, while Retro-SD uses a fixed temperature and distills knowledge from the model's own prior iterations. Evaluations on Transformer-based models show state-of-the-art performance among template-free approaches. Ablation studies further validate the rationale for reaction-class-aware task partitioning and demonstrate the robustness of the proposed distillation strategies.
Liu et al. studied this question.
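The abstract does not give formulas for the sampling temperature or the distillation term, so the following is only a minimal sketch under stated assumptions. It assumes the convention common in multi-task training, where a reaction class with n_i examples is sampled with probability proportional to n_i^(1/T), and it uses a plain KL divergence between two models' output distributions as a stand-in for the knowledge-transfer loss. The function names, class labels, counts, and logits are all hypothetical and are not taken from the paper.

```python
import math

def sampling_probs(class_counts, temperature):
    """Temperature-based sampling (assumed form): p_i proportional to n_i ** (1 / T).
    T = 1 follows the data distribution; larger T moves toward uniform sampling."""
    weights = {c: n ** (1.0 / temperature) for c, n in class_counts.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two strictly positive discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# Made-up example counts for three reaction classes (illustration only).
counts = {"class_A": 9000, "class_B": 900, "class_C": 100}
probs_low_t = sampling_probs(counts, temperature=1.0)   # data-proportional
probs_high_t = sampling_probs(counts, temperature=5.0)  # flatter, favors rare classes

# Mutual distillation (assumed form): each model is pulled toward the other's output.
logits_a = [2.0, 0.5, -1.0]  # hypothetical next-token logits from model A
logits_b = [1.0, 1.0, 0.0]   # hypothetical next-token logits from model B
loss_a = kl_divergence(softmax(logits_b), softmax(logits_a))  # A learns from B
loss_b = kl_divergence(softmax(logits_a), softmax(logits_b))  # B learns from A
```

In a self-distillation setting, the same KL term could be computed against the outputs of an earlier checkpoint of the same model rather than a second model; whether Retro-SD does exactly this is not stated in the abstract.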