Catalysts accelerate chemical processes by lowering activation energy through alternative reaction pathways. However, existing prediction methods often struggle to capture intricate molecular information and overlook critical structural details such as atomic composition and bond topology. To address these challenges, we propose CataCon, a deep neural network framework grounded in contrastive learning. The model initially utilizes GraphSAGE to generate robust molecular graph embeddings for reactants, products, and candidate catalysts, thereby capturing their structural features comprehensively. Subsequently, a contrastive learning module constructs positive and negative sample pairs to align feature representations between reactant-product combinations and catalysts. This alignment enhances the model's capacity to identify potential interactions. Experimental results on public catalytic reaction datasets demonstrate that CataCon significantly outperforms baseline methods in catalyst classification tasks. Ablation studies further confirm the efficacy of combining graph representation learning with contrastive strategies for integrating multidimensional molecular information. Moreover, t-SNE visualization and interpretability analyses elucidate the underlying decision mechanisms of the model and provide novel perspectives for understanding the complex relationships among reaction components.

Scientific contribution

This study introduces CataCon, a novel contrastive graph representation learning framework designed to predict optimal catalysts for chemical reactions. This approach overcomes the limitations of existing methods that rely on simplistic catalyst labels by generating rich structural embeddings for all reaction components and aligning reaction and catalyst features through contrastive learning.
By providing a powerful tool for the rational screening of catalyst candidates, CataCon achieves superior accuracy and has the potential to significantly accelerate materials discovery and optimize chemical synthesis.
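
The two ingredients named above, GraphSAGE embeddings and contrastive alignment of reaction and catalyst features, can be illustrated with a minimal pure-Python sketch. It is not the CataCon implementation. The abstract does not specify the aggregator, the graph readout, the loss, or the temperature, so the mean aggregator, mean-pool readout, InfoNCE-style loss, and every function name here (`sage_mean_layer`, `readout`, `info_nce`) are assumptions chosen for illustration.

```python
import math


def sage_mean_layer(features, adjacency, w_self, w_neigh):
    """One GraphSAGE-style layer with mean aggregation and ReLU (assumed variant).

    features:  list of node feature vectors (one per atom)
    adjacency: dict mapping node index -> list of neighbour indices (bonds)
    w_self, w_neigh: weight matrices given as lists of rows
    """
    def matvec(w, v):
        return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]

    out = []
    for i, h in enumerate(features):
        neigh = adjacency.get(i, [])
        if neigh:
            # Average the neighbours' features dimension by dimension.
            mean = [sum(features[j][k] for j in neigh) / len(neigh)
                    for k in range(len(h))]
        else:
            mean = [0.0] * len(h)
        z = [a + b for a, b in zip(matvec(w_self, h), matvec(w_neigh, mean))]
        out.append([max(0.0, x) for x in z])  # ReLU
    return out


def readout(node_embeddings):
    """Mean-pool node embeddings into one graph-level vector (assumed readout)."""
    n = len(node_embeddings)
    return [sum(col) / n for col in zip(*node_embeddings)]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def info_nce(reaction_vecs, catalyst_vecs, temperature=0.1):
    """InfoNCE-style loss (assumed form): reaction i is paired with catalyst i
    as the positive; all other catalysts in the batch act as negatives."""
    losses = []
    for i, r in enumerate(reaction_vecs):
        logits = [cosine(r, c) / temperature for c in catalyst_vecs]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)
```

In this sketch, a loss near zero means each reaction embedding is most similar to its own catalyst, while mismatched pairs yield a large loss. How CataCon combines reactant and product embeddings into a single reaction representation is not stated in the abstract, so the sketch takes reaction vectors as given.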
Shi et al. (Tue,) studied this question.