What question did this study set out to answer?

The aim is to develop a robust AI-driven model for molecular property prediction using a new contrastive learning framework.

April 25, 2026

Fragment-Aware Contrastive Learning Framework for Molecular Property Prediction

Key Points

Introduced a novel method for constructing contrastive pairs based on molecular fragment contributions.
Utilized information bottleneck theory to evaluate fragment importance for molecular properties.
Implemented an improved quadruplet loss in the contrastive learning framework.
Achieved outstanding performance on the MoleculeNet benchmark.
Delivered promising results predicting diverse pharmacokinetic and toxicity properties.

Abstract

Artificial intelligence (AI)-driven molecular property prediction holds significant potential to accelerate drug discovery, yet the development of robust models is hindered by the scarcity of high-quality data and the diversity of prediction tasks. Although self-supervised learning (SSL), especially contrastive learning, has gained traction for molecular representation learning (MRL), the intrinsic structural integrity of molecules presents a unique challenge: it obstructs the straightforward creation of meaningful contrastive pairs. This often leads to suboptimal pretrained representations and, consequently, diminished downstream task performance. To overcome this limitation, we introduce a novel contrastive pair construction strategy based on molecular fragment contributions. Our method learns a higher-quality embedding space by using information bottleneck theory to evaluate the importance of individual fragments for molecular properties, without relying on external prior knowledge. We implement a contrastive learning framework enhanced with an improved quadruplet loss that more effectively captures fine-grained molecular similarities. Empirical evaluations demonstrate that our approach achieves outstanding performance on the MoleculeNet benchmark and delivers promising results in predicting diverse pharmacokinetic (PK) and critical toxicity properties, highlighting its potential for real-world drug discovery applications.
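The abstract does not spell out the improved quadruplet loss, so the following is only a minimal sketch of the standard quadruplet loss that such variants typically build on, not the authors' formulation. It assumes each training example supplies four embeddings (an anchor, a positive, and two distinct negatives) and uses squared Euclidean distance. The function names, the margin values, and the plain-list representation of embeddings are all illustrative assumptions.

```python
from typing import Sequence

Vector = Sequence[float]


def squared_distance(x: Vector, y: Vector) -> float:
    """Squared Euclidean distance between two embeddings of equal length."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def quadruplet_loss(
    anchor: Vector,
    positive: Vector,
    negative1: Vector,
    negative2: Vector,
    margin1: float = 1.0,  # assumed value, not taken from the paper
    margin2: float = 0.5,  # assumed value, not taken from the paper
) -> float:
    """Standard quadruplet loss for a single quadruplet (illustrative only)."""
    d_ap = squared_distance(anchor, positive)
    d_an = squared_distance(anchor, negative1)
    d_nn = squared_distance(negative1, negative2)

    # Triplet-style term: the anchor should be closer to the positive
    # than to the first negative by at least margin1.
    anchor_term = max(0.0, d_ap - d_an + margin1)

    # Extra term: the anchor-positive distance should also be smaller than
    # the distance between the two negatives by at least margin2.
    negative_pair_term = max(0.0, d_ap - d_nn + margin2)

    return anchor_term + negative_pair_term


# A well-separated quadruplet incurs no loss; a violating one is penalized.
separated = quadruplet_loss([0.0, 0.0], [0.0, 0.0], [3.0, 0.0], [0.0, 3.0])
violating = quadruplet_loss([0.0, 0.0], [2.0, 0.0], [1.0, 0.0], [1.0, 1.0])
print(separated, violating)
```

The second term is what distinguishes a quadruplet loss from a triplet loss: it constrains distances between pairs that do not share the anchor. How the framework assigns fragment-based samples to these four roles is not stated in the abstract and is left open here.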
