What question did this study set out to answer?

To enhance molecular property prediction by integrating multitask self-supervised pretraining with multimodal fine-tuning.

June 3, 2026

MPMFMol: Multitask Self-Supervised Pretraining with Multimodal Fine-Tuning for Molecular Property Prediction

Key points

To enhance molecular property prediction by integrating multitask self-supervised pretraining with multimodal fine-tuning.
Developed MPMFMol framework integrating contrastive learning and multitask objectives.
Constructed heterogeneous augmented views using molecular fragments for pretraining.
Implemented stage-aware modality fusion during fine-tuning, incorporating functional-group and SMILES sequence information.
MPMFMol outperformed state-of-the-art baselines on six classification datasets and three regression datasets.
Achieved high-quality representations with reduced reliance on negative sampling.
Demonstrated effective integration of graph, fingerprint, and sequence data, leading to superior model performance.

Abstract

Molecular property prediction is essential in drug discovery for early-stage compound evaluation. Recently, contrastive learning has demonstrated significant potential under limited labeled data by constructing augmented views. However, current augmentation strategies often disrupt molecular semantics and ignore chemical priors, limiting representation quality. Moreover, molecular data is inherently multimodal, including graphs, fingerprints, and sequences, yet how to effectively integrate their complementary information remains challenging. Therefore, we propose MPMFMol, a unified framework that integrates multitask self-supervised pretraining with multimodal fine-tuning for molecular property prediction. During pretraining, we construct heterogeneous augmented views based on molecular fragments to preserve original molecular semantics, enabling the graph encoder to capture fragment-level information. Meanwhile, fingerprint features are integrated into a multitask learning objective, reducing reliance on negative sampling and enhancing the encoder's representation capability. During fine-tuning, we further incorporate functional group and SMILES sequence information and design a stage-aware modality fusion strategy. Specifically, pretrained graph features are injected into the initial representation of functional groups to guide feature extraction and then fused with SMILES features to enable deep cross-modal interaction and enhance downstream predictive performance. Experimental results on six classification and three regression data sets demonstrate that MPMFMol outperforms state-of-the-art baselines.
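The multitask pretraining idea in the abstract, a contrastive term over two augmented views combined with a fingerprint-based auxiliary task, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the InfoNCE-style contrastive term (which still uses in-batch negatives), the binary cross-entropy fingerprint term, the `temperature` and `weight` parameters, and all function names. The actual MPMFMol objective may differ substantially.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(view_a, view_b, temperature=0.5):
    """Generic InfoNCE-style loss (an assumed stand-in, not the paper's loss).

    Row i of view_a and row i of view_b are embeddings of two augmented
    views of the same molecule; all other rows of view_b act as negatives.
    """
    total = 0.0
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - logits[i]
    return total / len(view_a)

def fingerprint_bce(probs, bits, eps=1e-7):
    """Mean binary cross-entropy between predicted bit probabilities and fingerprint bits."""
    total = 0.0
    count = 0
    for p_row, b_row in zip(probs, bits):
        for p, b in zip(p_row, b_row):
            p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
            total += -(b * math.log(p) + (1 - b) * math.log(1 - p))
            count += 1
    return total / count

def multitask_loss(view_a, view_b, probs, bits, weight=1.0):
    """Hypothetical combined objective: contrastive term plus weighted fingerprint term."""
    return info_nce(view_a, view_b) + weight * fingerprint_bce(probs, bits)

# Toy batch of two molecules with 2-dimensional view embeddings
# and 2-bit fingerprints (illustrative values only).
view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b = [[1.0, 0.0], [0.0, 1.0]]
pred_probs = [[0.9, 0.1], [0.2, 0.8]]
true_bits = [[1, 0], [0, 1]]
loss = multitask_loss(view_a, view_b, pred_probs, true_bits, weight=0.5)
```

In this sketch the fingerprint term gives the encoder a supervised-style signal that does not depend on negatives, which is one plausible reading of how an auxiliary fingerprint task could reduce reliance on negative sampling.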
