What question did this study set out to answer?

This research aims to improve breast cancer diagnosis from mammography by leveraging self-supervised learning techniques.

April 30, 2026

Self-Supervised Contrastive Learning With Attention Fusion for Enhanced Breast Cancer Diagnosis From Mammography

Key Points

This research aims to improve breast cancer diagnosis from mammography by leveraging self-supervised learning techniques.
Developed SCL-AF, a self-supervised, anatomy-aware framework with attention fusion for mammographic image analysis.
Implemented contrastive pretraining with cross-view positives and contralateral hard negatives.
Evaluated the framework on the public CBIS-DDSM dataset using ROC-AUC, PR-AUC, and sensitivity.
SCL-AF achieved ROC-AUC of 0.942, PR-AUC of 0.692, and sensitivity of 0.631, outperforming strong baselines.
Gains concentrated in the clinically relevant high-specificity regime, with particularly large improvements on calcification-dominant breasts.
Ablations showed that removing cross-view positives or contralateral negatives substantially degraded high-specificity sensitivity and calibration.
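
Because the key points stress performance in the high-specificity regime, it may help to see how sensitivity at a fixed specificity can be read off a set of scores. The following is a minimal sketch, not the study's evaluation code: the function name, the default target specificity of 0.9, and the toy scores are all assumptions made for illustration, and the summary does not state the operating point at which the reported sensitivity was measured.

```python
def sensitivity_at_specificity(scores, labels, target_specificity=0.9):
    """Return (sensitivity, threshold) at the lowest threshold meeting the target.

    A case is called positive when its score is strictly above the threshold.
    labels: 1 = malignant, 0 = benign/normal (assumed encoding).
    """
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    positives = [s for s, y in zip(scores, labels) if y == 1]
    if not negatives or not positives:
        raise ValueError("need at least one positive and one negative case")

    # Scan candidate thresholds from low to high; the first one that reaches
    # the target specificity keeps as many true positives as possible.
    for threshold in sorted(set(scores)):
        specificity = sum(s <= threshold for s in negatives) / len(negatives)
        if specificity >= target_specificity:
            sensitivity = sum(s > threshold for s in positives) / len(positives)
            return sensitivity, threshold
    raise ValueError("target_specificity must not exceed 1.0")


# Hypothetical toy scores, not data from the study.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.8, 0.35, 0.9, 0.7]
toy_labels = [0, 0, 0, 0, 0, 1, 1, 1]
sens, thr = sensitivity_at_specificity(toy_scores, toy_labels, target_specificity=0.8)
print(f"sensitivity={sens:.3f} at threshold={thr}")
```

Raising the target specificity pushes the threshold up, which is why sensitivity in this regime is a stricter test than a threshold-free summary such as ROC-AUC.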

Abstract

Screening mammography presents complementary craniocaudal and mediolateral oblique views whose joint interpretation hinges on view-invariance for the same breast and sensitivity to contralateral asymmetry. We propose a self-supervised, anatomy-aware contrastive learning framework with attention fusion (SCL-AF) that couples three components: contrastive pretraining with cross-view positives and contralateral hard negatives; a lesion-guided tokenization that distills high-resolution images into a compact set of clinically meaningful tokens; and a geometry-biased, bidirectional attention fusion that reconciles evidence across views. Supervised fine-tuning uses a class-imbalance-aware objective together with view-consistency and contralateral-symmetry regularizers. Evaluated on the public CBIS-DDSM dataset, SCL-AF achieves ROC-AUC 0.942, PR-AUC 0.692, and sensitivity (SEN) 0.631, outperforming strong baselines. Gains concentrate in the clinically relevant high-specificity regime, with particularly large improvements on calcification-dominant breasts. Ablations show that removing cross-view positives or contralateral negatives substantially degrades high-specificity sensitivity and calibration, that lesion-guided tokens with diversity priors outperform global or randomly sampled tokens, and that two layers of bidirectional attention offer the best trade-off between accuracy and latency. These results suggest that encoding mammographic anatomy directly into representation learning and fusion yields significant improvements at operating points suitable for screening triage.
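
The pretraining idea described above, pulling the two views of the same breast together while pushing the contralateral breast away, can be illustrated with an InfoNCE-style loss. This is a sketch under stated assumptions: the abstract does not give the loss formulation, so the InfoNCE form, the cosine similarity, the temperature of 0.1, and every function and variable name here are illustrative choices rather than the authors' method.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / norm


def cross_view_contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor embedding (assumed formulation).

    anchor:    embedding of one view (e.g. craniocaudal) of a breast
    positive:  embedding of the other view (e.g. mediolateral oblique), same breast
    negatives: embeddings treated as negatives, e.g. views of the contralateral breast
    """
    pos_term = math.exp(cosine(anchor, positive) / temperature)
    neg_term = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    # The loss falls as the positive pair dominates the similarity mass.
    return -math.log(pos_term / (pos_term + neg_term))


# Hypothetical 2-D embeddings for illustration only.
cc_left = [1.0, 0.0]
mlo_left = [1.0, 0.0]
easy_negative = [[0.0, 1.0]]   # dissimilar to the anchor
hard_negative = [[0.9, 0.1]]   # similar to the anchor, as a contralateral view may be

easy_loss = cross_view_contrastive_loss(cc_left, mlo_left, easy_negative)
hard_loss = cross_view_contrastive_loss(cc_left, mlo_left, hard_negative)
print(easy_loss < hard_loss)
```

Negatives that resemble the anchor contribute more to the denominator, which is one way to see why contralateral views can act as hard negatives: the two breasts of one patient look alike, so separating them forces the encoder to attend to asymmetry.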
