What question did this study set out to answer?

The aim is to enhance the diagnosis of prostate cancer by developing a reliable method that minimizes the need for expert interpretation of histopathology slides.

January 25, 2026 · Open Access

Transformer-Driven Semi-Supervised Learning for Prostate Cancer Histopathology: A DINOv2–TransUNet Framework

Key Points

Developed a semi-supervised learning technique integrating transformer representation learning with a TransUNet classifier.
Pretrained DINOv2 on 10,000 unlabeled prostate tissue patches to capture morphological structures.
Implemented a CNN-based decoder with residual upsampling and skip connections to merge features from multiple spatial scales.
Applied a consistency-driven learning method that keeps predictions stable across data augmentations, so unlabeled patches also contribute to training.
Achieved precision of 91.81% and recall of 89.02%.
Attained overall accuracy of 93.78% on an additional test set.
Surpassed performance of conventional U-Net and baseline encoder–decoder networks.
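
For readers less familiar with the reported metrics, the following is a minimal sketch of how precision, recall, and accuracy are derived from confusion-matrix counts. The function name `classification_metrics` and the counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (precision, recall, accuracy) from confusion-matrix counts."""
    # Precision: fraction of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: fraction of true positives that the model actually found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Accuracy: fraction of all predictions that are correct.
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return precision, recall, accuracy


# Hypothetical counts for illustration only; NOT the study's data.
precision, recall, accuracy = classification_metrics(tp=90, fp=10, fn=10, tn=90)
print(f"precision={precision:.2%} recall={recall:.2%} accuracy={accuracy:.2%}")
```

Precision and recall ignore true negatives, while accuracy includes them. This is one reason an accuracy figure can be higher than both of the other two.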

Abstract

Diagnosing prostate cancer requires a comprehensive examination of histopathology slides, which is time-consuming and depends on expert interpretation. To reduce this burden, we developed a semi-supervised learning technique that combines transformer-based representation learning with a custom TransUNet classifier. To capture a wide range of morphological structures without manual annotation, our method pretrains DINOv2 on 10,000 unlabeled prostate tissue patches. A bespoke convolutional neural network (CNN) decoder then takes the transformer-derived features and uses residual upsampling and carefully constructed skip connections to merge information from multiple spatial scales. Expert pathologists labeled only 20% of the patches in the whole dataset; the remaining unlabeled samples were incorporated through a consistency-driven learning method that encourages stable predictions across different augmentations. The model achieved precision and recall of 91.81% and 89.02%, respectively, and an accuracy of 93.78% on an additional test set. These results exceed the performance of a conventional U-Net and a baseline encoder–decoder network. Overall, combining localized CNN decoding with global transformer attention provides a reliable method for prostate cancer classification when annotated data are scarce.
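
The abstract does not give the exact form of the consistency objective, so the following is only a sketch of one common formulation: a supervised cross-entropy term on the labeled patches plus a mean-squared-error term between the softmax predictions of two augmented views of each unlabeled patch. The function names, the `weight` parameter, and the choice of mean squared error are all assumptions made for illustration, not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def cross_entropy(logits, label):
    """Supervised loss for one labeled patch."""
    return -math.log(softmax(logits)[label])


def consistency(logits_a, logits_b):
    """Mean squared difference between predictions for two augmented views
    of the same unlabeled patch (assumed form, not from the source)."""
    pa, pb = softmax(logits_a), softmax(logits_b)
    return sum((a - b) ** 2 for a, b in zip(pa, pb)) / len(pa)


def semi_supervised_loss(labeled, unlabeled, weight=1.0):
    """labeled: list of (logits, label) pairs.
    unlabeled: list of (logits_view1, logits_view2) pairs.
    weight: hypothetical trade-off between the two terms."""
    sup = (sum(cross_entropy(l, y) for l, y in labeled) / len(labeled)
           if labeled else 0.0)
    con = (sum(consistency(a, b) for a, b in unlabeled) / len(unlabeled)
           if unlabeled else 0.0)
    return sup + weight * con


# Toy logits for illustration only: one labeled patch, one unlabeled patch
# whose two augmented views disagree slightly.
loss = semi_supervised_loss(
    labeled=[([2.0, 0.5], 0)],
    unlabeled=[([1.0, 0.0], [0.8, 0.3])],
    weight=0.5,
)
print(f"combined loss: {loss:.4f}")
```

The consistency term is zero when both views yield identical predictions and grows as they diverge, so unlabeled patches can shape the model without any annotation.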
