What question did this study set out to answer?

This study aims to assess observer variability in identifying rectal tumors and develop a deep learning segmentation model.

June 14, 2026 | Open Access

Deep learning-based automatic segmentation of rectal tumors in endoscopic images

Key Points

Three expert annotators assessed 801 endoscopic images from 24 patients.
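Inter-observer variability was evaluated at both whole-image and contour levels; intra-observer variability was assessed by re-annotating 15 images from 14 patients after six months.
Four DeepLabV3 models with a ResNet50 backbone were trained using nested cross-validation: one per annotator and a fourth on majority-vote contours.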
Agreement between manual annotations was low for ulcers (average Dice 0.36) and radiation proctitis (0.57) but higher for tumors (0.83).
Intra-observer Dice scores were 0.72, 0.68, and 0.87 for the three annotators.
The majority-vote model outperformed the individual annotator models (average Dice 0.77) but generated many false positives, misclassifying ulcers and proctitis as tumors.
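
The Dice scores and majority-vote contours mentioned above can be illustrated with a minimal sketch. It is not the authors' code: the function names `dice` and `majority_vote`, the nested-list mask layout, and the convention that two empty masks score 1.0 are all assumptions made for illustration.

```python
from typing import Sequence

Mask = Sequence[Sequence[int]]  # binary mask as rows of 0/1 values (assumed layout)


def dice(a: Mask, b: Mask) -> float:
    """Dice coefficient 2|A∩B| / (|A| + |B|) between two binary masks."""
    inter = sum(x & y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    total = sum(map(sum, a)) + sum(map(sum, b))
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * inter / total


def majority_vote(masks: Sequence[Mask]) -> list[list[int]]:
    """Keep a pixel when more than half of the annotators marked it."""
    n = len(masks)
    return [
        [1 if sum(px) * 2 > n else 0 for px in zip(*rows)]
        for rows in zip(*masks)
    ]


# Three hypothetical annotators contouring the same 2x3 image.
a = [[1, 1, 0], [0, 1, 0]]
b = [[1, 0, 0], [0, 1, 1]]
c = [[1, 1, 0], [0, 0, 0]]
consensus = majority_vote([a, b, c])
print(consensus, round(dice(a, b), 2))
```

With three annotators, a pixel enters the consensus mask when at least two of them marked it, which is one plausible reading of "majority-vote contours".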

Abstract

PURPOSE: Endoscopy is critical in the identification of rectal tumors, but is prone to observer errors. The aim of this study was to assess the inter- and intra-observer variability in delineating rectal lesions in endoscopic images taken during high-dose-rate (HDR) brachytherapy and to develop a deep learning-based automatic tumor segmentation model.

MATERIALS AND METHODS: Three expert annotators identified tumors, scarring, ulcers and radiation proctitis in 801 endoscopic images from 24 patients. Inter-observer variability was evaluated at both whole-image and contour levels. Intra-observer variability was assessed by re-annotating 15 images from 14 patients after six months. Four DeepLabV3 models with a ResNet50 backbone were trained using a nested cross-validation approach: one per annotator and a fourth trained on majority-vote contours. Model performance was evaluated on 60 unseen images, which the annotators rated using a five-point Likert scale.

RESULTS: Manual annotations showed significant variability for ulcers and radiation proctitis (average Dice: 0.36 and 0.57) versus tumors (0.83). Intra-observer Dice scores were 0.72, 0.68, and 0.87 across annotators. The majority-vote model outperformed individual annotator models (average Dice: 0.77) but generated many false positives, misclassifying ulcers and proctitis as tumors. Annotators generally rated the model trained on their own contours higher on the unseen test set.

CONCLUSIONS: This work highlights the variability in expert annotations used as ground truth for deep learning-based segmentation of rectal tumors in endoscopic images acquired during HDR brachytherapy. Automated contouring may provide a foundation for adaptive, AI-assisted brachytherapy workflows.
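
The abstract names nested cross-validation but does not give the fold counts or say how images were grouped. The sketch below shows one way such a scheme could be laid out. It assumes splits are made at patient level (so images from one patient never appear in both training and test data) and uses hypothetical fold counts of 4 outer and 3 inner; the helper names `k_folds` and `nested_cv` are also illustrative.

```python
def k_folds(items, k):
    """Split items into k non-overlapping folds of near-equal size."""
    return [items[i::k] for i in range(k)]


def nested_cv(patients, outer_k=4, inner_k=3):
    """Yield (train, val, test) patient splits for nested cross-validation.

    The outer loop holds out a test fold; the inner loop rotates a
    validation fold through the remaining patients for model selection.
    Fold counts are assumptions, not values reported in the study.
    """
    for test in k_folds(patients, outer_k):
        remaining = [p for p in patients if p not in test]
        for val in k_folds(remaining, inner_k):
            train = [p for p in remaining if p not in val]
            yield train, val, test


# 24 hypothetical patient IDs, matching the cohort size in the abstract.
splits = list(nested_cv(list(range(24))))
print(len(splits), [len(part) for part in splits[0]])
```

Each yielded triple partitions the full patient list, so a model tuned on the inner validation fold is always scored on patients it never saw during training or tuning.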


Cite This Study

Thibodeau-Antonacci et al. studied this question.

synapsesocial.com/papers/6a2e44e4b1cc60ccdea8a4bc https://doi.org/10.1016/j.brachy.2026.04.013
