Abstract
Deep learning (DL) has been proposed for magnetic resonance imaging (MRI) prostate segmentation for various clinical tasks, including radiotherapy treatment planning. In other applications, DL models have exhibited performance bias with respect to protected attributes such as race. To investigate possible race bias in prostate MRI segmentation, DL models were trained on five clinical T2-weighted MRI datasets with varying White/Black race imbalance, plus one public dataset for which race was unknown, and evaluated on 32 White/Black matched clinical subjects. Among the models trained with differing levels of race imbalance, performance for both races was best when the training set was race-balanced. A linear mixed-effects model analysis showed that Dice Similarity Coefficient (DSC) differences between Black and White subjects depended on race representation in the training data, with a slight reduction in the White-Black performance gap as Black representation increased (p < 0.05). The model trained on public data showed no difference in DSC between races. The findings reveal the potential for race bias in DL prostate MRI segmentation performance when training sets are highly imbalanced. We argue for transparency in race reporting in DL prostate segmentation training data and for reporting of test performance across demographic groups, with appropriate ethical/legal safeguards.
Alqarni et al. studied this question.
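
For readers unfamiliar with the metric, the Dice Similarity Coefficient between a predicted mask A and a reference mask B is 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that formula on flattened binary masks, not the evaluation code used in the study. The function name, the toy voxel values, and the convention that two empty masks score 1.0 are all assumptions made for illustration.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice Similarity Coefficient between two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks agree perfectly.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Hypothetical 8-voxel masks, for illustration only.
predicted = [0, 1, 1, 1, 0, 0, 1, 0]
reference = [0, 1, 1, 0, 0, 1, 1, 0]
print(dice_coefficient(predicted, reference))  # 3 shared voxels, 4 + 4 total -> 0.75
```

In a per-group analysis of the kind described above, a score like this would be computed per test subject and then compared across demographic groups.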