Accurate segmentation of breast tumors in dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) is essential for effective diagnosis, treatment planning, and monitoring of breast cancer. However, the high heterogeneity of tumor appearance and the complex spatiotemporal dynamics of contrast enhancement present critical challenges for existing segmentation methods. In this study, we propose a novel residual-guided spatiotemporal transformer with graph fusion enhancement (RST2G) framework for precise breast tumor segmentation in DCE-MRI. RST2G explicitly leverages pre-contrast MRI, post-contrast MRI, and their residual differences to capture rich inter-temporal kinetic information. Specifically, RST2G employs a weight-sharing hybrid encoder that combines convolutional neural networks and vision transformers to extract local and global features, followed by a residual-guided multi-scale refinement module to enhance feature discriminability. To effectively model spatial and temporal contextual dependencies, we construct modality-specific graphs and apply inter-slice and inter-temporal attention mechanisms for spatiotemporal graph enhancement. Extensive experiments on 2 publicly available breast DCE-MRI datasets demonstrate that RST2G significantly outperforms state-of-the-art 2-dimensional (2D), 3D, and 4D segmentation methods. Given its effectiveness in capturing complex spatiotemporal tumor characteristics for cancer annotation, RST2G has the potential to improve clinical breast cancer treatment.
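The three inputs named above (pre-contrast, post-contrast, and their residual) and the weight-sharing idea can be illustrated with a minimal NumPy sketch. It is not the RST2G implementation: `residual_image` and `shared_encoder` are hypothetical names, the slices are synthetic, and a per-pixel linear projection with ReLU stands in for the hybrid CNN/transformer encoder, whose details the abstract does not give. The sketch only shows that the residual isolates the enhancing region and that one set of weights is applied to all three inputs.

```python
import numpy as np


def residual_image(pre, post):
    """Subtraction image: post-contrast minus pre-contrast intensities."""
    return post.astype(np.float32) - pre.astype(np.float32)


def shared_encoder(x, weights):
    """Hypothetical stand-in for the hybrid encoder (assumption, not the paper's).

    Projects each pixel of a (H, W) slice onto C channels using the same
    weight vector of shape (C,), then applies ReLU. Output shape: (C, H, W).
    """
    return np.maximum(weights[:, None, None] * x[None, :, :], 0.0)


rng = np.random.default_rng(0)

# Synthetic 8x8 slices: the post-contrast slice equals the pre-contrast
# slice except for a 3x3 block that "enhances" by 0.5.
pre = rng.random((8, 8), dtype=np.float32)
post = pre.copy()
post[2:5, 2:5] += 0.5

res = residual_image(pre, post)

# One weight vector, reused for every input: this is the weight sharing.
weights = rng.standard_normal(4).astype(np.float32)
inputs = {"pre": pre, "post": post, "residual": res}
features = {name: shared_encoder(img, weights) for name, img in inputs.items()}

for name, feat in features.items():
    print(name, feat.shape)
```

In this toy setting the residual is zero everywhere except the simulated enhancing block, which is the kind of kinetic cue the abstract describes the residual branch as supplying.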
Xie et al. (Thu,) studied this question.