ABSTRACT Multimodal medical image segmentation plays a pivotal role in the diagnosis, treatment planning, and monitoring of focal liver lesions (FLLs). However, obtaining sufficient multimodal labels is challenging because manual annotation is costly and labor-intensive. In addition, spatial misalignment between modalities acquired at different times or from different sources can hinder accurate segmentation. To address these issues, we propose an automated MRI focal liver lesion segmentation method based on multimodal alignment and interaction, which consists of an Unsupervised Cross-Modal Interaction based Registration (UCMIR) module and a Multi-Scale Modality-Contribution-Aware multimodal medical image segmentation network (MSMCA). UCMIR performs multiscale cross-modal feature interaction and registration to align unlabeled modalities with labeled ones, generating a deformation field for medical image registration. The aligned image pairs are then fed into the MSMCA network to obtain the final segmentation result. MSMCA fuses multiscale information from the different modalities through coordinate attention, boosting segmentation performance. Experimental results on a focal liver lesion dataset show that our approach improves the Dice score by 7.34% and 4.40% over Tri-Attention Net on the two testing modality groups, respectively.
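To illustrate why alignment matters before segmentation is scored, the following minimal sketch warps a misaligned binary mask with a dense displacement field and compares the Dice score before and after warping. It is not the UCMIR module: the displacement field is set by hand rather than predicted by a network, sampling is nearest-neighbour, and the names `warp_nearest`, `dice`, `fixed`, and `moving` are hypothetical and chosen only for this example.

```python
import numpy as np


def warp_nearest(moving, displacement):
    """Warp a 2D image with a dense displacement field.

    displacement has shape (2, H, W) and holds per-pixel (dy, dx) offsets:
    output pixel (y, x) is sampled from moving[y + dy, x + dx], rounded to
    the nearest pixel and clipped to the image border.
    """
    h, w = moving.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_y = np.clip(np.rint(ys + displacement[0]).astype(int), 0, h - 1)
    src_x = np.clip(np.rint(xs + displacement[1]).astype(int), 0, w - 1)
    return moving[src_y, src_x]


def dice(pred, target, eps=1e-8):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)


# Toy "labeled" modality: a 3x3 lesion mask on an 8x8 grid.
fixed = np.zeros((8, 8), dtype=np.uint8)
fixed[2:5, 2:5] = 1

# Toy "unlabeled" modality: the same lesion shifted two pixels to the right.
moving = np.zeros((8, 8), dtype=np.uint8)
moving[2:5, 4:7] = 1

# Hand-set displacement field (an assumption standing in for a learned one):
# every output pixel samples from two pixels to its right.
displacement = np.zeros((2, 8, 8))
displacement[1] = 2.0

warped = warp_nearest(moving, displacement)

dice_before = dice(moving, fixed)
dice_after = dice(warped, fixed)
print(f"Dice before alignment: {dice_before:.3f}")
print(f"Dice after alignment:  {dice_after:.3f}")
```

In this toy case the misaligned masks overlap in only one column, so the Dice score is low before warping and reaches 1 once the shift is undone. A learned registration module would have to estimate the displacement field from the images themselves, typically with smooth interpolation rather than nearest-neighbour sampling.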
Wang et al. (Sun,) studied this question.