Abstract

Background/Introduction
Recent advancements in deep learning have enabled the prospect of patient-specific cardiac digital twins, which can serve as a visual aid to improve patient-doctor communication and understanding. Reconstructing a 3D cardiac replica first requires patient-specific information. In this research, we focus on extracting a patient's heart information from their cardiac MRI (CMRI) scans.

Purpose
Automated image segmentation is a useful tool for this task. We propose to leverage the generalizing ability of foundation models for segmentation. This research is part of an ongoing effort to develop an automated segmentation pipeline whose outputs will be used to reconstruct patient-specific 3D cardiac models.

Methods
Segmentation is a complex problem involving object localization and per-pixel classification. MedSAM, a semi-automatic foundation model by Ma et al. (2024), reduces this complexity by allowing user input to localize objects. We propose to further simplify and automate this process with a two-stage detection pipeline: a cardiac-detection stage, followed by a component-detection stage that localizes the left ventricle cavity (LVC), LV myocardium (MYO) and right ventricle (RV). The outputs are bounding box coordinates that localize the cardiac components, replacing the user input that MedSAM requires for segmentation. Both detection models are YOLOv5 models trained on the ACDC cardiac dataset from MICCAI. The bounding box coordinates of the heart and its components are generated from the ground truth labels. A train-to-test split of 1,841:1,001 is used, covering CMRI at both end-diastole (ED) and end-systole (ES).

Results
Currently, the cardiac-detection stage yields an IoU of 90.8%, while the component-detection stage yields IoUs of 77.6%, 63.2% and 81.9% for LVC, MYO and RV, respectively. Overall, the pipeline achieves final segmentation DSCs of 81.7%, 54.0% and 85.4% for LVC, MYO and RV, respectively. Both detection and segmentation performance for MYO are notably lower than for LVC and RV. Component surface area plays a role in segmentation performance, as DSCs of LVC and RV are higher at ES when the components exhibit a larger surface area. MYO appears thinner at ED, and the model therefore could not accurately detect its presence. Furthermore, inverting the image colours and increasing the contrast improves segmentation performance.

Conclusion
This study shows that there is potential to apply foundation models to CMRI segmentation. Once the bottleneck at the component-detection stage is resolved, overall accuracy is expected to increase. Future work will aim to improve the performance of the proposed pipeline by prompt-tuning the input data. Based on the change in the segmented cardiac components across time, the 3D patient-specific cardiac model can be reconstructed. This provides a communication tool to enhance patients' comprehension of their own heart.

Figure: Workflow diagram
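As an illustration of the quantities used above, the following minimal sketch shows one way a bounding box could be derived from a ground truth label map, and how box IoU and mask DSC are commonly computed. It is not the authors' implementation. The label encoding (1 = RV, 2 = MYO, 3 = LVC), the `(x_min, y_min, x_max, y_max)` box layout, and the function names `mask_to_bbox`, `box_iou` and `dice` are assumptions made for this example.

```python
import numpy as np

# Assumed label encoding for illustration only (not stated in the abstract).
LABELS = {"RV": 1, "MYO": 2, "LVC": 3}


def mask_to_bbox(label_map, label):
    """Return (x_min, y_min, x_max, y_max) enclosing `label`, or None if absent.

    The max coordinates are exclusive, so width = x_max - x_min.
    """
    ys, xs = np.nonzero(np.asarray(label_map) == label)
    if ys.size == 0:
        return None  # the component does not appear in this slice
    return int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1


def box_iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def dice(pred, gt):
    """Dice similarity coefficient (DSC) between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    denom = pred.sum() + gt.sum()
    if denom == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * np.logical_and(pred, gt).sum() / denom


if __name__ == "__main__":
    # Toy 8x8 slice with a 4x4 LVC region.
    label_map = np.zeros((8, 8), dtype=int)
    label_map[2:6, 2:6] = LABELS["LVC"]

    gt_box = mask_to_bbox(label_map, LABELS["LVC"])
    pred_box = (3, 2, 7, 6)  # hypothetical detector output, shifted by one pixel

    pred_mask = np.zeros((8, 8), dtype=bool)
    pred_mask[2:6, 3:7] = True  # hypothetical segmentation output

    print("ground truth box:", gt_box)
    print("box IoU:", box_iou(gt_box, pred_box))
    print("DSC:", dice(pred_mask, label_map == LABELS["LVC"]))
```

In this toy case a one-pixel shift of a 4x4 box gives an IoU of 0.6 and a DSC of 0.75. The same shift costs proportionally more for a smaller or thinner structure, which is consistent with the lower MYO scores reported above.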
Then et al. (Sat,) studied this question.