AI-assisted, clinically validated annotations improved echocardiographic segmentation model performance, and validating labels was 14.7 times faster than re-annotating them manually from scratch.
Does an AI-assisted label refining approach improve the segmentation performance of echocardiography models compared to using all available manual labels?
An AI-assisted label refining approach significantly improves echocardiographic segmentation model performance and annotation efficiency compared to standard manual labeling.
Abstract

Introduction

Artificial intelligence (AI) has shown promising results in echocardiographic segmentation [1], yet its performance remains highly dependent on the quality of training data [2]. Precise, consistent annotations are essential to develop accurate models, particularly when targeting complex cardiac structures. While model architectures continue to evolve, improvements in label quality may offer a more direct path to performance gains.

Purpose

This study aims to evaluate how refined annotations affect the accuracy of segmentation models in echocardiography.

Methods

A total of 698 studies were retrospectively collected from 633 patients (mean age 64.2 ± 12.6 years; 60% male). Endocardial borders (atria and ventricles) and the left ventricular epicardium were manually delineated by a cohort of 21 sonographers (mean experience 5 ± 3 years) in the apical 4-chamber view at end-diastole and end-systole, yielding 1910 annotated frames. Each frame was then processed by an AI model to generate AI-based segmentations. Finally, a subgroup of expert sonographers conducted a blinded comparison of the manual and AI-generated labels, selecting the clinically superior annotation or rejecting both if neither was acceptable, resulting in a clinically validated set.

Two nnU-Net-based models [3] were trained: one using all available manual labels (Model A), and another using only the clinically validated set (Model B). Model performance was assessed using the Dice similarity coefficient across 3 independent test sets: (i) manual labels; (ii) clinically validated labels (manual and AI-generated); and (iii) a subset of the validated labels containing only manual annotations, to exclude potential bias from AI-generated labels (Figure 1).

Results

Model B outperformed Model A in every scenario (p < 0.001) and for every structure (Table 1). This highlights the strong influence of label quality on model performance, even when the quantity of training data is reduced by 12%.

The proposed refinement approach, which incorporates clinically validated AI-generated labels, proved not only effective but also highly efficient. Validating labels was 14.7 times faster than annotating them again from scratch, significantly reducing the time and effort required to produce high-quality training data. Both Model A and Model B performed better on the clinically validated test set, suggesting that current evaluations based on labels that have not been cross-validated may underestimate true model performance.

Conclusion

This study demonstrates that label quality has a significant impact on the performance of echocardiographic segmentation models. The proposed AI-assisted labelling pipeline not only improved performance but also offered a time-efficient alternative to full manual annotation. These findings highlight that improving AI tools is not solely a matter of model architecture, but of rethinking the entire ecosystem, where smarter workflows can yield smarter models.

Figure 1. Table 1.
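The two computational steps described in the Methods, building the clinically validated set from the reviewers' choices and scoring a segmentation with the Dice similarity coefficient, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `refine_labels` and `dice`, the `"manual"` / `"ai"` / `"reject"` choice strings, and the representation of masks as nested lists of 0/1 are all assumptions made for illustration, and the toy masks are not study data.

```python
from typing import Dict, List, Tuple

Mask = List[List[int]]  # hypothetical binary mask: 1 = structure, 0 = background


def dice(pred: Mask, ref: Mask) -> float:
    """Dice similarity coefficient 2|A∩B| / (|A| + |B|) for two binary masks."""
    flat_pred = [v for row in pred for v in row]
    flat_ref = [v for row in ref for v in row]
    if len(flat_pred) != len(flat_ref):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, r in zip(flat_pred, flat_ref) if p and r)
    total = sum(1 for p in flat_pred if p) + sum(1 for r in flat_ref if r)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def refine_labels(
    candidates: Dict[str, Dict[str, Mask]],
    choices: Dict[str, str],
) -> Dict[str, Tuple[str, Mask]]:
    """Keep, per frame, the annotation the reviewer preferred.

    candidates maps frame id -> {"manual": mask, "ai": mask}.
    choices maps frame id -> "manual", "ai", or "reject" (both unacceptable).
    Rejected frames are dropped, so the validated set can be smaller.
    """
    validated: Dict[str, Tuple[str, Mask]] = {}
    for frame_id, choice in choices.items():
        if choice == "reject":
            continue
        if choice not in ("manual", "ai"):
            raise ValueError(f"unknown choice for {frame_id}: {choice!r}")
        validated[frame_id] = (choice, candidates[frame_id][choice])
    return validated


# Toy example with made-up 2x3 masks.
manual_mask = [[0, 1, 1], [0, 1, 0]]
ai_mask = [[0, 1, 1], [0, 1, 1]]
candidates = {
    "frame_ed": {"manual": manual_mask, "ai": ai_mask},
    "frame_es": {"manual": manual_mask, "ai": ai_mask},
}
choices = {"frame_ed": "ai", "frame_es": "reject"}

validated = refine_labels(candidates, choices)
print(sorted(validated))                      # ['frame_ed']
print(round(dice(manual_mask, ai_mask), 3))   # 0.857
```

Recording the source of each kept label (`"manual"` or `"ai"`) alongside the mask is what would make it possible to build a manual-only subset of the validated labels, such as test set (iii) in the abstract.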
Reported by Garcia-Sineriz et al.