This study aimed to assess adherence to International Ovarian Tumor Analysis (IOTA) terminology in Danish routine clinical practice and to evaluate how non-adherence affects the diagnostic performance and calibration of the Assessment of Different NEoplasias in the adneXa (ADNEX) model and the Two-Step Strategy with modified benign descriptors (BD) with Cancer Antigen 125 (CA125). This prospective, multicenter cohort study included patients ≥18 years with adnexal masses across 14 gynecology departments and general gynecology practices. The reference standard was histopathology for surgically managed patients and clinical follow-up for conservatively managed patients. Examining clinicians with varying experience and IOTA certification recorded ultrasound descriptions at recruitment using IOTA terminology; three blinded IOTA-certified experts then reviewed these descriptions to identify deviations from IOTA definitions. Based on expert reassessment of stored representative images, obviously non-adherent terminology was corrected. Agreement in modified BD applicability was assessed using Cohen’s kappa (κ). Performance and calibration were compared using predictive values (10% threshold), area under the curve (AUC), prediction error, and calibration plots. Of 1,065 enrolled patients, 948 constituted the complete-case cohort. Non-adherence to IOTA terminology was identified in 198 (20.9%) clinician-recorded ultrasound descriptions. Agreement on the modified BD category was 90.0% (κ = 0.79). After correction, negative predictive values (NPVs) increased for both models (ADNEX: 95.7% to 96.8%; Two-Step: 95.4% to 96.9%), as did positive predictive values (PPVs) (ADNEX: 48.6% to 52.0%; Two-Step: 50.4% to 52.8%) and AUCs (ADNEX: 90.8% to 93.4%; Two-Step: 90.3% to 93.6%). Prediction error decreased, while overall calibration remained unchanged. Non-adherence to IOTA terminology is a potential barrier to successful implementation and highlights the need for strategies that promote consistent use of standardized IOTA terminology.
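The agreement and predictive-value measures named above can be illustrated with a minimal Python sketch. It is not the study's analysis code: the function names, the toy labels and risks, and the choice to treat a predicted risk at or above the 10% threshold as test-positive are all assumptions made for illustration.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: share of cases with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


def predictive_values(risks, outcomes, threshold=0.10):
    """Return (PPV, NPV); risk >= threshold counts as positive (assumed)."""
    tp = fp = tn = fn = 0
    for risk, malignant in zip(risks, outcomes):
        if risk >= threshold:
            if malignant:
                tp += 1
            else:
                fp += 1
        else:
            if malignant:
                fn += 1
            else:
                tn += 1
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


# Hypothetical toy data, not taken from the study.
clinician_bd = ["BD", "BD", "none", "none", "BD", "none"]
expert_bd = ["BD", "none", "none", "none", "BD", "none"]
toy_risks = [0.02, 0.05, 0.15, 0.40, 0.80, 0.08]
toy_outcomes = [0, 0, 0, 1, 1, 1]  # 1 = malignant per reference standard

kappa = cohen_kappa(clinician_bd, expert_bd)
ppv, npv = predictive_values(toy_risks, toy_outcomes)
print(f"kappa={kappa:.2f}, PPV={ppv:.2f}, NPV={npv:.2f}")
```

Applying the same two functions to the clinician-recorded and the expert-corrected descriptions would give the kind of before-and-after comparison reported above.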
Expert reassessment was based on stored still ultrasound images and may not fully capture dynamic features of real-time examination.
Karlsen et al. studied this question.