Approximately 25% of the variation in measurements made by community radiology can be attributed to error. Measurement discrepancies between community radiology and tertiary care may significantly affect the timely presentation of patients with adolescent idiopathic scoliosis (AIS). The objective of this study was to evaluate how effectively a machine learning (ML) model quantifies curve magnitude on community-acquired spine radiographs and, in turn, identifies AIS patients with moderate and severe deformity for triage. A retrospective review was conducted of AIS patients (n=116) at a tertiary-care paediatric hospital who had community-acquired spine radiographs. The reference standard was established on the community-acquired spine radiographs through independent measurements by two blinded raters (an orthopaedic spine specialist and a paediatric radiologist). The ML model was a two-step, segmentation-based deep learning architecture previously validated on 3-foot standing spine radiographs. Community radiology readings were retrieved from imaging reports. Cobb angle readings obtained by the ML model and by community radiologists were compared with the reference standard, and agreement was computed using the intraclass correlation coefficient (ICC). Brace and surgical candidates were identified from the reference standard using the corresponding Scoliosis Research Society management categories (>25° and >50°, respectively). Figure 1 illustrates the two-step deep learning architecture. First, the model generates a segmentation and a minimum bounding box for each vertebra. Second, landmarks are derived from the minimum bounding boxes and used by the model to calculate curve magnitude. On community-acquired spine radiographs, agreement in Cobb angle readings between the ML model and the reference standard was excellent (ICC=0.93, 95% CI 0.90–0.95; ICC=0.90, 95% CI 0.86–0.93) with acceptable precision (standard error of measurement, SEM=3.97°; SEM=4.79°).
By comparison, agreement between community radiologists and the reference standard was fair (ICC=0.68, 95% CI 0.51–0.79; ICC=0.67, 95% CI 0.54–0.76), with a greater margin of error (SEM=8.28°; SEM=8.85°). The ML model correctly identified 85.4% of brace candidates (n=49) and 83.9% of surgical candidates (n=31), compared with 63.2% and 67.7%, respectively, identified by community radiologists. Among the brace candidates that were missed, the ML model underestimated curve magnitude by 3.84° on average. Cobb angle readings obtained by the ML model were more reliable than manual measurements obtained by community radiologists. An ML model may have important clinical utility in enhancing measurements obtained in the community, which may expedite appropriate referrals to tertiary care. For any figures or tables, please contact the authors directly.
Hadi et al. studied this question.
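The agreement statistics and triage thresholds reported in the abstract can be illustrated with a minimal Python sketch. The abstract does not state which ICC form or SEM formula was used, so the code assumes ICC(2,1) (two-way random effects, absolute agreement, single measurement) and SEM = SD × sqrt(1 − ICC). It also assumes the brace (>25°) and surgical (>50°) categories are mutually exclusive. All function names are hypothetical, and the angles in the usage lines are made-up illustration values, not study data.

```python
import numpy as np


def icc_2_1(ratings):
    """ICC(2,1) for an array of shape (n_subjects, k_raters).

    Assumption: two-way random effects, absolute agreement, single
    measurement. The abstract does not say which ICC form was used.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()  # between subjects
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()  # between raters
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def sem(ratings, icc):
    """Standard error of measurement, assumed here to be SD * sqrt(1 - ICC)."""
    sd = np.asarray(ratings, dtype=float).std(ddof=1)  # SD over all readings
    return sd * np.sqrt(1.0 - icc)


def triage(cobb_deg):
    """Map a Cobb angle (degrees) to a management category.

    Thresholds (>25, >50) come from the abstract; treating the categories
    as mutually exclusive is an assumption of this sketch.
    """
    if cobb_deg > 50:
        return "surgical"
    if cobb_deg > 25:
        return "brace"
    return "observation"


def detection_rate(reference, reader, category):
    """Share of reference-standard candidates in `category` that the
    reader's own measurement places in the same category."""
    idx = [i for i, a in enumerate(reference) if triage(a) == category]
    if not idx:
        return float("nan")
    return sum(triage(reader[i]) == category for i in idx) / len(idx)


# Illustrative values only (not study data).
reference = [12, 28, 35, 52, 60, 24]
model = [14, 27, 33, 49, 62, 22]

pairs = np.column_stack([reference, model])
icc = icc_2_1(pairs)
print("ICC(2,1):", round(icc, 3), "SEM (deg):", round(sem(pairs, icc), 2))
print("brace detection:", detection_rate(reference, model, "brace"))
print("surgical detection:", detection_rate(reference, model, "surgical"))
```

In this toy example the reader measuring 49° against a reference of 52° is counted as a missed surgical candidate, which mirrors how a small underestimate near a threshold can change the triage category even when overall agreement is high.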