Abstract

Purpose
The primary aim was to compare the variability and repeatability of medial posterior tibial slope (MPTS) measurements on lateral knee radiographs between osteoarthritic (OA) and non‐OA knees, and the secondary aim was to assess their implications for clinical classification.

Methods
Lateral knee radiographs from a retrospective institutional database were reviewed, including patients with and without radiographic OA. Only radiographs fulfilling strict quality criteria for MPTS measurement were included. MPTS was measured by two observers at two time points using the method of Dejour and Bonnin. Inter‐ and intra‐rater reliability were assessed with intra‐class correlation coefficients (ICC), and agreement was evaluated with Bland–Altman analysis. Comparisons between groups were performed using Fisher's Z‐transformation. Misclassification across the 10° clinical threshold was analysed within and between raters.

Results
A total of 160 radiographs met the inclusion criteria, comprising 80 OA knees (mean age 73 ± 10 years, 45% female) and 80 non‐OA knees (mean age 39 ± 16 years, 48% female). Variability of MPTS was higher in OA knees (10.6° ± 5.1°; confidence interval [CI]: 9.8°–11.4°) than in non‐OA knees (9.3° ± 2.6°; CI: 8.9°–9.7°). ICCs were significantly lower in OA knees for both inter‐rater (0.79 vs. 0.89, p < 0.001) and intra‐rater reliability (0.92/0.84 vs. 0.99/0.96, p < 0.001). Bland–Altman analysis demonstrated wider limits of agreement in the OA group (–5.1° to 8.3°) versus controls (–3.9° to 2.3°). The two raters disagreed on whether MPTS exceeded 10° in 31.3% of OA knees and 16.3% of non‐OA knees, while intra‐rater disagreement between the first and second readings was 10.0% (OA) and 5.0% (non‐OA) for rater 1, and 13.8% and 3.8% for rater 2, respectively.

Conclusion
MPTS measurement on lateral radiographs showed increased variability and reduced repeatability in OA knees. Surgeons should exercise caution when applying fixed thresholds on radiographs of OA knees, as small variations may lead to misclassification and influence decision‐making.

Level of Evidence
Level III, diagnostic study.
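For readers unfamiliar with the agreement statistics named in the Methods, the following is a minimal sketch of how Bland–Altman limits of agreement and disagreement across a fixed slope threshold could be computed for paired readings. It is not the authors' analysis code. The function names (`bland_altman_limits`, `threshold_disagreement`) and the slope values are hypothetical and chosen only for illustration, and the conventional 1.96 × SD multiplier for the limits is an assumption, since the abstract does not state how the limits were derived.

```python
from statistics import mean, stdev


def bland_altman_limits(a, b):
    """Return (bias, lower, upper) for paired readings a and b.

    Bias is the mean of the paired differences; the limits are
    bias +/- 1.96 * SD of the differences (assumed convention).
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def threshold_disagreement(a, b, threshold=10.0):
    """Fraction of knees whose two readings fall on opposite sides
    of the threshold (one reading above it, the other not)."""
    flips = sum((x > threshold) != (y > threshold) for x, y in zip(a, b))
    return flips / len(a)


# Hypothetical slope readings in degrees (illustration only, not study data)
rater1 = [8.0, 9.5, 10.5, 12.0, 11.0, 7.0]
rater2 = [9.0, 10.5, 9.5, 12.5, 10.5, 7.5]

bias, lower, upper = bland_altman_limits(rater1, rater2)
rate = threshold_disagreement(rater1, rater2)
print(f"bias={bias:.2f}, limits=({lower:.2f}, {upper:.2f}), disagreement={rate:.1%}")
```

In this toy data every paired difference is 1° or less, yet two of the six knees change class at the 10° threshold, which is the kind of effect the Conclusion cautions about.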
Kayali et al. (Sun,) studied this question.