Abstract
Background: Endoscopic assessment using the modified Rutgeerts score (mRS) is the gold standard for detecting postoperative recurrence (POR) in Crohn’s disease (CD), yet concerns persist about its reproducibility, particularly for the i2a/i2b distinction. Limited reliability threatens clinical decision-making and the validity of endpoints in clinical trials. Structured training programmes have improved the reproducibility of other IBD endoscopic indices, but evidence in postoperative CD is scarce. We aimed to (1) assess inter-observer agreement of the mRS among expert readers, and (2) evaluate the impact of a structured training programme on agreement among non-experts.
Methods: Fifty-five anonymised ileocolonoscopy videos from patients undergoing postoperative surveillance for CD were independently scored using the six-level mRS (i0–i4) by 17 readers: 8 experts (therapeutic endoscopists and IBD specialists) and 9 non-experts (gastroenterology trainees). Non-experts scored the videos at baseline and again after a 90-minute structured virtual workshop focusing on anatomical recognition and differentiation of i2a vs i2b lesions. Inter-observer agreement (inter-rater agreement, IRA) was assessed using Fleiss’ kappa (κ) for the full mRS and for clinically relevant thresholds (≥i2, ≥i2b, ≥i3).
Results: Among experts, IRA for the six-level mRS was fair (κ = 0.34), with therapeutic endoscopists demonstrating higher agreement than IBD physicians (κ = 0.47 vs 0.32; p = 0.03). Agreement was higher for grouped thresholds (κ = 0.56 for ≥i2; κ = 0.49 for ≥i2b; κ = 0.43 for ≥i3). Non-experts showed limited baseline agreement (κ = 0.29 for full mRS; κ = 0.44 for ≥i2b; κ = 0.34 for ≥i3). After training, agreement improved significantly for ≥i2b (κ = 0.54; p = 0.02) and ≥i3 (κ = 0.50; p = 0.007), with non-experts reaching expert-level agreement for grouped thresholds (κ = 0.62 for ≥i2; κ = 0.54 for ≥i2b; κ = 0.50 for ≥i3).
Conclusion: Reproducibility of the full mRS remains limited, but agreement improves substantially when using grouped clinical thresholds. A brief, structured training programme significantly enhances non-expert performance and aligns it with expert-level reliability. These findings support incorporating standardised training and reader calibration into clinical practice, central reading programmes, and trials evaluating postoperative Crohn’s disease recurrence.
References:
1. Marteau P, Laharie D, Colombel JF, et al. Interobserver Variation Study of the Rutgeerts Score to Assess Endoscopic Recurrence after Surgery for Crohn’s Disease. J Crohns Colitis. 2016;10(9):1001-1005. doi:10.1093/ecco-jcc/jjw082
2. Rutgeerts P, Geboes K, Vantrappen G, Kerremans R, Coenegrachts JL, Coremans G. Natural history of recurrent Crohn’s disease at the ileocolonic anastomosis after curative surgery. Gut. 1984;25(6):665-672. doi:10.1136/gut.25.6.665
3. Rutgeerts P, Geboes K, Vantrappen G, Beyls J, Kerremans R, Hiele M. Predictability of the postoperative course of Crohn’s disease. Gastroenterology. 1990;99(4):956-963. doi:10.1016/0016-5085(90)90613-6
Conflict of interest:
Alonso Lázaro, Noelia: No conflict of interest
García García, Sonia: No conflict of interest
Aguas, Peris:
Bastida Paz, Guillermo: No conflict of interest
Iborra, Marisa: No conflict of interest
Mínguez, Alejandro: No conflict of interest
Garrido Marín, Alejandro: No conflict of interest
Bustamante Balén, Marco: No conflict of interest
Argüello Viúdez, Lidia: No conflict of interest
Pons Beltrán, Vicente: No conflict of interest
Nos Mateu, Pilar: No conflict of interest
Lázaro et al. (Thu,) studied this question.
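The agreement analysis described in the Methods (Fleiss’ kappa on the six-level mRS, then on scores dichotomised at a clinical threshold) can be illustrated with a minimal sketch. The abstract does not state which software was used, so this is not the authors' implementation: the function names `fleiss_kappa` and `dichotomise` are hypothetical, the ≥i2 threshold is assumed to mean i2a or higher, and the toy ratings are invented for illustration and are not study data. The sketch also does not reproduce the significance tests behind the reported p-values.

```python
from collections import Counter

# Six ordered levels of the modified Rutgeerts score, as listed in the Methods.
MRS_LEVELS = ["i0", "i1", "i2a", "i2b", "i3", "i4"]


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of videos, each a list with one label per reader.

    Every video must be scored by the same number of readers.
    """
    if not ratings:
        raise ValueError("no ratings supplied")
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(row) != n_raters for row in ratings):
        raise ValueError("every video needs the same number (>= 2) of readers")

    counts = [Counter(row) for row in ratings]

    # Observed agreement: proportion of agreeing reader pairs, averaged over videos.
    p_i = [
        (sum(c[k] ** 2 for k in categories) - n_raters) / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_i) / n_subjects

    # Chance agreement: sum of squared overall category proportions.
    total = n_subjects * n_raters
    p_e = sum((sum(c[k] for c in counts) / total) ** 2 for k in categories)

    if p_e == 1:
        # Only one category was ever used; kappa is undefined.
        return float("nan")
    return (p_bar - p_e) / (1 - p_e)


def dichotomise(ratings, threshold):
    """Map each mRS label to True if it is at or above `threshold`."""
    cut = MRS_LEVELS.index(threshold)
    return [[MRS_LEVELS.index(score) >= cut for score in video] for video in ratings]


if __name__ == "__main__":
    # Invented toy data: 5 videos scored by 4 readers (NOT study data).
    toy = [
        ["i0", "i0", "i1", "i0"],
        ["i2a", "i2b", "i2a", "i2b"],
        ["i3", "i3", "i4", "i3"],
        ["i1", "i2a", "i1", "i1"],
        ["i4", "i4", "i4", "i3"],
    ]
    print("full mRS:", round(fleiss_kappa(toy, MRS_LEVELS), 2))
    # Assumption: ">= i2" corresponds to i2a or higher.
    for threshold in ("i2a", "i2b", "i3"):
        binary = dichotomise(toy, threshold)
        print(f">={threshold}:", round(fleiss_kappa(binary, [False, True]), 2))
```

Collapsing six levels into two removes disagreements between adjacent grades on the same side of the cut (for example i3 vs i4), which is one plausible reason grouped thresholds show higher κ than the full score.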