Pneumonia is common following transthoracic esophagectomy (TTE). The diagnosis in the early postoperative period is partly subjective and may vary among clinicians, with sensitivity reduced by postoperative inflammatory changes and altered thoracic anatomy. Despite its clinical importance, inter-observer agreement in diagnosing post-esophagectomy pneumonia has not been systematically evaluated. This retrospective study evaluated inter-observer variability among four senior specialists (surgeon, radiologist, intensivist, and pulmonologist) in diagnosing pneumonia from chest radiographs of 200 consecutive TTE patients. On a web-based platform, blinded reviewers independently assessed anonymized chest radiographs from postoperative days 3 and 7 on a three-class scale (yes/no/maybe). When 'maybe' was selected, standardized clinical data (vital signs at 23:00 hours) were automatically provided. Cohen's kappa coefficient quantified pairwise agreement, while Fleiss' kappa assessed overall concordance. Of the 200 patients, pneumonia was documented in 54 (27%) per American Thoracic Society (ATS) criteria. Initial radiographic assessment showed fair inter-observer agreement (κ = 0.207–0.230) when each reviewer was compared with the radiologist reference. Diagnostic uncertainty ('maybe' responses) occurred in 307/1600 assessments (19.2%), varying significantly by specialty: surgeon, 41.5%; radiologist, 22.0%; intensivist, 13.0%; pulmonologist, 14.0% (P < 0.001). After clinical correlation, agreement improved modestly: surgeon κ = 0.334, intensivist κ = 0.398, pulmonologist κ = 0.356 (all P < 0.001), but remained in the 'fair' range. Overall multi-rater agreement (Fleiss' κ) improved from 0.270 to 0.421 (a 56% relative increase, P < 0.001), transitioning from fair to moderate agreement.
When clinical data were made available for equivocal cases, 124 'maybe' responses (40.4%) were changed to 'yes', indicating a preference for therapy, while 89 (29.0%) were changed to 'no'. Considerable inter-observer variability exists in pneumonia diagnosis after TTE, with 'fair' inter-rater agreement in documenting radiologic pneumonia and 'poor' consistency in determining antibiotic use according to the ATS criteria. Current pneumonia diagnostic criteria are fundamentally limited by poor radiographic inter-observer agreement, which indicates a timely need for standardized definitions and terminology and supports the development of integrated diagnostic protocols.
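As an illustration of the two agreement statistics named above, the following is a minimal Python sketch of Cohen's kappa (two raters) and Fleiss' kappa (several raters), using the standard textbook formulas. It is not the study's analysis code. The function names, the toy ratings, and the use of the Landis and Koch bands to label kappa values are assumptions made for this example; the abstract's 'fair' and 'moderate' labels appear consistent with those bands, but the source does not say which scale was used.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item set")
    n = len(rater_a)
    # Observed proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


def fleiss_kappa(ratings):
    """Fleiss' kappa; `ratings` holds one list of labels per item,
    each with the same number of raters (at least two)."""
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")
    label_totals = Counter()
    agreement_sum = 0.0
    for item in ratings:
        counts = Counter(item)
        label_totals.update(counts)
        # Share of agreeing rater pairs for this item.
        agreement_sum += (sum(c * c for c in counts.values()) - n_raters) / (
            n_raters * (n_raters - 1)
        )
    p_bar = agreement_sum / n_items
    total = n_items * n_raters
    p_e = sum((t / total) ** 2 for t in label_totals.values())
    if p_e == 1.0:
        return 1.0  # only one label was ever used
    return (p_bar - p_e) / (1.0 - p_e)


def landis_koch_label(kappa):
    """Assumed interpretation bands (Landis and Koch)."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Toy data, invented for illustration only (not study data).
toy_a = ["yes", "yes", "no", "no"]
toy_b = ["yes", "no", "no", "no"]
toy_multi = [["yes"] * 4, ["no"] * 4, ["yes", "yes", "no", "no"]]

pairwise = cohen_kappa(toy_a, toy_b)   # 0.5 on the toy data
overall = fleiss_kappa(toy_multi)      # 5/9 on the toy data
print(pairwise, landis_koch_label(pairwise))
print(round(overall, 3), landis_koch_label(overall))
```

Applied to the reported values, `landis_koch_label(0.270)` gives 'fair' and `landis_koch_label(0.421)` gives 'moderate', and the relative change (0.421 − 0.270) / 0.270 is about 56%, matching the figures in the abstract.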
Duff et al. (Fri,) studied this question.