Borish et al. provided an excellent review highlighting methodological variability among the six pivotal Phase III trials [1]. One critical yet overlooked issue is the inconsistent application of the endoscopic nasal polyp score (NPS), a primary outcome measure in all trials. Numerous grading systems have been developed to assess nasal polyposis. Attempts to capture the three-dimensional bulk of nasal polyposis have proven impractical due to poor examiner consistency and time demands [2]. Consequently, unidimensional grading scales remain the most widely used. However, because they cannot directly measure polyp volume, these scales are nonlinear, so a one-grade reduction does not represent an equal change across grades [3]. This also has therapeutic significance: treatment response varies across grades, as shown by the improved prediction of outcome measures when patients are stratified by baseline NPS [4]. Furthermore, the small dynamic range of possible polyp scores increases the likelihood of “floor” and “ceiling” effects, as described by Djupesland et al., limiting the ability to detect further change at the extremes of disease severity [3]. In 2019, experts from the European Academy of Allergy and Clinical Immunology (EAACI) proposed the total NPS as a universally accepted reference standard to improve interpretability across trials and enhance the overall quality of research [5]. Despite these efforts, the correlation between polyp scoring and symptom severity, quality of life, and olfactory function remains weak. A meta-analysis of 55 studies showed that nasal polyp grading systems do not reliably correlate with patient-reported outcome measures (PROMs) or olfactory outcomes [6]. Djupesland et al. suggested reporting polyp grades both unilaterally and in total to reduce floor and ceiling effects in the reported results [3].
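The case for reporting both unilateral and total scores can be illustrated with a minimal sketch. It assumes each side is graded on a 0 to 4 scale and that the total NPS is the sum of the two sides; the names `PolypScore`, `MAX_SIDE`, and `sides_at_limit` are hypothetical and do not come from any trial protocol or from the EAACI paper. The example scores are invented purely for illustration.

```python
from dataclasses import dataclass

MAX_SIDE = 4  # assumed per-side maximum on a 0-4 grading scale


@dataclass(frozen=True)
class PolypScore:
    """Unilateral polyp grades for one patient (hypothetical structure)."""
    right: int
    left: int

    def __post_init__(self):
        # Reject grades outside the assumed 0..MAX_SIDE range.
        for side in (self.right, self.left):
            if not 0 <= side <= MAX_SIDE:
                raise ValueError(f"per-side score must be 0..{MAX_SIDE}")

    @property
    def total(self) -> int:
        # Total NPS taken as the sum of both sides (assumed range 0..8).
        return self.right + self.left


def sides_at_limit(score: PolypScore) -> dict:
    """Return the sides sitting at the floor (0) or ceiling (MAX_SIDE)."""
    limits = {}
    for name, value in (("right", score.right), ("left", score.left)):
        if value == 0:
            limits[name] = "floor"
        elif value == MAX_SIDE:
            limits[name] = "ceiling"
    return limits


# Two invented patients with the same total but different unilateral patterns.
asymmetric = PolypScore(right=4, left=0)
symmetric = PolypScore(right=2, left=2)
print(asymmetric.total, sides_at_limit(asymmetric))
print(symmetric.total, sides_at_limit(symmetric))
```

Both invented patients share the same total, yet only the asymmetric one has a side that cannot register further worsening and a side that cannot register further improvement, which is the information a total-only report would hide.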
Nasal polyp size is also not the primary outcome of concern to surgeons; contemporary sinus surgery takes polyps from a 4 to a 0 with excellent predictability. Focusing solely on polyp size also misses many of the nuances of endoscopic evaluation (e.g., generalized edema, olfactory cleft edema, mucin) that may correlate better with patient symptomatology. Scoring systems utilizing these “soft” findings have been proposed but can be cumbersome and introduce additional potential variability. Ultimately, NPS remains a primary outcome for clinical trials, and will remain so because of the precedent set in early clinical trials and the requirements of licensing bodies. The consequence may be that even when statistically significant NPS reductions are reported, they are not equivalent across trials, making indirect drug comparisons unreliable. The EAACI position paper on nasal polyp scoring has done an excellent job of attempting to create uniformity between trials by simplifying scoring, stating that polyposis cannot be scored a 3 without also meeting the criteria for a polyp score of 2 [5]. Addressing the variability in the initial Phase III clinical trials by retrospectively applying these grading criteria is likely not feasible without access to raw data. Future trials may be enhanced by emerging volumetric methods, including CT-based segmentation and artificial intelligence tools, which could overcome the limitations of unidimensional scales.
The authors gratefully acknowledge the contributions of Claire Hopkins, whose careful review and constructive feedback during the proofreading process greatly improved the background, clarity, and quality of this manuscript.
The authors have nothing to report.
Leigh J Sowerby: advisory board (GSK, Sanofi, AstraZeneca); honorarium (GSK, Sanofi, AstraZeneca); research support (GSK, Sanofi, Roche, AstraZeneca, Insomed, Eli Lilly). Joseph Khristian Han: research consultant (GSK, Sanofi, Regeneron, AstraZeneca, Optinose). All other authors declare no conflicts of interest.