Systematic review and random-effects meta-analysis of PHQ-9 factor structure, measurement invariance, and internal consistency.

# Factor structure and measurement invariance of the Patient Health Questionnaire-9 (PHQ-9): a systematic review and random-effects meta-analysis

## Background and rationale

The Patient Health Questionnaire-9 (PHQ-9) is a widely used 9-item self-report depression instrument. Despite its ubiquity in primary care, epidemiologic surveys, digital health, and implementation research, its latent factor structure remains contested across one-factor (1F) general-depression, two-factor (2F) correlated cognitive-affective vs. somatic, and hierarchical or bifactor specifications. These disagreements affect score interpretation, the defensibility of subscale reporting, measurement invariance claims, and cross-population/language comparability of PHQ-9 totals. No recent transparent meta-analysis pools field-wide fit and reliability statistics or formally tests whether different model classes diverge.

## Objectives

We aim to (1) pool five widely reported fit and reliability indices (CFI, TLI, RMSEA, SRMR, Cronbach's α) for the PHQ-9 from a conservatively curated structural-validation evidence base, (2) test whether one-factor and two-factor specifications diverge using Cochran's Q-between and a Wald-equivalent meta-regression with model class as a binary moderator, and (3) apply a COSMIN-inspired abstract-level methodological-quality screen with a sensitivity re-pool that excludes Inadequate-rated studies.

## Eligibility criteria (PICOS)

- **Population:** Adult populations (general or clinical) administered the standard 9-item PHQ.
- **Intervention/Exposure:** Not applicable (psychometric / structural validity review).
- **Comparator:** Across model classes (1F vs. 2F correlated vs. hierarchical/bifactor).
- **Outcomes:** Confirmatory factor analysis fit indices (CFI, TLI, RMSEA, SRMR) and internal consistency (Cronbach's α).
- **Study designs:** Original empirical structural-validation studies reporting factor-analytic or psychometric evidence on the standard 9-item PHQ.
- **Time horizon:** Records published from January 1, 2010 onward.
- **Language:** English-language reports (with Korean and Spanish reports retained where verifiable abstracts/full texts are obtainable).

## Search strategy

**v1.5 baseline (already executed):** Single-database PubMed search (Jan 1, 2010 to April 17, 2026) using free-text terms targeting "PHQ-9", "Patient Health Questionnaire-9", and factor-analytic/psychometric concepts; no MeSH-controlled vocabulary; no second reviewer; no grey-literature search; no trial-registry query. Yield: 584 records.

**v2 multi-database expansion (planned, library-pending):** PRISMA-S-compliant search of Embase, OVID PsycINFO, CINAHL (EBSCO), Education Source, APA PsycNet, Cochrane CENTRAL, LILACS, Scopus, and regional indexes (KoreaMed, SciELO). Library-mediated database access requested (2026-04-18 to 2026-04-27).

**Phase 1·5 supplementary scoping cross-check (already executed):** Preprint-repository searches across bioRxiv/medRxiv and supplementary academic indexes. Yielded 18 additional candidates → 9 PubMed-verified PMID lock-ins (Kliem 2024 PMID 39726913; Doi 2018 PMID 30024876; Tibubos/Beutel 2021 PMID 33952234; Rosario 2023 PMID 36865076; Rahman/Mehareen 2022 PMID 35675375; Shin 2020 PMID 32354339; Vu 2022 PMID 35990070; Hall 2020 PMID 33248710 paywall, ILL pending; Lamela 2020 PMID 32697702 paywall, ILL pending).

## Screening and data extraction

**v1.5 baseline (executed):** Single-reviewer screening at title/abstract level (no inter-rater reliability statistics). Round 1 (N=584): INCLUDE 369 / BORDERLINE 121 / EXCLUDE 94. Round 2 (N=369): INCLUDEMETA 47 / INCLUDESEED 16 / INCLUDEQUAL 157 / EXCLUDE 149. Final extraction set: 67 records (post seed-merge).
Manuscript-stage curation (subjective, manual judgment based on direct relevance to PHQ-9 latent dimensional structure) retained 33 PHQ-9-focused structural-validation studies (the curated PMID list).

**v2 (planned):** Dual-reviewer formal screening with full inter-rater reliability statistics (Cohen's κ); discrepancies resolved by consensus or third-reviewer adjudication.

**Mega-sample exclusion rule:** Two studies with N>100,000 (Flores-Cohaila 2026, PMID 41619940, N=318,681 administrative-registry; Nouwen 2021, PMID 34139403, N=159,801 analyst-summed multi-cohort aggregate) were excluded from the primary REML pool because the vᵢ = 1/N sampling-variance approximation has no coherent sampling interpretation in either case.

**Important transparency note:** This rule was finalised during analysis rather than pre-specified before the search, and should be read as a transparent post-hoc analytic decision.

## Quantitative synthesis

Studies with N ≥ 17 and at least one extractable fit or reliability statistic are eligible for pooling (the N ≥ 17 floor is a pragmatic cutoff chosen to keep the Fisher-z α transformation numerically stable at vᵢ = 1/(N−3); flagged as a reviewer-noted arbitrary threshold rather than a principled power analysis). Random-effects meta-analyses are fitted using REML estimation:

- **CFI and TLI:** direct scale; sampling variance approximated as vᵢ = 1/N.
- **RMSEA and SRMR:** direct scale; vᵢ = 1/N.
- **Cronbach's α:** Fisher-z transform; vᵢ = 1/(N−3); back-transformed via tanh.

τ² estimated via REML. Pooled 95% CIs computed using standard normal-approximation Wald intervals. Heterogeneity quantified using Cochran's Q and I².
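
The pooling rules above can be illustrated with a minimal Python sketch. It is not the registered analysis code (the registration does not name the software used). The function names, the fixed-point form of the REML update for τ², and all input numbers are assumptions made for illustration only.

```python
import math

Z_975 = 1.959963984540054  # standard normal 97.5th percentile for Wald 95% CIs


def reml_tau2(y, v, tol=1e-10, max_iter=500):
    """Estimate the between-study variance tau^2 by a REML fixed-point iteration."""
    tau2 = 0.0
    for _ in range(max_iter):
        w = [1.0 / (vi + tau2) for vi in v]
        sw = sum(w)
        mu = sum(wi * yi for wi, yi in zip(w, y)) / sw
        num = sum(wi ** 2 * ((yi - mu) ** 2 - vi) for wi, yi, vi in zip(w, y, v))
        new = max(0.0, num / sum(wi ** 2 for wi in w) + 1.0 / sw)  # truncate at zero
        if abs(new - tau2) < tol:
            return new
        tau2 = new
    return tau2


def pool_random_effects(y, v):
    """Random-effects pooled estimate, Wald 95% CI, Cochran's Q, and I^2."""
    k = len(y)
    tau2 = reml_tau2(y, v)
    w = [1.0 / (vi + tau2) for vi in v]
    est = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    se = math.sqrt(1.0 / sum(w))
    # Cochran's Q is computed with fixed-effect (inverse-variance) weights.
    w_fe = [1.0 / vi for vi in v]
    mu_fe = sum(wi * yi for wi, yi in zip(w_fe, y)) / sum(w_fe)
    q = sum(wi * (yi - mu_fe) ** 2 for wi, yi in zip(w_fe, y))
    i2 = max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0
    return {"estimate": est, "ci_low": est - Z_975 * se, "ci_high": est + Z_975 * se,
            "tau2": tau2, "Q": q, "I2": i2, "k": k}


def pool_direct(values, ns):
    """CFI, TLI, RMSEA, SRMR: pooled on the direct scale with v_i = 1/N."""
    return pool_random_effects(values, [1.0 / n for n in ns])


def pool_alpha(alphas, ns):
    """Cronbach's alpha: Fisher-z transform, v_i = 1/(N - 3), tanh back-transform."""
    res = pool_random_effects([math.atanh(a) for a in alphas],
                              [1.0 / (n - 3) for n in ns])
    for key in ("estimate", "ci_low", "ci_high"):
        res[key] = math.tanh(res[key])
    return res


# Hypothetical inputs, not extracted from any included study.
cfi = pool_direct([0.95, 0.97, 0.99], [400, 650, 1200])
alpha = pool_alpha([0.80, 0.85, 0.89], [400, 650, 1200])
print(round(cfi["estimate"], 3), round(alpha["estimate"], 3))
```

Because the α interval is built on the Fisher-z scale and then passed through tanh, the back-transformed limits are generally not symmetric around the pooled α.
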
The conventional acceptable ranges used for narrative interpretation (CFI/TLI ≥0·95, RMSEA ≤0·06, SRMR ≤0·08) follow the widely cited Hu (ii) Cochran's Q-between with a Wald-equivalent fixed-effect meta-regression treating model class as a binary moderator (1F=0, 2F=1; α analysed on Fisher-z scale); (iii) leave-one-out sensitivity recomputation for each of the five primary indices.

## Risk of bias

**v1.5 baseline:** No full-text-based risk-of-bias appraisal (COSMIN full checklist, ROBIS, or AXIS). Four abstract-level COSMIN-inspired signals (sample-size adequacy, model-structure disclosure, fit-index reporting, reliability reporting) applied to all 33 curated studies, with a sensitivity re-pool excluding Inadequate-rated studies.

**v2 (planned):** Full COSMIN Risk of Bias checklist for measurement properties, applied to full-text records by two independent reviewers.

## Pre-existing analytic baseline (v1.5, already completed)

This registration documents an **upgrade** of an existing v1.5 rapid review to a fully PRISMA-S-compliant, multi-database, dual-reviewer systematic review (v2). The v1.5 rapid review is preserved as a separate transparency artifact and will be cited in the v2 manuscript.

**All v1.5 numerical results below are pre-existing and fixed at the time of this registration:**

| Index | Pooled estimate (95% CI) | k | I² |
|---|---|---|---|
| CFI | 0·966 (0·949 to 0·983) | 9 | 87·0% |
| TLI | 0·957 (0·925 to 0·989) | 4 | 67·0% |
| RMSEA | 0·066 (0·050 to 0·081) | 11 | 82·5% |
| SRMR | 0·041 (0·031 to 0·051) | 5 | 0·0% |
| Cronbach's α | 0·834 (0·742 to 0·895) | 6 | 99·4% |

A Phase 2 pre-registration simulation with 9 additional PubMed-verified candidates yielded a TLI shift of −0·021 (RMSEA +0·007, α +0·017), driven primarily by the Kliem 2024 outlier (RMSEA = 0·17). The qualitative conclusion that "the PHQ-9 is structurally credible across adult populations" is robust across the v1.5 baseline and the Phase 2 simulation pool.
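
The model-class comparison and the leave-one-out recomputation listed under Quantitative synthesis can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the registered analysis code: the function names are hypothetical, the inputs are invented, and the leave-one-out helper defaults to a fixed-effect pool purely to keep the example short (any pooling function that returns an estimate first can be passed in).

```python
import math


def fixed_effect(y, v):
    """Inverse-variance fixed-effect pooled estimate and its variance."""
    w = [1.0 / vi for vi in v]
    sw = sum(w)
    return sum(wi * yi for wi, yi in zip(w, y)) / sw, 1.0 / sw


def q_between(y0, v0, y1, v1):
    """Cochran's Q-between for two subgroups (df = 1) with its chi-square p-value."""
    (m0, s0), (m1, s1) = fixed_effect(y0, v0), fixed_effect(y1, v1)
    w0, w1 = 1.0 / s0, 1.0 / s1
    grand = (w0 * m0 + w1 * m1) / (w0 + w1)
    q = w0 * (m0 - grand) ** 2 + w1 * (m1 - grand) ** 2
    p = math.erfc(math.sqrt(q / 2.0))  # upper tail of chi-square with 1 df
    return q, p


def wald_moderator(y0, v0, y1, v1):
    """Wald z for the slope of a binary moderator (1F = 0, 2F = 1), fixed-effect."""
    (m0, s0), (m1, s1) = fixed_effect(y0, v0), fixed_effect(y1, v1)
    return (m1 - m0) / math.sqrt(s0 + s1)


def leave_one_out(y, v, pool=fixed_effect):
    """Re-pool k times, omitting one study each time; returns the k estimates."""
    return [pool(y[:i] + y[i + 1:], v[:i] + v[i + 1:])[0] for i in range(len(y))]


# Hypothetical values, not extracted from any included study.
cfi_1f, v_1f = [0.95, 0.96], [1.0 / 400, 1.0 / 600]
cfi_2f, v_2f = [0.97, 0.98], [1.0 / 500, 1.0 / 500]
q, p = q_between(cfi_1f, v_1f, cfi_2f, v_2f)
z = wald_moderator(cfi_1f, v_1f, cfi_2f, v_2f)
print(q, z ** 2, p)  # Q-between and z squared coincide up to rounding

rmsea = [0.05, 0.06, 0.17]
print(leave_one_out(rmsea, [1.0 / 500] * 3))
```

With only two model classes, Q-between on 1 degree of freedom equals the squared Wald z of the moderator slope, which is the sense in which the two tests are equivalent. The leave-one-out list shows how far the pooled value moves when a single extreme record is dropped.
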
## Anticipated timeline

- **2026-04-30:** OSF Open-Ended Registration submitted (this registration).
- **2026-05-01 onward:** Library-mediated multi-database search execution (Embase, PsycINFO, CINAHL, Cochrane CENTRAL, LILACS, Scopus, KoreaMed, SciELO).
- **2026-05–06:** Dual-reviewer screening (Cohen's κ ≥ 0·70 target).
- **2026-06–07:** Full-text COSMIN Risk of Bias checklist application.
- **2026-07-31:** Anticipated v2 manuscript submission to peer-reviewed journal (target: Lancet Psychiatry; alternative targets: BMJ Open, Journal of Affective Disorders).

## Funding and conflicts of interest

**Funding:** None. This systematic review is conducted without dedicated external funding. The corresponding author's institutional affiliation provides standard library and database access.

**Conflicts of interest:** The reviewers declare no competing interests. None of the authors of the curated primary studies is a member of the review team.

## Data availability

Data and supporting materials will be made available on OSF following peer-review submission.

## Provisional author list (transparency)

The full provisional author list comprises seven contributors, in the following order: (1) Jung Moses Koo (Oxford Department of International Development, Wolfson College, Oxford University, UK); (2) Yong-Tae Kwak (Department of Neurology, Yong-In Hyo-Ja Hospital, Gyunggi-Do, South Korea); (3) Sun-Hyun Kim (Department of Family Medicine, Catholic Kwandong University International St. Mary's Hospital, Incheon, South Korea); (4) Jin-Yong Jun (Department of Psychiatry, College of Medicine, Ulsan University, Ulsan, South Korea); (5) Eunju Kim (Delaware Department of Health and Social Services, Delaware Psychiatric Center, Delaware, USA); (6) Min-Seong Koo (Department of Psychiatry, Catholic Kwandong University International St. Mary's Hospital, Incheon, & College of Medicine, Catholic Kwandong University, Gangwon, South Korea; corresponding author and guarantor).
The initial OSF registration is filed by three confirmed contributors (Jung Moses Koo, Yong-Tae Kwak, Min-Seong Koo) who jointly lead the PRISMA review process. The remaining provisional contributors (Sun-Hyun Kim, Jin-Yong Jun, Eunju Kim) will be added as OSF Contributors after individual confirmation.

## Registration history an
Min Seong Koo
Catholic Kwandong University
Yong Tae Kwak
Hyoja Geriatric Hospital
Jung Moses Koo
University of Oxford
Koo et al. (Thu,) studied this question.
synapsesocial.com/papers/69f9895b15588823dae184bc — DOI: https://doi.org/10.17605/osf.io/z9q2x