Within-subject scores for depression screening instruments including BDI, HADS, and PHQ-9 showed excellent agreement over periods of up to two weeks, with pooled intraclass correlations ranging from 0.84 to 0.89.
Meta-Analysis
What is the within-individual variation of measured depression symptoms using standard screening instruments over time?
Depression screening instruments (BDI, HADS, PHQ-9) show excellent short-term within-subject reliability, but data are lacking on long-term variability and on patients with clinical depression.
Effect estimate (pooled ICC for PHQ-9): 0.84 (95% CI 0.81-0.88)
Depression screening instruments are commonly used to assess the presence and severity of depression symptoms. However, there is little information on how screening scores vary over time within the same individual. A systematic review and meta-analysis was performed of studies reporting the within-subject variability of the Beck Depression Inventory (BDI), Hospital Anxiety and Depression Scale (HADS), Patient Health Questionnaire-9 (PHQ-9) and Patient Health Questionnaire-2 (PHQ-2). Multiple databases were searched from inception to 14th July 2023. Title and abstract screening was performed in duplicate; full-text screening and data extraction were performed by one reviewer and verified by a second. Risk of bias was assessed with a modified COSMIN tool. Of 2798 titles and abstracts and 157 full-text articles screened, 41 met the inclusion criteria. No studies were conducted in patients with depression, and most had only two measurements, taken less than two weeks apart. The pooled estimates of the intraclass correlations (ICCs) and 95% confidence intervals for BDI, HADS, PHQ-9 and PHQ-2 were 0.89 (0.84-0.93), 0.89 (0.85-0.93), 0.84 (0.81-0.88) and 0.75 (0.56-0.93), respectively. Studies in healthy subgroups showed lower ICCs than those in subgroups with physical illness. Assessment of variability is not the main aim of most papers assessing depression measurement instruments, so it is possible that some relevant papers have been missed. Within-subject scores for BDI, HADS and PHQ-9 show generally excellent agreement over periods of up to two weeks. However, published data on within-subject variability are lacking over longer time periods and for patients with depression.
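To make the relative precision of the four pooled estimates easier to compare, the reported 95% confidence intervals can be converted to approximate standard errors. The following is a minimal sketch, not the authors' analysis: it assumes the intervals are symmetric normal-approximation (Wald) intervals on the raw ICC scale, which the abstract does not state, and the names `POOLED`, `se_from_ci` and `ci_summary` are hypothetical helpers introduced only for illustration.

```python
# Pooled ICCs and 95% CIs as reported in the abstract: (estimate, lower, upper).
POOLED = {
    "BDI": (0.89, 0.84, 0.93),
    "HADS": (0.89, 0.85, 0.93),
    "PHQ-9": (0.84, 0.81, 0.88),
    "PHQ-2": (0.75, 0.56, 0.93),
}

# Two-sided 95% critical value of the standard normal distribution.
Z_95 = 1.96


def se_from_ci(lower, upper, z=Z_95):
    """Approximate standard error implied by a symmetric Wald interval.

    Assumption: the interval was built as estimate +/- z * SE on the raw
    ICC scale. If the authors pooled on a transformed scale, this is only
    a rough approximation.
    """
    return (upper - lower) / (2 * z)


def ci_summary(pooled=POOLED):
    """Return the CI width and implied standard error for each instrument."""
    summary = {}
    for name, (_estimate, lower, upper) in pooled.items():
        summary[name] = {
            "width": upper - lower,
            "se": se_from_ci(lower, upper),
        }
    return summary


if __name__ == "__main__":
    for name, stats in ci_summary().items():
        print(f"{name}: CI width {stats['width']:.2f}, implied SE {stats['se']:.3f}")
```

Under this assumption, the PHQ-2 interval (0.56-0.93) implies a standard error several times larger than those of the other three instruments, which is consistent with the abstract's conclusion naming only BDI, HADS and PHQ-9 as showing excellent agreement.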
Gough et al. conducted a meta-analysis of depression symptom measurement. Depression screening instruments (PHQ-9, PHQ-2, BDI, HADS) were evaluated on the pooled intraclass correlation (ICC); for PHQ-9, the pooled ICC was 0.84 (95% CI 0.81-0.88).