Original article
Detecting and monitoring depression with a two-item questionnaire (PHQ-2)

https://doi.org/10.1016/j.jpsychores.2004.09.006

Abstract

Objective

This study evaluates the two-item Patient Health Questionnaire (PHQ-2) as a measure for diagnosing and monitoring depression.

Methods

We assessed construct validity in a cross-sectional sample of 1619 medical outpatients (mean age 43±14 years, 64% female) by comparing the PHQ-2 to four longer self-report questionnaires. Criterion validity was established in a subsample of 520 participants with reference to the Structured Clinical Interview for DSM-IV (SCID). Sensitivity to change was investigated in a prospective study of 167 patients who completed the SCID both at baseline and the 1-year follow-up.

Results

With reference to the SCID, the PHQ-2 had a sensitivity of 87% and a specificity of 78% for major depressive disorder and a sensitivity of 79% and a specificity of 86% for any depressive disorder. Its diagnostic performance was comparable with that of longer depression scales. PHQ-2 change scores accurately reflected improved, unchanged, and deteriorated depression outcomes.
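As a rough illustration of how figures like these are derived and read, the following Python sketch computes sensitivity and specificity from a 2x2 table against a reference standard, then converts a sensitivity/specificity pair into a positive predictive value. The function names and the example counts are hypothetical and not taken from the study. The prevalence of 71/520 is the SCID-based rate of major depressive disorder reported under Patient characteristics; the resulting predictive value is an illustrative calculation, not a result reported by the authors.

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Return (sensitivity, specificity) from 2x2 counts.

    tp/fn: reference-positive cases the screener flagged / missed.
    fp/tn: reference-negative cases the screener flagged / cleared.
    """
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one reference-positive and one reference-negative case")
    return tp / (tp + fn), tn / (tn + fp)


def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a screen-positive patient truly has the disorder (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical counts, for illustration only (not study data).
sens, spec = sensitivity_specificity(tp=8, fn=2, fp=3, tn=7)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")

# Reported values for major depressive disorder: sensitivity 87%, specificity 78%.
# Prevalence 71/520 comes from the SCID subsample described later in the article.
ppv = positive_predictive_value(0.87, 0.78, 71 / 520)
print(f"illustrative PPV={ppv:.2f}")
```

The second calculation shows why a brief screener with good sensitivity still needs diagnostic follow-up: at a prevalence of this order, a sizeable share of positive screens are false positives.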

Conclusion

The PHQ-2 performed favorably with respect to a standard diagnostic interview, as well as established depression scales, and proved sensitive to change. Thus, the PHQ-2 appears promising as a brief multipurpose measure for detecting depression, grading its severity, and monitoring outcomes over time.

Introduction

With the lifetime prevalence of major depressive disorder being as high as 16% and appropriate treatment rates being as low as 22% [1], improving depression diagnosis, follow-up, and treatment remains a health care priority, especially in the general medical setting. The U.S. Preventive Services Task Force recently provided evidence-based recommendations regarding screening adults for depression in clinical practices that have systems in place to assure accurate diagnosis, effective treatment, and follow-up [2]. In research, depression is frequently measured as a primary or secondary outcome. Due to its effects on adherence, disability, and mortality, depression is also a potential confounder that needs to be controlled for in many clinical trials.

Brief self-report questionnaires have been advocated for depression screening in primary care [2]. Although multiple studies have shown that available depression screeners are generally comparable [3], [4], [5], recent work has suggested that the nine-item depression scale of the Patient Health Questionnaire (PHQ-9; [6], [7]) may have superior operating characteristics relative to several other depression screeners [8], [9]. Similarly, superior operating characteristics have been established for the panic module of the PHQ [10]. Another strength of the PHQ-9 is its proven sensitivity to change [11], [12]. Finally, PHQ diagnostic algorithms are not overinclusive but result in realistic estimates of base rates for depressive disorders [13].

Nevertheless, in clinical practice, as well as in research, depression is often not the only condition that needs to be screened for. Consequently, even shorter measures may be desirable. The initial evidence that two items might be sufficient for depression screening was provided for the two items on depressed mood and loss of interest from the Primary Care Evaluation of Mental Disorders (PRIME-MD) Screening Questionnaire [4], [14]. These findings were recently replicated for the same two items asked verbally by general practitioners [15]. However, due to its dichotomous (yes/no) response format, this screener is not suitable for grading depression severity or for assessing depression change over time. A measure that might overcome these shortcomings is the two-item PHQ (PHQ-2; see Appendix A), which has a four-point response format for each of its two items, with total scores ranging from 0 to 6 [16]. The only PHQ-2 validation study so far used the PRIME-MD as the criterion standard for depressive disorders [16]. However, as the PRIME-MD includes the two items of the PHQ-2, the two measures are not completely independent. Thus, comparison with an independent diagnostic interview would strengthen the diagnostic validity of the PHQ-2. In addition, it remains uncertain how strongly PHQ-2 scores correlate with scores from longer depression self-report scales. Finally, the sensitivity to change of the PHQ-2 has not yet been established, which is a precondition if the PHQ-2 is to be used to assess depression outcome.
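The scoring described above (two items, each on a four-point scale, total 0 to 6) can be sketched in Python as follows. This is a minimal illustration, not the authors' scoring procedure: the function names are hypothetical, the item coding of 0 to 3 is inferred from the stated total range, and the default cutoff of 3 is an assumption that this excerpt does not specify.

```python
VALID_ITEM_SCORES = (0, 1, 2, 3)  # four-point response format, coded 0..3 (inferred)


def phq2_total(depressed_mood, loss_of_interest):
    """Sum the two item scores into a PHQ-2 total between 0 and 6."""
    for name, value in (("depressed_mood", depressed_mood),
                        ("loss_of_interest", loss_of_interest)):
        if value not in VALID_ITEM_SCORES:
            raise ValueError(f"{name} must be one of {VALID_ITEM_SCORES}, got {value!r}")
    return depressed_mood + loss_of_interest


def screens_positive(total, cutoff=3):
    """Flag a total at or above the cutoff.

    The default cutoff is an assumption for illustration; the excerpt
    does not state which cut point the reported accuracy refers to.
    """
    return total >= cutoff


def change_score(baseline_total, followup_total):
    """Follow-up minus baseline; negative values indicate fewer symptoms."""
    return followup_total - baseline_total


baseline = phq2_total(depressed_mood=3, loss_of_interest=2)
followup = phq2_total(depressed_mood=1, loss_of_interest=1)
print(baseline, screens_positive(baseline), change_score(baseline, followup))
```

Unlike a yes/no screener, the graded total supports both a threshold decision and a change score between two assessments, which is the property the paragraph above identifies as necessary for monitoring outcomes.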

Therefore, this study investigated the psychometric characteristics of the PHQ-2 as a brief measure for depression diagnosis and follow-up. Specifically, the reliability, construct and criterion validity, and sensitivity to change of the PHQ-2 were evaluated.

Participants

First, to investigate the validity of the PHQ-2 with respect to an independent diagnostic interview and established depression scales, data from a cross-sectional study were analyzed. Second, to assess sensitivity to change of the PHQ-2, results from a 1-year follow-up study were used, which included a predefined subgroup of patients from the cross-sectional study. Both studies were also instrumental in establishing the criterion validity and sensitivity to change of the PHQ-9 [8], [11]. Thus,

Patient characteristics

The baseline characteristics of the participants in the cross-sectional study are summarized in Table 1. In the total sample, mean age was 43.4 years, with 63.6% being female. At baseline, major depressive disorder, as diagnosed with the SCID, was present in 71 (13.6%) patients, and 132 (25.4%) patients were diagnosed as having any depressive disorder. The most common physical diagnoses according to ICD-10 were diseases of the musculoskeletal system and connective tissue (21%),

Discussion

Our study findings suggest that the PHQ-2 is not only a practical but also a valid tool to assess depression diagnosis, severity, and outcome. Comprehensive assessment established the reliability, construct and criterion validity, and sensitivity to change of the PHQ-2. Comparison with the seven-item HADS and the WBI-5 revealed that the PHQ-2 is evenly matched with longer depression screeners. The strong association of the PHQ-2 score with depression scores of three other questionnaires, the

Acknowledgments

This paper was supported by a research award from the Max-Kade Foundation, New York, to Dr. Löwe. The German version of the PHQ was originally developed with an unrestricted grant from Pfizer, Germany, and an additional research grant from the medical faculty of the University of Heidelberg, Germany (Project 121/2000). There are no conflicts of interest in connection with this paper.

References (44)

  • JG Wright et al. A comparison of different indices of responsiveness. J Clin Epidemiol (1997)
  • RC Kessler et al. The epidemiology of major depressive disorder: results from the National Comorbidity Survey Replication (NCS-R). JAMA (2003)
  • Screening for depression: recommendations and rationale. Ann Intern Med (2002)
  • CD Mulrow et al. Case-finding instruments for depression in primary care settings. Ann Intern Med (1995)
  • MA Whooley et al. Case-finding instruments for depression. Two questions are as good as many. J Gen Intern Med (1997)
  • RL Spitzer et al. Validation and utility of a self-report version of PRIME-MD: the PHQ primary care study. JAMA (1999)
  • K Kroenke et al. The PHQ-9. Validity of a brief depression severity measure. J Gen Intern Med (2001)
  • B Löwe et al. Diagnosing ICD-10 depressive episodes: superior criterion validity of the Patient Health Questionnaire. Psychother Psychosom (2004)
  • B Löwe et al. Monitoring depression outcomes with the PHQ-9. Responsiveness and reliability. Med Care (2004)
  • RL Spitzer et al. Utility of a new procedure for diagnosing mental disorders in primary care. The PRIME-MD 1000 study. JAMA (1994)
  • B Arroll et al. Screening for depression in primary care with two verbally asked questions: cross sectional study. BMJ (2003)
  • K Kroenke et al. The Patient Health Questionnaire-2: validity of a two-item depression screener. Med Care (2003)