Background Unidentified depression in primary care is a global public health concern. Brief, valid and easily administered screening tools are needed in primary care.
Aims To estimate reliability and validity of the newly developed Primary care Screening Questionnaire for Depression (PSQ4D), a four-item tool, with ‘yes’ or ‘no’ options.
Method PSQ4D was administered verbally (time required, <1 min) by primary care physicians to adult outpatients (n=827) in six primary care settings in Kerala, India. A psychiatrist evaluated each patient on the same day, using ICD-10 Diagnostic Criteria for Research, based on unstructured clinical interview.
Results The Cronbach’s alpha for internal consistency reliability was 0.80; kappa coefficient for test–retest reliability was 0.9 and that for interrater reliability was 0.72. At a score ≥2, sensitivity was 0.96, specificity was 0.87, positive predictive value was 0.74, negative predictive value was 0.98, positive likelihood ratio was 7.4 and negative likelihood ratio was 0.05.
Conclusions When physician administered, PSQ4D has good reliability. At a cut-off score of ≥2, it has high sensitivity and specificity to identify depressive disorder in primary care.
Declaration of interest None.
Copyright and usage © The Royal College of Psychiatrists 2017. This is an open access article distributed under the terms of the Creative Commons Non-Commercial, No Derivatives (CC BY-NC-ND) license.
Depression is projected to be a leading cause of global burden of disease by 2030.1 It also increases the risk for suicide2 and non-communicable diseases such as ischaemic heart disease.3 Its prevalence is 13.2% among women attending general practice clinics in the UK4 and 23% among obese people in Australian clinics.5 In India, depression is the most common psychiatric disorder reported in community settings.6 Community prevalence of depression in India is 15.9%,7 whereas in primary care, it ranges from 21 to 84%.8,9 An increase in depression prevalence has been reported over the past few decades.10 The objective of the National Mental Health Programme of India was to integrate mental health with general healthcare. At present, the District Mental Health Programme has an organisational mechanism to provide services of a psychiatrist once a mental health problem is identified in primary care. Unfortunately, this facility is underutilised and depression remains undiagnosed. If depression is identified and treatment initiated, the outcome is good: at 1-year follow-up, 71% of patients with depression demonstrated no symptoms or social impairment.11 If left untreated, it aggravates co-existing physical illness and results in more frequent consultations. Early identification and management would reduce the disability and the risk of suicide.12 From an economic perspective, the overall healthcare cost can be reduced if depression is identified early and treated. However, unless specifically screened for, depression remains underrecognised and untreated in the primary care setting.12 Hence, depression screening is an important public health strategy in low- and middle-income countries. The moderate net benefit of screening for depression in adults has also been recognised in high-income country settings (e.g. USA).13
Sensitivity of existing brief tools is high, but specificity is moderate or low. For example, when self-administered, the two questions from the Primary Care Evaluation of Mental Disorders (PRIME-MD) had a sensitivity of 96%, but a specificity of 57%.14,15 When physician administered, these two questions had a sensitivity of 97% and a specificity of 67%.16 In busy clinics, primary care physicians need quick tools with good reliability and validity. In this context, we developed a brief depression screening tool, the Primary care Screening Questionnaire for Depression (PSQ4D), in Kerala, a state in South India that has high literacy and good health indicators, comparable to those of high-income countries. The PSQ4D has four items with ‘yes’ and ‘no’ options, which can be administered verbally by the physician while examining a patient. We evaluated the reliability and validity of the PSQ4D, when administered verbally by primary care physicians, against the reference standard of psychiatrist-diagnosed ICD-10 depression.
We conducted this study in six primary care settings in the catchment area of the Medical College Health Unit, Pangappara, training centre of Government Medical College, Trivandrum, located in the capital of Kerala. Geographically, these primary care sites cater to a population of approximately 120 000 adults and children. Two sites were primary health centres in the public domain and four were privately run clinics. All these sites followed the ‘drop-in’ consultation model, and patients could consult the physician without a prior appointment.
Participants and study design
This cross-sectional study included all consenting patients, aged 18–60 years, who attended the six primary care sites; only patients already receiving psychotropic medications were excluded. We restricted the study to patients below 60 years, because the elderly require a different scale and cut-off for depression screening. Assuming that the sensitivity of the PSQ4D would be 95%, at a significance level of 5% and with a power of 90%, we estimated that we would need to recruit 165 patients with depression.17 Based on earlier reports that the prevalence of depression in an Indian primary care setting was 21%,8 the sample needed was 788. Considering a non-response rate of around 5%, the final necessary sample size was estimated to be 825. The proportion of patients to be recruited from primary health centres (public sector) and private clinics was 2:1, based on the ratio of the average monthly turnover in these hospitals during the month before the study. We planned to recruit participants by systematic sampling.
The first three items of the PSQ4D were based on the three essential symptoms, as represented in F32, Depressive Episode of the ICD-10.18 These symptoms were depressed mood, reduced interest and lack of energy. A fourth item related to insomnia was added. We conducted in-depth interviews with stakeholders such as psychiatrists, psychologists and primary care physicians to identify the common symptoms of depression that they could easily identify in out-patient clinics. The consensus opinion of psychiatrists and psychologists was that insomnia was a common symptom and was frequently reported proactively by patients with depression. Primary care physicians strongly voted for insomnia, because it was the most easily elicited symptom in primary care. They suggested that patients would quickly and voluntarily disclose insomnia because it often worries them and it is a more socially acceptable symptom to report to a physician, compared with an emotional symptom. All items required ‘yes’ or ‘no’ responses only. The four items of the PSQ4D are listed in Table 1.
The ICD-10 DCR
The Diagnostic Criteria for Research (DCR) is designed for use in research. Because of its proven diagnostic utility in mood disorders research,19,20 the DCR was applied by a psychiatrist and used as the reference standard. Participants with a DCR-based diagnosis of depressive disorders (F32.0, F32.1, F32.2 and F32.3) and recurrent depressive disorders (F33.0, F33.1, F33.2 and F33.3) were defined as ‘persons with depressive disorder’ and others as ‘persons without depressive disorder’ in this study.
During routine clinical assessment in their out-patient clinics, primary care physicians verbally administered the PSQ4D in Malayalam, the local language, and scored it. After the primary care physician’s consultation, a psychiatrist, who was masked to the PSQ4D score, evaluated each patient. The psychiatrist diagnosed depression with ICD-10 DCR based on an unstructured clinical interview. To estimate interrater reliability, two primary care physicians administered the PSQ4D independently to a subsample of 118 patients, within a time span of about half an hour. Similarly, to estimate test–retest reliability, the PSQ4D was administered twice to another subsample of 38 persons with a gap of 2 weeks between assessments.
The study protocol was reviewed and approved by the Human Ethics Committee of the Government Medical College, Trivandrum. The design, conduct and analysis of the study were in accordance with the Standards for Reporting of Diagnostic Accuracy (STARD) guidelines.21 Recruitment of patients began in May 2006 and ended in November 2006. Informed consent was obtained from all patients. Those who were diagnosed as having any mental health problem were offered treatment.
Internal consistency reliability of the PSQ4D was estimated by Cronbach’s alpha. Interrater and test–retest reliability were estimated by the kappa coefficient. Receiver operating characteristic (ROC) curve analysis was also done to identify the optimum cut-off and to compare the validity of various combinations of questions in the PSQ4D. The sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and overall accuracy (depressive disorder correctly classified) of the PSQ4D were calculated by comparing the PSQ4D score with the psychiatrist-established diagnosis of ICD-10 DCR depressive disorder. Positive and negative likelihood ratios were also estimated.
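The validity measures named above all derive from a 2×2 cross-tabulation of the screening result against the reference diagnosis. The following is a minimal sketch of those definitions; the class name, function name and counts are hypothetical and chosen only for illustration, and the counts are not the study's data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from cross-tabulating a screening result against the reference diagnosis."""
    tp: int  # screen positive, reference positive
    fp: int  # screen positive, reference negative
    fn: int  # screen negative, reference positive
    tn: int  # screen negative, reference negative


def validity_measures(t: TwoByTwo) -> dict:
    """Return the standard diagnostic accuracy measures as proportions."""
    total = t.tp + t.fp + t.fn + t.tn
    return {
        "sensitivity": t.tp / (t.tp + t.fn),
        "specificity": t.tn / (t.tn + t.fp),
        "ppv": t.tp / (t.tp + t.fp),
        "npv": t.tn / (t.tn + t.fn),
        "accuracy": (t.tp + t.tn) / total,
    }


# Toy counts for illustration only; NOT taken from the study.
toy = TwoByTwo(tp=90, fp=20, fn=10, tn=180)
measures = validity_measures(toy)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are computed within the reference-positive and reference-negative groups respectively, whereas PPV and NPV are computed within the screen-positive and screen-negative groups and therefore depend on prevalence.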
A total of 848 people were screened (Fig. 1). Of these, 21 could not be included because they were either on psychotropic drugs (n=15) or refused consent to participate (n=6). The sociodemographic description of the sample is provided in Table 2. The overall prevalence of psychiatrist-diagnosed ICD-10 depressive disorder was 27.2%. Its prevalence was 2.1 times higher among women (170/491; 34.6%), compared with men (55/336; 16.4%), and was 2.9 times higher in public (192/552; 34.8%), compared with private (33/275; 12.0%) sites.
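The prevalence figures and ratios above follow directly from the reported counts. A short sketch of the arithmetic, using only the numerators and denominators given in the text (the variable names are ours):

```python
# Counts as reported in the text:
# (patients with depressive disorder, patients assessed) per subgroup.
counts = {
    "women": (170, 491),
    "men": (55, 336),
    "public": (192, 552),
    "private": (33, 275),
}

prevalence = {group: cases / n for group, (cases, n) in counts.items()}

# Overall prevalence: all cases over all enrolled patients.
overall = (170 + 55) / (491 + 336)

# Prevalence ratios between subgroups.
ratio_sex = prevalence["women"] / prevalence["men"]
ratio_setting = prevalence["public"] / prevalence["private"]

print(f"overall prevalence: {overall:.1%}")
print(f"women vs men: {ratio_sex:.1f} times")
print(f"public vs private: {ratio_setting:.1f} times")
```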
Cronbach’s alpha for the PSQ4D was 0.80 for the entire sample. It also remained high in subsamples, based on gender, age and hospital setting. For example, Cronbach’s alpha was 0.75 in men and 0.82 in women; 0.76 in the 18- to 24-year age group (n=109), 0.73 in the 25- to 34-year age group (n=230), 0.85 in the 35- to 44-year age group (n=223), 0.78 in the 45- to 54-year age group (n=185) and 0.77 in the 55- to 60-year age group (n=80); 0.79 in public and 0.78 in private hospital settings. When the PSQ4D test positivity was defined as ‘any two questions positive’ (score ≥2), the kappa coefficient for interrater reliability (n=118) was 0.72 (95% CI 0.61–0.83). Using the same definition for test positivity, the kappa coefficient for test–retest reliability (n=38) was 0.9 (95% CI 0.76–1.0).
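For readers less familiar with these reliability coefficients, the sketch below shows the textbook formulas for Cronbach's alpha and Cohen's kappa. It is not the authors' analysis code; the function names and toy responses are hypothetical, and the kappa input is assumed to be the dichotomised screening result (score ≥2 or not) from each rater or occasion.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (here 0 = 'no', 1 = 'yes')."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the total score across respondents.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two equally long lists of labels."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    labels = set(rater_a) | set(rater_b)
    # Agreement expected by chance from each rater's marginal proportions.
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in labels
    )
    return (observed - expected) / (1 - expected)


# Toy data for illustration only; NOT taken from the study.
toy_items = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 0, 1], [0, 1, 0, 0], [1, 1, 1, 0]]
toy_first = [1, 0, 1, 0, 1, 1]
toy_second = [1, 0, 1, 1, 1, 0]

print(f"alpha: {cronbach_alpha(toy_items):.2f}")
print(f"kappa: {cohen_kappa(toy_first, toy_second):.2f}")
```

Both functions divide by a quantity that can be zero (no variance in the total score, or chance agreement of 1), so a production version would need to handle those degenerate cases.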
The area under the curve (AUC) was maximum for ‘any two questions positive’ (AUC 0.92, 95% CI 0.89–0.94) and minimum for ‘all four questions positive’ (AUC 0.75, 95% CI 0.70–0.79). Hence, the optimum cut-off for the PSQ4D was set as ‘any two (or more) out of the four PSQ4D questions positive’ (a score ≥2). Table 3 presents the sensitivity, specificity, predictive values and accuracy for the entire sample and for subgroups. In the total sample, sensitivity and specificity were 96% (95% CI 94.7–97.3) and 87% (95% CI 85.3–89.3) respectively. In subgroups defined by gender, age and hospital setting, sensitivity ranged from 91.2 to 100%, and specificity ranged from 80.4 to 96%. In the whole sample, PPV and NPV were 73.5% (95% CI 70.48–76.52) and 98% (95% CI 97.02–98.95) respectively; these ranged from 73 to 78% and from 97 to 100% respectively, in different subgroups. The exceptions were men, where the PPV was 63%, and those in the age group of 55–60 years, where the NPV was 92%.
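When a test is reduced to a single yes/no definition of positivity, its ROC 'curve' has one operating point, and the trapezoidal area under it equals the mean of sensitivity and specificity. The sketch below illustrates this under the assumption, not stated explicitly in the text, that the AUCs were computed separately for each dichotomised definition; the function name is ours.

```python
def single_cutoff_auc(sensitivity, specificity):
    """Area under the ROC path (0,0) -> (1 - specificity, sensitivity) -> (1,1)."""
    fpr = 1 - specificity
    # Triangle under the segment from (0, 0) to the operating point.
    left = 0.5 * fpr * sensitivity
    # Trapezoid under the segment from the operating point to (1, 1).
    right = 0.5 * (1 - fpr) * (sensitivity + 1)
    return left + right


# Reported sensitivity and specificity for 'any two questions positive'.
auc_two = single_cutoff_auc(0.96, 0.87)
print(f"single-point AUC at score >=2: {auc_two:.3f}")
```

With the reported sensitivity (0.96) and specificity (0.87), this gives about 0.915, which is close to the reported AUC of 0.92 for the ‘any two questions positive’ definition.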
Accuracy (correctly classified) of the PSQ4D ‘any two (or more) questions positive’ cut-off was 90% for the whole sample, and 86 to 96% in different subgroups. With a cut-off of ≥2, the positive likelihood ratio (LR+) was 7.4 and the negative likelihood ratio (LR−) was 0.05; both values are generally considered psychometrically satisfactory.
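The likelihood ratios follow from sensitivity and specificity, and they link the pre-test probability (here, the study prevalence) to the post-test probability through odds. A brief sketch using the rounded values reported in the text; the function names are ours, and small differences from the published figures are expected because the inputs are rounded.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a dichotomous test."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert probability to odds, apply the likelihood ratio, convert back."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Rounded values reported in the text for a cut-off score of >=2.
lr_pos, lr_neg = likelihood_ratios(sensitivity=0.96, specificity=0.87)

# Pre-test probability taken as the overall study prevalence (27.2%).
p_after_positive = post_test_probability(0.272, lr_pos)
p_after_negative = post_test_probability(0.272, lr_neg)

print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")
print(f"probability of depression after a positive screen: {p_after_positive:.3f}")
print(f"probability of depression after a negative screen: {p_after_negative:.3f}")
```

At the study prevalence, the post-test probability after a positive screen corresponds to the PPV, and one minus the post-test probability after a negative screen corresponds to the NPV.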
When test positivity was defined as any three (or more) PSQ4D questions positive, sensitivity dropped to 76%, whereas specificity improved to 97% and PPV to 89%. Importantly, when positive responses to specific sets of two or three questions were examined for their ability to identify depressive disorders, specificity remained uniformly high, whereas sensitivity values were lower for combinations other than ‘any two positive’ combinations (Table 4).
We sought to develop a screening instrument that can accurately identify depression in primary care settings and takes very little time to administer. We found that, using the PSQ4D, depression was best identified by a positive response to any two or more questions, that is, by a score of ≥2. At this threshold, specificity, PPV and NPV were also reasonably high, not only in the whole sample but also in sociodemographically defined subgroups (Table 3). These findings suggest the robustness of the construct measured by the PSQ4D.
We found that there was no sensitivity or NPV advantage obtained with a higher cut-off or with different combinations of questions (Table 4). However, with the threshold set at any three questions positive, specificity, PPV and NPV were all high, although sensitivity reduced from 96 to 76%. Given that the purpose of a screening tool is to identify cases with high sensitivity, we suggest that a threshold of (any) two positive questions be set when the PSQ4D is used to identify persons with depression. If, for any reason, whether in clinical or research settings, a higher specificity, PPV and NPV are required, then the threshold can be raised to (any) three positive questions. Whichever threshold is selected, the PSQ4D is very easy to score and interpret.
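The scoring rule described above amounts to counting ‘yes’ responses and comparing the count with the chosen threshold. A minimal sketch follows; the item keys are hypothetical labels based on the four symptoms described in the text, not the wording of the questionnaire items in Table 1, and the function names are ours.

```python
# Hypothetical item keys reflecting the four symptoms; not the Table 1 wording.
PSQ4D_ITEMS = ("depressed_mood", "reduced_interest", "lack_of_energy", "insomnia")


def psq4d_score(responses):
    """Count 'yes' answers; responses maps each item key to True ('yes') or False ('no')."""
    missing = [item for item in PSQ4D_ITEMS if item not in responses]
    if missing:
        raise ValueError(f"missing responses for: {missing}")
    return sum(bool(responses[item]) for item in PSQ4D_ITEMS)


def screens_positive(responses, cutoff=2):
    """Screen positive when the score reaches the cut-off (2 by default, 3 for higher specificity)."""
    return psq4d_score(responses) >= cutoff


example = {
    "depressed_mood": True,
    "reduced_interest": False,
    "lack_of_energy": False,
    "insomnia": True,
}
print(psq4d_score(example))                 # number of 'yes' responses
print(screens_positive(example))            # default threshold of any two positive
print(screens_positive(example, cutoff=3))  # stricter threshold of any three positive
```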
Importantly, the PSQ4D performed well in different demographic subgroups. The sensitivity of the PSQ4D, which is the primary pre-requisite for a community-level screening tool, was uniformly high in all sociodemographic subgroups. NPV was also uniformly high across the subgroups (Table 3). The specificity was lower in older age groups (80%), compared with younger age groups (91%). It is possible that older patients might require other questions. The specificity (96%) and overall accuracy (96%) were high in the private setting compared with the public setting (81 and 86% respectively). The higher level of education of patients attending private clinics might have contributed to the higher accuracy of the PSQ4D in the private setting. In spite of these variations across the subgroups, all the validity measures were in an acceptably high range, with the overall accuracy between 86 and 96% (Table 3).
The overall prevalence of depression in this study is 27.2%. In India, earlier studies have reported high prevalence of depression in primary care, and this varies widely from 21 to 84%.8,9 The relatively low depression prevalence in private clinics (12%) in our study is comparable to the depression prevalence in general practice and primary care settings in high-income countries.4 The high prevalence of depression in public hospitals might be a reflection of the high level of socio-economic stressors among the patients who seek care from these hospitals. They are socially and economically more deprived and have lower levels of education, compared with those who attend private hospitals.
We wondered whether participants with a history of depression might have biased the results,22 because they might more readily endorse depression on screening. Among the 827 patients who were enrolled, 19 patients (2.3%) had a previous psychiatric consultation, but were not on any treatment at the time of recruitment. We did not exclude them, because the purpose of a screening instrument is to identify anyone who is currently depressed, regardless of past depression status. In the final analysis, 11 among them had depression. They formed just 4.9% of patients with depression (11/225), and hence, those with a positive history might not be a major bias in this study.
A recent editorial, which revisited the two-question tool, has reiterated the need for depression screening in primary care settings using tools with high sensitivity.15 The psychometric properties of the PSQ4D are comparable with or better than those of similar depression screening instruments. For example, the PSQ4D has good internal consistency reliability (Cronbach’s alpha 0.8), which is similar to the value of 0.79 obtained with the much longer PHQ-9 in a previous Indian study.23 The sensitivity and specificity of the PSQ4D (96 and 87% respectively) compare very favourably with the sensitivity and specificity (97 and 67% respectively) of the shorter, two-question verbal screening measure for depression that was validated in Auckland.16 The Auckland investigators had earlier added a question on the ‘need for help’ to their two-question screening tool24 and reported that the sensitivity of the instrument remained high at 96%, and the specificity improved to 89%. However, a more recent cohort study on their three-question tool (with the help item) had substantially lower sensitivity (59.4%), although specificity was retained (88.2%).25 Unlike ‘desire for help’ which was added by the Auckland investigators,24 we chose to ask about ‘lack of energy’, which is the third essential criterion in ICD-10. We asked about insomnia as the fourth question, because insomnia is highly prevalent in depression and easily elicited in primary care. Insomnia is voluntarily reported by patients, without the stigma attached to revealing emotional symptoms. In this regard, our effort to improve specificity by adding two more questions and the choice of questions paid dividends.
The PSQ4D performed better than the single item of the PHQ-9 administered on a touch screen, which had a sensitivity of 91% and a specificity of 76% against the reference standard of the full version of the PHQ-9.26 Although this was a single item, it had five response options,26 and this necessitated the use of a touch screen or written scale for administration, which may be difficult in low-resource settings. The advantage of the PSQ4D for use in low- and middle-income countries is that no additional appointment, infrastructure or human resource is required for the screening, because the procedure involves asking four questions with ‘yes’ or ‘no’ options. The entire screening process takes less than a minute because the questions are asked verbally by the treating physician.
Strengths and limitations
The fact that the PSQ4D has ‘yes’ or ‘no’ options makes it easy to administer verbally, especially where heavy case-loads create time pressure. Measures of validity, especially sensitivity and NPV, are high across all subgroups. A limitation of this study is that the ‘gold standard’ was depressive disorder diagnosed by a psychiatrist using ICD-10 DCR on the basis of an unstructured clinical interview, rather than a structured diagnostic interview. In busy primary care settings in a low-resource country, ICD-10 DCR, which has proven diagnostic utility in research,20 administered by a psychiatrist was the feasible alternative to a structured interview. We studied the psychometric properties of the Malayalam version of the PSQ4D in private and public settings in Kerala, South India. It needs to be validated in other languages and other settings. This is a limitation of all newly developed tools. All four items of the PSQ4D are related to the four symptoms of ICD-10 diagnostic criteria, and none of them is culture specific. Hence, we expect the PSQ4D to perform well in other settings.
We acknowledge the Faculty of Clinical Epidemiology Resource and Training Centre (CERTC), where P.S.I. developed the PSQ4D, as part of the research conducted for her MPhil (Clinical Epidemiology) degree. We acknowledge Dr J. Visalakshi, Lecturer, Department of Biostatistics, Christian Medical College, Vellore, for planning the analysis.
- Received March 23, 2016.
- Revision received March 7, 2017.
- Accepted March 16, 2017.
- © 2017 The Royal College of Psychiatrists
This is an open access article distributed under the terms of the Creative Commons Non-Commercial, No Derivatives (CC BY-NC-ND) license (http://creativecommons.org/licenses/by-nc-nd/4.0/).