Comparison of Childhood Depression Measures

Depressive disorders are now recognized as a relatively prevalent problem in adolescents and as one of the causes of morbidity and mortality in this age group (Birmaher et al., 1996). Depression should therefore be detected as early as possible, in order to mitigate the impact of the disorder on an individual’s life and to improve its long-term course. Current research on adolescent depression has introduced various assessment tools for diagnosing the disorder in children and adolescents. However, determining the most appropriate measure involves several considerations rather than merely selecting one, as a test with inadequate reliability and validity may result in false positive or false negative diagnoses (Reynolds & Mazza, 1998). This paper reviews, compares and contrasts the psychometric properties of three self-report assessments of depression in adolescents, namely the Children’s Depression Inventory (CDI), the Beck Depression Inventory, 2nd Edition (BDI-II) and the Reynolds Adolescent Depression Scale (RADS). A conclusion on the most appropriate assessment for depression in adolescents is then drawn.
Children’s Depression Inventory (CDI)
The CDI was originally adapted from the Beck Depression Inventory (BDI), by altering its format and language, to measure the severity of depression in children aged 7 and older (Brooks & Kutcher, 2001). The CDI consists of 27 items covering cognitive, affective and behavioural signs of depression, with each item presenting three alternative statements. Children read the items themselves or can have them read aloud. The test takes an estimated 10 to 20 minutes to complete.
Reliability
The CDI has been shown to have high internal consistency, with alpha coefficients slightly above 0.80 (Smucker, Craighead, Craighead & Green, 1986; Saylor, Finch, Spirito & Bennett, 1984; Weiss, Weisz, Politano, Nelson, Carey & Finch, 1991). However, Weiss et al. (1991) found a modest but significant difference in the factor structure of the CDI between children (ages 8 to 12) and adolescents (ages 13 to 16): one item was relatively uninvolved in overall CDI depression for the adolescents, compared with six such items for the children. This casts doubt on the coherence of the CDI items in assessing depression across the two age groups. Furthermore, past findings raise questions about the test-retest reliability of the CDI. Saylor et al. (1984) found a promising one-week test-retest correlation of 0.87 for emotionally disturbed children, but a weak correlation of 0.38 for ‘normal’ children. More moderate test-retest correlations found in other studies likewise indicate that the CDI has higher short-term test-retest reliability in hospitalized psychiatric populations than in non-psychiatric populations (Kovacs, 1981; 1982; Friedman & Butler, 1979).
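
Internal consistency figures such as those above are usually Cronbach's alpha coefficients. The following is a minimal sketch of that calculation, assuming the standard formula alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores). The response matrix is hypothetical (five respondents, four items scored 0 to 2) and is not CDI data; the function name `cronbach_alpha` is chosen only for illustration.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    # Sample variance of each item across respondents
    item_vars = [variance([r[i] for r in responses]) for i in range(k)]
    # Sample variance of the respondents' total scores
    total_var = variance([sum(r) for r in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data: 5 respondents x 4 items, each item scored 0, 1 or 2
responses = [
    [0, 0, 1, 0],
    [1, 1, 1, 0],
    [2, 1, 2, 1],
    [0, 1, 0, 0],
    [2, 2, 2, 2],
]

alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")
```

Alpha rises when items vary together, which is why a scale whose items are "relatively uninvolved" in the total score for one age group would be expected to show weaker coherence in that group.
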
Validity
Several psychometric evaluations of the CDI have reported reasonably high concurrent validity. Reynolds (1987) evaluated the CDI with 2,460 adolescents from one high school and two junior high schools and found a correlation of 0.73 between the CDI and the Reynolds Adolescent Depression Scale (RADS). Furthermore, Saylor et al. (1984) found that the CDI could differentiate general populations of emotionally disturbed children from normal school children, which is consistent with the findings of Kline, Hodges, Siegel, Mullins and Griffin (1982), who compared the scores of clinically referred children and normal school children. However, Kovacs (1992) reported contradictory results, finding no distinctive difference between the CDI scores of clinically depressed children and public school children. The validity of the CDI may have been affected by the adequacy of its cutoff score. Kovacs’ (1992) suggested cutoff score of 20 is argued to be problematic: according to a receiver operating characteristic (ROC) analysis, it would miss 85.8% of truly depressed students if the CDI were administered in a school (Matthey & Petrovski, 2002). Several studies that investigated CDI cutoff scores concluded that no single score will be generally effective (Asarnow & Carlson, 1985; Smucker et al., 1986). In line with Matthey and Petrovski’s (2002) argument, lowering the cutoff score would reduce the number of non-depressed students who correctly score below it, whereas keeping the cutoff at 20 leaves a high risk of missing truly depressed students.
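
The cutoff trade-off described above can be illustrated with a minimal sketch. The scores and reference diagnoses below are hypothetical and are not CDI data; the function name `sensitivity_specificity` and the three cutoffs are assumptions chosen only to show the direction of the effect, not to reproduce any published analysis.

```python
def sensitivity_specificity(scores, depressed, cutoff):
    """Treat score >= cutoff as a positive screen and compare it with the reference diagnosis."""
    pairs = list(zip(scores, depressed))
    tp = sum(1 for s, d in pairs if d and s >= cutoff)       # depressed, flagged
    fn = sum(1 for s, d in pairs if d and s < cutoff)        # depressed, missed
    tn = sum(1 for s, d in pairs if not d and s < cutoff)    # not depressed, not flagged
    fp = sum(1 for s, d in pairs if not d and s >= cutoff)   # not depressed, flagged
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical screening scores with a reference diagnosis for each student
scores = [8, 11, 13, 14, 16, 17, 19, 21, 22, 25]
depressed = [False, False, False, True, False, True, True, False, True, True]

for cutoff in (20, 16, 13):
    sens, spec = sensitivity_specificity(scores, depressed, cutoff)
    # The proportion of depressed students missed is 1 - sensitivity
    print(f"cutoff {cutoff}: sensitivity {sens:.2f}, "
          f"specificity {spec:.2f}, missed {1 - sens:.2f}")
```

In this toy sample, each lower cutoff catches more of the depressed students but also flags more non-depressed students, which is the pattern behind the conclusion that no single cutoff is generally effective.
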
Reynolds Adolescent Depression Scale (RADS)
The RADS (Reynolds, 1987) was developed specifically to measure the severity of depression in adolescents aged 12 to 18 years. It is a 30-item self-rating scale written in accordance with the symptoms listed in the Diagnostic and Statistical Manual of Mental Disorders, Third Edition (DSM-III; American Psychiatric Association, 1980), together with additional symptoms from the Research Diagnostic Criteria (RDC; Spitzer, Endicott & Robins, 1978). The RADS takes around 10 minutes to complete.
Reliability
The RADS has demonstrated strong psychometric properties across extensive research. The reliability of the scale has been reported to be relatively high. Its internal consistency has been shown to range from 0.91 to 0.94 for normal students in grades 7 to 12 (Reynolds, 1987). Reynolds and Miller (1985) tested the RADS on adolescents with mild mental impairment and obtained a slightly lower but still high coefficient (α = 0.87). Reynolds (1987) examined the test-retest reliability of the RADS in depressed and non-depressed high school students and found a coefficient of 0.80 over a 6-week interval and a coefficient of 0.79 over a 12-week interval. In addition, Baron and DeChamplain (1990) administered the French translation of the RADS to 140 adolescents in Quebec and obtained a test-retest reliability coefficient of 0.86 over a 3-week interval.
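
Test-retest coefficients of this kind are commonly computed as the Pearson correlation between scores from two administrations of the same scale. The sketch below assumes that approach; the cited studies may have used a different coefficient. The two score lists are hypothetical and are not RADS data, and `pearson_r` is an illustrative helper name.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products of deviations from the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the sums of squared deviations
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical total scores for six adolescents at two administrations
time1 = [62, 70, 55, 81, 90, 48]
time2 = [65, 68, 58, 79, 88, 52]

r = pearson_r(time1, time2)
print(f"test-retest r = {r:.2f}")
```

A coefficient near 1 means that respondents keep roughly the same rank order between administrations, which is the sense in which a scale's scores are described as stable over time.
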
Validity
Reynolds (1987) reported relatively high concurrent validity with other self-report measures of depression, such as the BDI (0.68 ≤ r ≤ 0.75) and the CDI (r = 0.73). Shain, Naylor and Alessi (1990) obtained a correlation of similar strength with the Children’s Depression Rating Scale-Revised (CDRS-R) (r = 0.77). To test criterion validity, Reynolds (1987) administered both a semi-structured clinical interview for depression, the Hamilton Depression Rating Scale (HDRS), and the RADS to school adolescents, and reported a high correlation of 0.83. Reynolds (1987) proposed a cutoff score of 77 for the RADS, suggesting that a score of 77 or above indicates a clinically relevant level of depressive symptom severity. The RADS cutoff score has been validated against an HDRS cutoff score of 20 with high school students. The RADS cutoff yielded very high specificity (96%) but relatively low sensitivity (62%), meaning that 38% of the subjects with a clinically relevant level of depression were misclassified as not depressed. However, Reynolds and Mazza (1998) applied the cutoff score of 77 to adolescents in the 6th, 7th and 8th grades, compared it with the HDRS cutoff score, and found higher sensitivity (89%) along with 90% specificity. Reynolds and Mazza also found that lowering the RADS cutoff score to 75 might yield a sensitivity of 100% and a specificity of 96%.
Beck Depression Inventory, 2nd Edition (BDI-II)
The BDI-II is a revised version of the original 21-item BDI (Beck & Steer, 1993), which assesses the severity of depression in adolescents and adults. The BDI-II was altered to improve its content validity and to make the new scores correspond to the criteria for Major Depressive Disorder in the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV). The response time frame was extended from one week to two weeks to be consistent with the DSM-IV symptom criteria. The assessment takes about 5 to 10 minutes to complete.
Reliability
Studies examining the psychometric properties of the BDI-II have demonstrated strong internal consistency. Beck, Steer and Brown (1996) administered the BDI-II to subjects with psychiatric diagnoses and to normal college students and found internal consistencies of 0.92 and 0.93. Osman et al. (1997) studied non-clinical young adults and obtained similarly high internal consistency (α = 0.90). Beck, Steer and Brown (1996) also found that the one-week test-retest reliability was high (r = 0.93), which is similar to the finding of Sprinkle et al. (2002), who reported a test-retest reliability of 0.96 over an interval of 1 to 12 days.
Validity
Various studies have reported fairly high convergent validity between the BDI-II and other measures of depression. For example, Osman et al. (1997) found a high correlation (r = 0.77) between the BDI-II and the DASS Depression scale. These authors also obtained evidence of convergent validity in that the BDI-II factor scales correlated significantly with measures of related constructs such as anxiety, self-esteem and stress. Osman, Barrios, Gutierrez, Williams and Bailey (2007) also found the BDI-II to be highly correlated with the RADS (r = 0.84). Arnau, Meagher, Norris and Bramson (2001) reported evidence of validity for the BDI-II: mean scores differed significantly between subjects with Major Depressive Disorder (MDD) and those without, with MDD subjects scoring higher. Arnau et al. (2001) also evaluated the cutoff score of the BDI-II and found that a cutoff of 18 yielded the best balance between sensitivity (94%) and specificity (92%). They concluded that the BDI-II yields reliable and valid scores for measuring the severity of depression in primary care medical settings. Osman et al. (2007) studied a non-clinical adolescent sample and obtained similar findings regarding the accuracy of the BDI-II in discriminating between clinical and non-clinical adolescents.
However, as one might expect, the BDI-II has psychometric weaknesses. Beck et al. (1996) used a clinical sample of 500 adults as the BDI-II normative sample in examining its factor structure, and retained 12 items on the first factor (Somatic-Affective) and 9 items on the second factor (Cognitive-Affective). When they replicated the analysis with 120 undergraduates, however, they did not find support for this item composition; instead, two items (pessimism and loss of interest in sex) did not load highly on either factor. Other studies evaluating the BDI-II factor structure have also failed to support the findings of Beck et al. (1996). Steer, Rissmiller and Beck (2000) extracted a third factor (Guilt-Punishment) in a sample of boys and girls in an outpatient setting, yet two items (agitation and loss of interest in sex) failed to load significantly on any of the three factors. These findings raise concern about how well each BDI-II item represents depression across different populations.
Given these psychometric reviews, the question is which measure is the best choice. Several criteria should be considered in choosing the most appropriate measure: What is the purpose of the assessment? Who is the target population? What is the value of the given measures? This paper aims to choose the most appropriate measure for assessing depression in adolescents. With adolescents as the target population, it is vital to take developmental changes into account, as certain symptoms may differ across ages (Kendall, Cantwell & Kazdin, 1989). The CDI falls short on this criterion, as its factor structure varies significantly between children and adolescents (Weiss et al., 1991). The BDI-II has a similar issue, in that the representation of its items differs across clinical adult samples, normal undergraduates and adolescents. In comparison, the RADS was designed specifically to assess the severity of depression in adolescents and thus shows better coherence in its item structure, as evidenced by the high internal consistency obtained in samples of normal and mildly mentally impaired adolescents.
Moreover, a reliable measure should produce scores that are relatively stable over a given period, since depressive symptoms are not expected to disappear or ameliorate within a short time without treatment; otherwise they would not be classified as depressive under the two-week criterion of the DSM-IV. The CDI showed wide differences in test-retest reliability between clinical and non-clinical populations. Scores from the RADS and the BDI-II are more stable, with both showing better test-retest reliability across clinically depressed and non-depressed subjects. In addition, the RADS has been shown to detect treatment-induced change in depressive symptoms in adolescents (Reynolds & Coats, 1986).
Furthermore, a valid assessment should discriminate individuals with depression from those without. The CDI is lacking on this criterion because of its problematic cutoff score, which carries a relatively high chance of missing depressed individuals. On the other hand, the RADS and the BDI-II both yield high specificity and adequate sensitivity in correctly identifying the target group.
Judging from this review and comparison of psychometric properties, the CDI pales in comparison with the RADS and the BDI-II, both of which have strong psychometric properties, as a screening tool for depression in adolescents. Although the BDI-II items correspond to the DSM-IV definition of depression while the RADS follows the DSM-III, Krefetz, Steer, Gulab and Beck (2002) found that the RADS and the BDI-II share similar psychometric characteristics and are comparably effective in assessing adolescent depression. Despite the advantages of the BDI-II, the coherence of its items in targeting depression remains ambiguous, as item representation varies across populations; it may therefore not provide scores as accurate as those of the RADS, whose items are specifically designed to measure depression in adolescents. Furthermore, although the psychometric properties of the BDI-II have been extensively researched, most of these evaluations focused on clinical and adult populations and thus may not be as applicable to adolescents.
In conclusion, the RADS appears to be the most appropriate instrument for measuring depression in adolescent populations.

