A methodological perspective on the analysis of clinical and personality questionnaires Smits, Iris Anna Marije

University of Groningen

Citation for published version (APA): Smits, I. A. M. (2014). A methodological perspective on the analysis of clinical and personality questionnaires. Groningen: s.n.

Document version: Publisher's PDF, also known as Version of record. Publication date: 2014.

Chapter 7

The Strengths and Difficulties Questionnaire (SDQ) in Community and Clinical Populations

Abstract

The Strengths and Difficulties Questionnaire (SDQ) is a popular screening instrument for the detection of social-emotional and behavioral problems in children in community and clinical settings. To sensibly compare the SDQ scores across these settings, the SDQ should measure psychosocial difficulties and strengths in the same way across community and clinical populations, that is, the SDQ should be measurement invariant across both populations. We examined whether measurement invariance of the parent version of the SDQ holds using data from a community sample (N = 707, of age 7 to 13, M = 9.66, SD = 1.42) and a clinical sample (N = 939, of age 2 to 14, M = 7.53, SD = 1.87). The results of the measurement invariance analysis, based on a multigroup confirmatory factor analysis, indicate that measurement invariance of the SDQ parent version across community and clinical populations is tenable. This implies that the SDQ can be sensibly used in both populations and meets its purpose as a screening instrument that can be used in both community and clinical settings.

This chapter has been submitted for publication as: Smits, I. A. M., Theunissen, M. H. C., Reijneveld, S. A., Nauta, M. H., & Timmerman, M. E. The Strengths and Difficulties Questionnaire (SDQ) in community and clinical populations.

7.1 Introduction

Many children suffer from psychosocial problems (see e.g., Costello, Mustillo, Erkanli, Keeler, & Angold, 2003). Early prevention programs targeted at child behavioral problems may prevent or reduce such psychosocial problems and associated mental disorders. It is therefore important to detect children with a high risk for the development of psychosocial problems at an early stage. The Strengths and Difficulties Questionnaire (SDQ) is a popular screening instrument for the detection of such social-emotional and behavioral problems in children (e.g., Stone, Otten, Engels, Vermulst, & Janssens, 2010). The SDQ is a short, freely available questionnaire of psychosocial difficulties and strengths in children and adolescents from 3 to 16 years of age. It was developed with the aim of measuring both children's social-emotional and behavioral problems and their prosocial behavior (Goodman, 1997). The questionnaire consists of 25 items, measuring emotional symptoms, conduct problems, hyperactivity-inattention, peer problems, and prosocial behavior (Goodman, 1997), and is available in a teacher, self-report, and parent version, in more than 60 different languages to date (see www.sdqinfo.com). The SDQ is widely used, both as a screening instrument in community populations and as a clinical assessment instrument in clinical populations, for instance for the evaluation of treatment success (see e.g., Stone et al., 2010).

Although the SDQ is widely used in community and clinical care, the objectives of its use in the two settings are quite different. In community care, the SDQ is used to identify children with potential psychosocial problems. In contrast, in clinical care, children with psychosocial problems have already been identified, and the SDQ is used to delineate the nature of the child's social-emotional and behavioral problems relative to norm groups or to evaluate treatment success.
Thus, the SDQ is used for different reasons in a community versus a clinical population (Stone et al., 2010). To be able to compare the SDQ scores of children across community and clinical settings, it is of key importance that the SDQ measures psychosocial difficulties and strengths in the same way across community and clinical populations. This can be examined by testing whether the SDQ is measurement invariant across community and clinical populations.

A test is measurement invariant if it measures the same construct in the same way across various populations. That is, given the level of the trait that is measured, the expected scores on the test should be population independent (Mellenbergh, 1989). In the case of the SDQ, this implies that if two children suffer to the same extent from a particular psychosocial difficulty, then they should have the same expected score on the associated SDQ subscale, irrespective of the population they stem from. Thus, the items should be equally attractive to affirm for (parents or teachers of) children who suffer equally from psychosocial problems but are from different populations (e.g., from a community or clinical population). Only if a test is measurement invariant can differences in test scores between children from different populations be attributed to differences in the level of the trait (e.g., differences in conduct problems). In contrast, if a test is not measurement invariant, the scores mean something different in each population and cannot be compared directly. Therefore, measurement invariance is a pivotal prerequisite for the sensible interpretation of test scores across populations.

Despite the importance of establishing measurement invariance, studies on the SDQ devoted to this topic are rare. Within the community population, regarding the parent version of the SDQ, support for measurement invariance is found across gender of the child, age group of the child, number of siblings of the child, maternal age group, educational level of the parent, ethnicity of the grandparent, and the survey method used (Hill & Hughes, 2007; Palmieri & Smith, 2007; Stone et al., 2013).
Regarding the teacher version of the SDQ, support for measurement invariance is found across gender of the child, ethnicity of the child and ethnicity of the teacher (d'Acremont & van der Linden, 2008; Hill & Hughes, 2007; Zwirs et al., 2011). Furthermore, across informants, evidence for partial measurement invariance is found (Sanne, Torsheim, Heiervang, & Stormark, 2009). However, to our knowledge, there are no studies on the SDQ devoted to measurement invariance across community and clinical populations. Thus, whether the SDQ is measurement invariant across community and clinical populations is still an unanswered question.

Therefore, the aim of this study is to examine whether measurement invariance of the SDQ holds across community and clinical populations. We examined this for the parent version of the SDQ, since parents are the most studied informants of the SDQ (see e.g., Stone et al., 2010).

7.2 Method

7.2.1 Participants

Clinical sample

Data were collected from parents whose child had been referred to Accare for a variety of mental health issues. Accare is a mental health organization for general child and adolescent psychiatry, with several centers that are mainly outpatient (circa 90%), and serves the Northern part of the Netherlands. Data were collected online at the intake assessment as part of routine outcome monitoring. A total of 939 parents participated in the study. They completed the parent version of the SDQ for their child (674 boys, 265 girls, of age 2 to 14, M = 7.53, SD = 1.87). Cases were excluded from analyses if (1) the child did not belong to the target population of the SDQ (i.e., children younger than 3 or older than 16 years, see www.sdqinfo.com; excluded: 1 case; 0.1%) or (2) fewer than three valid items per subscale were available (a prerequisite as prescribed by the manual of the SDQ, see www.sdqinfo.com; excluded: 3 cases; 0.3%). The resulting data set consists of N = 935 parent reports of children between 3 and 14 years old (M = 7.54, SD = 1.86), with 672 boys and 263 girls.

This data set contained only a limited number of missing data (summary statistics of the number of missing items per parent, across 25 items: M = 0.10, SD = 0.40, min = 0, max = 4). The missing data were imputed using Two-Way imputation with normally distributed Errors (TW-E; van Ginkel, van der Ark, & Sijtsma, 2007), which is a suitable method for handling missing questionnaire scores. In Table 7.1 some demographic characteristics of the sample are presented. In Table 7.2, an overview is given of the primary diagnoses of the children in the sample, which were established by a child psychologist or psychiatrist, based on a regular intake evaluation.
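The imputation step can be illustrated with a simplified sketch of two-way imputation with a normally distributed error term. It is not the code used in this study: the function name `two_way_impute`, the toy data, and the choice to impute across all columns at once (rather than, say, per subscale) are assumptions made for illustration, and details of the published TW-E procedure may differ. The idea is that a missing score is predicted from the person mean plus the item mean minus the overall mean, a random normal error is added, and the result is rounded to a valid response category.

```python
import numpy as np

def two_way_impute(scores, low=0, high=2, seed=0):
    """Fill NaN entries of a persons-by-items matrix (illustrative sketch).

    Expected score = person mean + item mean - overall mean, plus a normal
    error whose variance is estimated from the observed residuals.
    """
    x = np.asarray(scores, dtype=float)
    observed = ~np.isnan(x)
    person_mean = np.nanmean(x, axis=1, keepdims=True)  # mean of observed items per person
    item_mean = np.nanmean(x, axis=0, keepdims=True)    # mean of observed scores per item
    overall_mean = np.nanmean(x)                        # mean of all observed scores
    expected = person_mean + item_mean - overall_mean
    # Residual spread of the observed scores around their two-way expectation.
    residuals = (x - expected)[observed]
    error_sd = np.sqrt(np.sum(residuals ** 2) / (residuals.size - 1))
    rng = np.random.default_rng(seed)
    draws = expected + rng.normal(0.0, error_sd, size=x.shape)
    # Round to the nearest integer and keep within the response scale.
    imputed = np.clip(np.rint(draws), low, high)
    # Observed scores are never altered; only missing cells are filled.
    return np.where(observed, x, imputed)

# Hypothetical toy data on a 0-2 response scale (not SDQ data).
toy = np.array([
    [0, 1, 2, np.nan, 1],
    [2, 2, np.nan, 1, 2],
    [0, 0, 1, 0, 0],
    [1, np.nan, 2, 1, 1],
])
completed = two_way_impute(toy)
print(completed)
```

Because the error term is random, the filled-in values depend on the seed; the observed scores always stay as they were.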

Table 7.1. Demographic Characteristics of the Children in the Clinical and the Community Samples

Characteristic             Category               Clinical Sample, N (%)   Community Sample, N (%)
Gender Child               Boys                   672 (71.9)               349^a (49.4)
                           Girls                  263 (28.1)               356 (50.4)
Household                  Two-parents family     387^b (41.4)             607^c (85.9)
                           Single-parent family   75 (8.0)                 73 (10.3)
                           Other                  11 (1.2)                 9 (1.3)
Native Country Mother      The Netherlands        484^d (51.8)             621^e (87.8)
                           Other                  6 (0.6)                  39 (5.5)
Native Country Father      The Netherlands        479^f (51.2)             610^g (86.3)
                           Other                  9 (1.0)                  46 (6.5)
Educational Level Mother   Lower education        n/a                      234^h (33.1)
                           Medium education       n/a                      255 (36.1)
                           Higher education       n/a                      192 (27.2)
Educational Level Father   Lower education        n/a                      184^i (26.0)
                           Medium education       n/a                      223 (31.5)
                           Higher education       n/a                      231 (32.7)

a Missing: N = 2 (0.3%); b Missing: N = 462 (49.4%); c Missing: N = 18 (2.5%); d Missing: N = 445 (47.6%); e Missing: N = 47 (6.6%); f Missing: N = 447 (47.8%); g Missing: N = 51 (7.2%); h Missing: N = 26 (3.7%); i Missing: N = 69 (9.8%).
Note. n/a = No available information.

Table 7.2. Description of the Primary Diagnosis of the Children within the Clinical Sample

DSM                                                                    Total N (%)
Attention-Deficit/Hyperactivity Disorder                               384 (41.1)
Autism spectrum disorders                                              214 (22.9)
Trauma and stress-related disorders                                    88 (9.4)
Relational problems (V-codes in DSM-IV, including parent-child
  relation problems, bereavement, abuse and neglect)                   60 (6.4)
Anxiety and mood disorders                                             46 (4.9)
Disorder of Infancy, Childhood, or Adolescence NOS                     42 (4.5)
Disruptive, impulse-control, and conduct disorders                     29 (3.1)
Learning Disorder NOS                                                  12 (1.3)
Other                                                                  31 (3.3)
Unknown                                                                29 (3.1)

Note. NOS = Not Otherwise Specified.

Community sample

Data were collected from parents of primary school children in the Netherlands. A total of 707 parents participated in the study. They completed the parent version of the SDQ for their child (349 boys, 356 girls, 2 unknown sex, of age 7 to 13, M = 9.66, SD = 1.42). The sample was representative of the Dutch population, apart from an underrepresentation of parents of children of immigrant origin (see Crone, Vogels, Hoekstra, Treffers, & Reijneveld, 2008; Vogels, Crone, Hoekstra, & Reijneveld, 2009). The same exclusion criteria were applied as in the clinical sample. In the community sample, no cases had to be excluded. The data set contained a limited number of missing data (summary statistics of the number of missing items per parent, across 25 items: M = 0.05, SD = 0.31, min = 0, max = 6). The missing data were imputed using the same procedure as in the clinical sample. In Table 7.1 some demographic characteristics of the sample are presented. For further details of the sample we refer to Vogels et al. (2009).

7.2.2 Instrument

We used the Dutch parent version of the SDQ (van Widenfelt, Goedhart, Treffers, & Goodman, 2003). The SDQ consists of 25 items, each offering a short description of an attribute, evaluating emotional symptoms, conduct problems, hyperactivity-inattention, peer problems and prosocial behavior (Goodman, 1997). Parents rate on a three-point rating scale the degree to which they consider each of the attributes applicable to their child's behavior of the past six months. The SDQ comprises a single strength scale, covering prosocial behavior, and four difficulty scales, covering emotional symptoms, conduct problems, hyperactivity-inattention, and peer problems. An overall difficulty score can be computed from the four difficulty scales.

7.2.3 Analysis

To test whether the SDQ is measurement invariant across community and clinical populations, we tested whether the same measurement model holds for the community and the clinical populations. We evaluated this by carrying out a multigroup confirmatory factor analysis (CFA) for categorical data and examining whether the parameters of the measurement models were equal across the community and the clinical samples (see e.g., Meredith, 1993).

Here, we describe the multigroup CFA model for categorical data. In this model, it is assumed that the observed polytomous response variables stem from a categorization of underlying continuous response variables, via thresholds. If $\mathbf{y}_{ij}$ denotes the $p$-dimensional vector of the observed ordered polytomous scores of subject $i$ in group $j$ on $p$ items, then the observed ordered polytomous variable $\mathbf{y}_{ij}$ is related to the latent continuous variable $\mathbf{y}^{*}_{ij}$ via thresholds $\boldsymbol{\kappa}_{jc}$ as:

$$\mathbf{y}_{ij} = c \quad \text{if} \quad \boldsymbol{\kappa}_{j,c-1} < \mathbf{y}^{*}_{ij} \le \boldsymbol{\kappa}_{jc}, \tag{1}$$

for categories $c = 1, \ldots, C$, with $C$ the number of categories (i.e., $j = 1, 2$, $p = 25$ and $C = 3$ here).
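The threshold relation in Equation (1) can be illustrated with a small numerical sketch. The function name `categorize` and the threshold values are hypothetical, and the sketch assumes the convention that a latent score exactly equal to a threshold falls in the lower category, with the outermost thresholds taken as minus and plus infinity. With C = 3 categories, two finite thresholds per item are needed.

```python
import numpy as np

def categorize(y_star, thresholds):
    """Map latent continuous scores to ordered categories 1..C.

    `thresholds` holds the C - 1 finite, increasing thresholds of one item.
    Category c is assigned when threshold[c-1] < y* <= threshold[c],
    with the outer thresholds taken as -inf and +inf.
    """
    thresholds = np.asarray(thresholds, dtype=float)
    # side="left" counts the thresholds that lie strictly below y*.
    return np.searchsorted(thresholds, y_star, side="left") + 1

# Hypothetical thresholds for a single three-category item.
item_thresholds = [-0.5, 1.0]
latent_scores = np.array([-1.2, -0.5, 0.3, 1.0, 2.4])
observed = categorize(latent_scores, item_thresholds)
print(observed)  # [1 1 2 2 3]
```

Two groups with the same latent score on an item receive the same observed category only if their thresholds coincide, which is why threshold invariance matters for comparing groups.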

For the latent continuous variable $\mathbf{y}^{*}_{ij}$, we specify the following linear factor model:

$$\mathbf{y}^{*}_{ij} = \boldsymbol{\nu}_{j} + \boldsymbol{\Lambda}_{j}\boldsymbol{\eta}_{ij} + \boldsymbol{\varepsilon}_{ij}, \tag{2}$$

where $\boldsymbol{\nu}_{j}$ is a $p$-dimensional vector of item intercepts, $\boldsymbol{\Lambda}_{j}$ is a $(p \times q)$ matrix of factor loadings, $\boldsymbol{\eta}_{ij}$ is a $q$-dimensional vector of common factor scores and $\boldsymbol{\varepsilon}_{ij}$ is a $p$-dimensional vector of residuals. We specify $\boldsymbol{\Lambda}_{j}$ such that each factor represents a single scale (i.e., $q = 5$) and that each factor is only associated with the items pertaining to the scale involved; the latter implies that each item has only a single non-zero loading. Further, it is assumed that $E(\boldsymbol{\varepsilon}_{ij}) = \mathbf{0}$, with $E(\boldsymbol{\varepsilon}_{ij})$ the expected value of the residuals, that the residuals are mutually independent and that $\boldsymbol{\eta}_{ij}$ and $\boldsymbol{\varepsilon}_{ij}$ are independent. The mean and variance of $\mathbf{y}^{*}_{ij}$ are respectively:

$$\boldsymbol{\mu}^{*}_{j} = \boldsymbol{\nu}_{j} + \boldsymbol{\Lambda}_{j}\boldsymbol{\alpha}_{j}, \tag{3}$$

$$\boldsymbol{\Sigma}^{*}_{j} = \boldsymbol{\Lambda}_{j}\boldsymbol{\Psi}_{j}\boldsymbol{\Lambda}_{j}^{\top} + \boldsymbol{\Theta}_{j}, \tag{4}$$

where $\boldsymbol{\alpha}_{j}$ is a $q$-dimensional vector of common factor means, $\boldsymbol{\Psi}_{j}$ is a $(q \times q)$ covariance matrix of the common factors, and $\boldsymbol{\Theta}_{j}$ is a diagonal covariance matrix of residuals.

Measurement invariance (MI) is established by evaluating whether the parameters of the measurement models are equal across groups (i.e., across $j = 1, 2$ here). This can be done by successively imposing equality constraints on the measurement parameters between the community and clinical groups, and subsequently testing the tenability of these equality constraints. All models were fitted in the Mplus program (Muthén & Muthén, 1998-2007), using weighted least squares means and variance adjusted (WLSMV) estimation, a suitable procedure for the categorical CFA (see e.g., Beauducel & Herzberg, 2006). Below, we describe the analysis procedure.

7.2.4 Procedure

First, we fitted the configural invariance model (Model A). In this model, the measurement parameters are allowed to be unequal across groups. To identify the model, in Mplus the item intercepts are fixed at zero in all groups ($\boldsymbol{\nu}_{j} = \mathbf{0}$, $j = 1, 2$), the residual variances of the reference group are fixed to one for all items (yielding an identity matrix, i.e., $\boldsymbol{\Theta}_{1} = \mathbf{I}$, with $j = 1$ the reference group), and the first two thresholds of each variable are constrained to be invariant across groups ($\boldsymbol{\kappa}_{11} = \boldsymbol{\kappa}_{21}$; $\boldsymbol{\kappa}_{12} = \boldsymbol{\kappa}_{22}$). Because the SDQ items have three categorical response options, the latter implies that all thresholds are constrained to be invariant across groups. Furthermore, to identify the common factor distribution (of $\boldsymbol{\eta}_{ij}$), the variances of all $q$ common factors are fixed at one (i.e., $\mathrm{diag}(\boldsymbol{\Psi}_{j}) = \mathbf{1}$, for $j = 1, 2$, with $\mathrm{diag}(\boldsymbol{\Psi}_{j})$ the diagonal elements of $\boldsymbol{\Psi}_{j}$). In addition, for the reference group (i.e., $j = 1$), the mean ($\boldsymbol{\alpha}_{1}$) of the common factors is fixed at 0. The factor mean of the community group ($\boldsymbol{\alpha}_{2}$) is identified by the invariance of the thresholds. For an overview of identification constraints of the factor model for ordered categorical data in the multiple group setting, we refer to Millsap and Yun-Tein (2004).

If the configural invariance model did not fit well, we inspected the modification indices to determine whether there were any correlated residuals that may cause the misfit. The modification index gives an indication of the expected change in the chi-square if a parameter restriction is relaxed. If the modification index suggested that the addition of a correlated residual may improve the model, we added it to the model if it improved the fit of the model substantially and if the residual covariance was interpretable. This would then result in Model A2.

Next, we fitted the strong factorial invariance model (Model B). Starting from Model A2, the factor loadings were constrained to be equal over groups (i.e., $\boldsymbol{\Lambda}_{1} = \boldsymbol{\Lambda}_{2}$). Because the factor variance of the community group ($\mathrm{diag}(\boldsymbol{\Psi}_{2})$) is identified by the invariance of the factor loadings, those factor variances were freely estimated.
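The model-implied mean and covariance of the latent responses in Equations (3) and (4) can be sketched numerically for two groups that share their loadings, as in Model B. The symbol names used here (`nu` for intercepts, `Lambda` for loadings, `alpha` for factor means, `Psi` for the factor covariance matrix, `Theta` for the residual covariance matrix) are an assumed notation, and every numeric value, including the assignment of five consecutive items to each factor, is made up for illustration rather than taken from the fitted models.

```python
import numpy as np

p, q = 25, 5  # 25 items, 5 factors (one per scale)

# Simple structure: each item loads on exactly one factor.
# The block layout and the loading value 0.7 are illustrative only.
Lambda = np.zeros((p, q))
for k in range(q):
    Lambda[5 * k:5 * k + 5, k] = 0.7

nu = np.zeros(p)  # item intercepts fixed at zero in both groups

def implied_moments(nu, Lambda, alpha, Psi, Theta):
    """Return the implied mean vector and covariance matrix of y*."""
    mu = nu + Lambda @ alpha
    Sigma = Lambda @ Psi @ Lambda.T + Theta
    return mu, Sigma

# Reference group: factor means 0, factor variances 1, residual variances 1.
Psi_ref = np.full((q, q), 0.3)
np.fill_diagonal(Psi_ref, 1.0)
mu_ref, Sigma_ref = implied_moments(nu, Lambda, np.zeros(q), Psi_ref, np.eye(p))

# Second group under equal loadings: factor means and variances are free
# (hypothetical values), residual variances are free as well.
Psi_other = 0.8 * Psi_ref
alpha_other = np.full(q, -1.0)
Theta_other = 1.2 * np.eye(p)
mu_other, Sigma_other = implied_moments(nu, Lambda, alpha_other, Psi_other, Theta_other)

print(mu_ref[:3], Sigma_ref[0, 0])
print(mu_other[:3], Sigma_other[0, 0])
```

With equal loadings, any difference between the two implied mean vectors comes from the factor means alone, which is what makes group comparisons on the latent traits interpretable.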
Finally, we fitted the strict invariance model (Model C). Starting from Model B, the residual variances were constrained to be equal across groups (i.e., yielding $\boldsymbol{\Theta}_{1} = \boldsymbol{\Theta}_{2}$).

The model fit was evaluated by considering the root-mean-square error of approximation (RMSEA; Steiger, 1990), the comparative fit index (CFI; Bentler, 1990) and the change in the comparative fit index (ΔCFI; see Cheung & Rensvold, 2002). For completeness, we also reported the chi-square and the chi-square Difftest. We considered the prevailing convention for indicators of acceptable fit of RMSEA ≤ .08 and CFI ≥ .95 (e.g., Schermelleh-Engel, Moosbrugger, & Müller, 2003). As criterion for accepting the hypothesis of invariance, we used a maximal value of difference in CFI between two successive models of 0.01 (Cheung & Rensvold, 2002). If the successive constraints between the models were accompanied by a strong decline in fit, we inspected the modification indices to determine the cause of the misfit.

7.3 Results

Children in the clinical sample scored on average higher on the SDQ difficulty scales than children in the community sample. On average, the children of the clinical sample scored within the clinical range on the four difficulty scales and the Total Difficulty scale (Total Difficulties: M = 18.11, SD = 6.28; Emotional Symptoms: M = 4.38, SD = 2.65; Conduct Problems: M = 3.56, SD = 2.33; Hyperactivity: M = 7.16, SD = 2.61; Peer Problems: M = 3.01, SD = 2.35) and within the normal range on the Prosocial Behaviour scale (Prosocial Behaviour: M = 7.04, SD = 2.28). Children of the community sample scored on average within the normal range on the four difficulty scales, the Total Difficulty scale, and the Prosocial Behaviour scale (Total Difficulties Score: M = 6.61, SD = 5.17; Emotional Symptoms Score: M = 1.71, SD = 1.92; Conduct Problems Score: M = 1.04, SD = 1.39; Hyperactivity Score: M = 2.72, SD = 2.49; Peer Problems Score: M = 1.14, SD = 1.51; Prosocial Behaviour Score: M = 8.62, SD = 1.63).

7.3.1 Measurement Invariance Analyses

The results of the measurement invariance analyses of the parent version of the SDQ across community and clinical groups are presented in Table 7.3. First we fitted the configural invariance model (Model A, see Table 7.3). This model did not fit adequately (RMSEA = .063; CFI = .863). Inspecting the modification indices, we found five covariances between item residuals that improved the fit of the model substantially and were interpretable given the item content.

Table 7.3. Model Fit Results for the Measurement Invariance Analysis of the Parent Version of the SDQ Across Community and Clinical Groups

Model                             χ²         df    χ² Difftest   df Difftest   p value    RMSEA   RMSEA 90% CI   CFI    ΔCFI
A. Configural invariance          2334.178   550                                          .063    [.060, .065]   .863
A2. Configural invariance^a       1823.295   540                                          .054    [.051, .057]   .901
B. Strong factorial invariance    1844.414   560   81.154        20            < 0.005    .053    [.050, .056]   .901   .000
C. Strict factorial invariance    1918.950   590   127.142       30            < 0.005    .052    [.050, .055]   .898   .003

Note. RMSEA = root-mean-square error of approximation; CFI = comparative fit index; ΔCFI = difference in CFI between successive models.
a Five item residual covariances freed.

For instance, one of the correlated residuals concerned the items Q9 "Helpful if someone is hurt, (...)" and Q20 "Often volunteers to help others (...)". The other item pairs were Q10 and Q2, Q22 and Q18, Q24 and Q16, and Q25 and Q15. For each item pair, the items concerned are highly comparable in item content or have similar wording. Allowing these residuals to covary in both groups resulted in an acceptable model fit: the CFI value was still somewhat low, while the RMSEA value indicated a good fit (Model A2: RMSEA = .054; CFI = .901).

Subsequently, we constrained the factor loadings to be equal across groups (Model B, see Table 7.3). The model fit was still acceptable; the restriction of equal factor loadings across groups was not associated with a deterioration in model fit (RMSEA = .053; CFI = .901; ΔCFI = 0.000). Finally, we constrained the residual variances to be fixed to 1 in both groups and thus to be equal across groups, resulting in the strict invariance model (Model C, see Table 7.3). This model showed a very limited drop in model fit, and fitted reasonably well (RMSEA = .052; CFI = .898; ΔCFI = 0.003). These results imply that measurement invariance of the parent version of the SDQ across the community and clinical groups seems tenable.
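The decision rules described in the Procedure can be checked against the reported fit values with a short sketch. The RMSEA and CFI values are those reported in Table 7.3; the dictionary layout and variable names are illustrative choices, and the cutoffs (RMSEA of at most .08, CFI of at least .95, and a CFI drop of at most .01 between successive models) are the conventions cited in the text.

```python
# Fit values as reported in Table 7.3.
fit = {
    "A": {"rmsea": 0.063, "cfi": 0.863},
    "A2": {"rmsea": 0.054, "cfi": 0.901},
    "B": {"rmsea": 0.053, "cfi": 0.901},
    "C": {"rmsea": 0.052, "cfi": 0.898},
}

RMSEA_MAX = 0.08      # acceptable-fit convention for RMSEA
CFI_MIN = 0.95        # acceptable-fit convention for CFI
DELTA_CFI_MAX = 0.01  # maximal CFI drop for accepting invariance

# Absolute fit of each model against the two conventions.
for name, values in fit.items():
    print(name,
          "RMSEA ok:", values["rmsea"] <= RMSEA_MAX,
          "CFI ok:", values["cfi"] >= CFI_MIN)

# Drop in CFI between successive nested models (A2 -> B -> C).
delta_cfi = {}
for previous, current in [("A2", "B"), ("B", "C")]:
    drop = round(fit[previous]["cfi"] - fit[current]["cfi"], 3)
    delta_cfi[current] = drop
    print(f"{previous} -> {current}: drop in CFI = {drop:.3f},",
          "invariance tenable:", drop <= DELTA_CFI_MAX)
```

Run on these values, the sketch shows every model within the RMSEA convention, none reaching the CFI convention, and both successive CFI drops well below .01, which matches the mixed picture of absolute fit and the invariance conclusion described in the text.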

7.4 Discussion

The aim of this study was to examine whether the parent version of the SDQ is measurement invariant across community and clinical populations. Our results suggest that measurement invariance of the parent version of the SDQ across both populations is tenable. This implies that the comparison of SDQ scores of children across community and clinical settings is permitted. This is essential, since children who are assessed with the SDQ in a community setting and those who are assessed in a clinical setting during an intake should get scores that are comparable with each other. This is the first study that evaluates whether the SDQ, which is designed to be used in both community and clinical settings, measures psychosocial problems equally in the two settings.

We note that the demographic characteristics of the community and clinical samples diverged slightly. The children in the clinical sample were relatively younger (Age: M = 7.54, SD = 1.86 versus M = 9.66, SD = 1.42 in the community sample) and the clinical sample contained relatively more boys (72% boys versus 49% boys in the community sample). We do not consider this to be problematic, because earlier studies on the parent version of the SDQ provided support for measurement invariance across gender of the child and age group of the child. Furthermore, in the community sample immigrant children were underrepresented compared to the national population. However, this underrepresentation matches the underrepresentation of immigrant children in the clinical sample. The latter reflects the demographic composition of the Northern part of the Netherlands in which those data were collected. Therefore, it is unlikely that this underrepresentation has affected our findings.

Future research is needed on different community and clinical samples, to confirm that measurement invariance of the SDQ across community and clinical populations holds.
Furthermore, future research should reveal whether measurement invariance of the SDQ across community and clinical populations is also tenable when informants other than the parents of the child are used.

As a limitation we note that the fit values of the models of our measurement invariance analysis were somewhat heterogeneous. Although the RMSEA values indicated a good fit (around .05), the CFI values were somewhat low (around .90). This is in line with CFI values of previous MI studies on the SDQ, which were commonly between .80 and .90 (d'Acremont & van der Linden, 2008; Hill & Hughes, 2007; Sanne et al., 2009) and occasionally between .90 and .95 (Zwirs et al., 2011) or around .95 (Palmieri & Smith, 2007; Stone et al., 2013). We could have improved the fit of the model by adapting the model through item deletion or allowing for more model modifications, but we wanted to adjust the model as little as possible to stay in line with the actual use of the SDQ.

In sum, this study shows that measurement invariance across community and clinical populations seems tenable for the parent version of the SDQ. This implies that the SDQ measures the same construct in the same way for children across the two settings. It suggests that parents interpret the SDQ items equally, whether they are in a community or in a clinical setting. This is fortunate since it is common clinical practice to interpret the scores of a clinical individual relative to norm scores which are based on community samples. The findings of this study give support for the continued use of the parent version of the SDQ in community and clinical settings.