Psychological Assessment

Preliminary Validation of the Rating of Outcome Scale and Equivalence of Ultra-Brief Measures of Well-Being

Jason A. Seidel, William P. Andrews, Jesse Owen, Scott D. Miller, and Daniel L. Buccino

Online First Publication, April 21, 2016.

CITATION: Seidel, J. A., Andrews, W. P., Owen, J., Miller, S. D., & Buccino, D. L. (2016, April 21). Preliminary validation of the Rating of Outcome Scale and equivalence of ultra-brief measures of well-being. Psychological Assessment. Advance online publication.

Psychological Assessment, 2016, Vol. 28, No. 4. © 2016 American Psychological Association.

Preliminary Validation of the Rating of Outcome Scale and Equivalence of Ultra-Brief Measures of Well-Being

Jason A. Seidel, Colorado Center for Clinical Excellence, Denver, Colorado; William P. Andrews, Pragmatic Research Network, Manyother Ltd., Bangor, Wales; Jesse Owen, University of Denver; Scott D. Miller, International Center for Clinical Excellence, Chicago, Illinois; Daniel L. Buccino, Johns Hopkins Bayview Medical Center, Baltimore, Maryland

Abstract: Three brief psychotherapy outcome measures were assessed for equivalence. The Rating of Outcome Scale (ROS), a 3-item patient-reported outcome measure, was evaluated for interitem consistency, test-retest reliability, discriminant validity, repeatability, sensitivity to change, and agreement with the Outcome Rating Scale (ORS) and Outcome Questionnaire (OQ) in 1 clinical sample and 3 community samples. Clinical cutoffs, reliable change indices, and Bland-Altman repeatability coefficients were calculated. Week-to-week change on each instrument was compared via repeated-measures-corrected effect size. Community-normed T scores and Bland-Altman plots were generated to aid comparisons between instruments. The ROS showed good psychometric properties, sensitivity to change in treatment, and discrimination between outpatients and nonpatients. Agreement between the ROS and ORS was good, but neither the agreement between these two instruments nor that between the ultrabrief instruments and the OQ was as strong as correlations might suggest. The ROS showed incremental advantages over the ORS: better concordance with the OQ, better absolute reliability, and less oversensitivity to change. The ROS had high patient acceptance and usability, and its scores showed good reliability, cross-instrument validity, and responsiveness to change for the routine monitoring of clinical outcomes.
Keywords: outcome measure, rating scale, reliable change, Bland-Altman, effect size

Author Note: Jason A. Seidel, Colorado Center for Clinical Excellence, Denver, Colorado; William P. Andrews, Pragmatic Research Network, Manyother Ltd., Bangor, Wales; Jesse Owen, Counseling Psychology Department, University of Denver; Scott D. Miller, International Center for Clinical Excellence, Chicago, Illinois; Daniel L. Buccino, Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, Maryland. Research was performed at the Colorado Center for Clinical Excellence. The authors wish to thank Bob Bertolino, PhD, for his invaluable help in the pilot phase of this project. Jason Seidel is a developer of the Rating of Outcome Scale. William Andrews is a developer of the Pragmatic Tracker software platform for the Rating of Outcome Scale, Outcome Rating Scale, CORE-10, and other outcome measures. Scott Miller developed and is the co-owner of the copyright for the Outcome and Session Rating Scales. Individual practitioners may obtain a free, lifetime license to use the scales by registering at products/performancemetrics-licenses-for-the-ors-and-srs; researchers may obtain permission to use the ORS by writing to info@scottdmiller.com. Correspondence concerning this article should be addressed to Jason Seidel, Colorado Center for Clinical Excellence, 1720 South Bellaire Street, Suite 204, Denver, CO. E-mail: jseidel@thecoloradocenter.com

Clinicians should integrate the best available research evidence with clinical expertise while monitoring patient progress and adjusting practices accordingly (APA Presidential Task Force on Evidence-Based Practice, 2006, pp. 273, 276; see also National Quality Forum, 2013). To aid clinicians in this endeavor, several brief patient-reported outcome measures (PROMs) are available at low or no cost for monitoring patient progress in psychotherapy. However, such measures are not yet in widespread use (Harmon et al., 2007; Swift, Greenberg, Whipple, & Kominiak, 2012; Tarescavage & Ben-Porath, 2014). Many clinicians have reported concerns about time burden and difficulty interpreting scores from outcome measures (Tarescavage & Ben-Porath, 2014), with most clients and clinicians not tolerating even 5 min of outcomes assessment for session-by-session measurement (Australian Mental Health Outcomes and Classification Network, 2005; Miller, Duncan, Brown, Sparks, & Claud, 2003). Given the resource constraints of typical clinical settings, practitioners and clients would benefit from very brief, psychometrically sound, and more interpretable PROMs.

PROMs such as the Outcome Questionnaire 45.2 (OQ; Lambert et al., 2004), Outcome Rating Scale (ORS; Miller, Duncan, Brown, Sparks, & Claud, 2003; Miller & Duncan, 2004), Clinical Outcomes in Routine Evaluation (CORE) system (including the CORE-OM and CORE-10; Connell & Barkham, 2007), and Behavioral Health Measure (Kopta & Lowry, 2002) were designed to assess changes in general psychological functioning over the course of therapy, and their psychometric properties have been supported in dozens of studies (e.g., A. Campbell & Hemsley, 2009; Connell & Barkham, 2007; Janse, Boezen-Hilberdink, van Dijk, Verbraak, & Hutschemaekers, 2014; Kopta & Lowry, 2002; Lambert et al., 2004; Reese, Norsworthy, & Rowlands, 2009). However, little attention has been paid to the agreement between measures. Most researchers in the field have relied on correlation coefficients to assess concurrent validity and test-retest reliability rather than score agreement and repeatability. Correlations are inappropriate for this purpose because they measure the strength of association (relative reliability), not agreement or absolute reliability (Bland & Altman, 1986; Vaz, Falkmer, Passmore, Parsons, & Andreou, 2013). Correlations can be weak when scores agree strongly, and correlations can be strong while scores show poor agreement. What clinicians need to know is whether the scores from one instrument are equivalent to the scores from another instrument, not just whether (as with height and weight) they are strongly associated while being quite different. Bland-Altman plots (Bland & Altman, 1986) are used frequently in medical research, and to a lesser extent in psychological research, to show agreement between two methods for measuring a dependent variable. While some instrument comparisons (e.g., digital vs. mercury thermometers) are likely to have both precision (i.e., repeatable or similar, not merely correlated, measurements) and high agreement with other instruments, comparisons between more complex measurement methods and variables (e.g., treatment response in leukemia; Müller et al., 2009) are likely to show limited repeatability and agreement.
Along these lines, it is common for patient reports of subjective experiences such as pain to yield low test-retest repeatability and intermethod agreement (DeLoach, Higgins, Caplan, & Stiff, 1998; Lima, Barreto, & Assunção, 2012). Given the relative lack of precision in PROMs that measure complex subjective experiences, it is especially important to begin to quantify, and if possible improve on, the repeatability of scores and the agreement between ostensibly similar measures. PROMs in routine practice vary in length (from 4 to over 60 items). The ORS is an ultrabrief measure (four items) developed with the intention of balancing brevity with validity, based on the 45-item OQ. While the ORS has been used widely, concerns include (a) the poor repeatability of visual analogue scales for subjective experiences, even when taken only 1 min apart, and greater interrater agreement with a Likert-type response format (Bijur, Silver, & Gallagher, 2001; DeLoach et al., 1998; Wuyts, De Bodt, & Van de Heyning, 1999); (b) no published analyses of agreement or repeatability for the ORS; and (c) no published analyses comparing the ORS's sensitivity to change with that of other instruments. Clearly, there is room for more than one ultrabrief measure for tracking client outcomes in therapy; but more important, further research is needed on the comparability of ultrabrief measures, given their relatively higher rates of acceptance by clients and clinicians. Accordingly, the current study tested a new three-item measure called the Rating of Outcome Scale (ROS; see the Appendix; Seidel, 2011) in clinical and community samples and compared it with the ORS and OQ.

Method

Participants

One clinical sample and three community samples were administered combinations of measures in counterbalanced order, either once or approximately 1 week apart. The clinical sample received all three measures at each session for up to three sessions.
The Community1 sample received a single administration of the ROS and OQ; the Community2 sample received a single administration of all three measures; and the Community3 sample received three administrations of the ROS and ORS.

Clinical sample. The clinical sample was drawn from a private outpatient psychotherapy practice with multiple clinicians. Adult clients who came for at least one session of psychotherapy in their first treatment episode at the practice (n = 279) were informed that the practitioners routinely collected client data as an integral part of Feedback-Informed Treatment (FIT; see Bertolino & Miller, 2013). Eighty-six of the clients were administered the three measures in each session for up to three sessions. Of these 86, 78 clients attended two sessions and completed all measures, and 73 attended three sessions and completed all measures. The remaining 193 clients received the three measures only in the first session. Of the 279 participants, 89% identified as White or Caucasian, and 51.3% were female. The median education of the sample was 16 years (range = 8 to 26), approximately 26% of the clients in the sample sought reimbursement from their insurance or a health spending account, and approximately 11% paid a reduced fee or no fee. Per the general policy of the practice, Diagnostic and Statistical Manual of Mental Disorders and International Statistical Classification of Diseases and Related Health Problems diagnoses were rarely assigned in this general outpatient practice. Clients' own descriptions of their presenting problems (primarily from free-response questions on an intake questionnaire) were used to classify them for this study. Among the top five presenting problem areas, 62.1% of clients reported some kind of relational problem at intake; 27.2% reported stress, anxiety, or trauma concerns; 20.7% reported problems with mood or depression; 12.4% reported identity/role concerns or confidence issues; and 8.3% reported anger problems.
While 58.6% reported only one problem area at intake, 27.8% reported two problem areas, and 13.6% reported three or more major categories of distress at intake.

Three community samples. The first community sample (Community1; n = 708) was drawn from U.S.-based members of an online labor-market community (Amazon.com's Mechanical Turk, or MTurk) consisting of a pool of over 2 million workers who complete brief tasks in exchange for very small payments. The sample was not selected or screened for nonclinical levels of distress. Like the other community samples, participants were asked about past and current psychotherapy or psychiatric medication treatment; 9.3% reported past counseling or psychotherapy, psychiatric medication, or both, and (in contrast to the other two community samples) current users of psychotherapy or medication were excluded from participating. Increasingly, social science researchers have used MTurk as a source for research participants, finding that MTurk samples are more demographically representative and diverse than

most samples of convenience, although they tend to skew young, male, and politically liberal (Berinsky, Huber, & Lenz, 2012; Richey & Taylor, 2012). Demographically, 62.7% were male, 76.3% identified as White or Caucasian, 9.5% identified as Asian, 5.6% identified as Black or African American, 5.1% identified as Hispanic or Latino, and 3.5% reported mixed race/ethnicity or other. All 708 completed the ROS, and 417 of the 708 completed the OQ.

A second community sample (Community2; n = 102) was obtained from the adult customers of a full-service car wash. Community2 participants completed a counterbalanced administration of the ROS, ORS, and OQ. Of the 102 participants, 101 provided complete ORS and ROS data, and 99 completed all three measures. Of the 101 participants, 53.5% were female, 82.4% identified as White or Caucasian, 5.9% identified as mixed or multiple races or ethnicities, and 5.0% identified as Latino or Hispanic. Notably, 12.9% of the Community2 sample reported that they were currently receiving psychotherapy, psychiatric medication, or both; however, excluding them had no appreciable impact on the descriptive statistics, so they were included in the sample.

A third community sample (Community3; n = 116) was drawn from a social and professional network centered in Ireland and the United Kingdom. Originally, 162 people consented to participate, with 116 completing the ROS and ORS in counterbalanced order at all three time points in a Web-based survey at 1-week intervals (actual days between administrations: M = 7.53, SD = 1.93). All 116 identified as White, Caucasian, or European, and 54.3% were female. In the Community3 sample, 7.8% reported that they were currently receiving psychotherapy and/or psychiatric medication; because excluding them had no appreciable impact on the descriptive statistics, they were included in the sample.

Measures

Rating of Outcome Scale (ROS).
The ROS is a three-item measure of well-being using an 11-point Likert-type response format for each item (item range: 0 to 10; total score range: 0 to 30). The scale takes less than 30 s to administer and 2 to 4 s to score by hand. Clients are presented with the measure at the start of each session and instructed to fill in one of the circled numbers from 0 to 10 for each of the three items based on how they have been feeling in the past week. Instructions and anchors show that a score of 0 indicates the worst they have felt and a score of 10 indicates the best they have felt. The three ROS items are modeled on the OQ's three subscales: internal well-being, relational well-being, and task-functioning well-being. However, the ROS items pertain to degree of distress rather than frequency of distressing experiences as in the OQ. The first ROS item (corresponding to the OQ's Symptom Distress subscale) asks, "How do I feel inside, how are my gut feelings?" The wording is in line with the OQ's many felt-sensation-oriented items in this subscale (e.g., "I tire quickly," "I feel weak," and "I have an upset stomach"). The second ROS item, "How are my relationships with people I care about?" was constructed following feedback from the first author's clients using the ORS that a broader array of relationships were of importance to clients' well-being than those specified in the ORS item "Interpersonally: (Family, close relationships)." The ROS item corresponds well with Interpersonal Relations subscale items on the OQ that involve general relational needs (e.g., "I get along well with others" and "I feel loved and wanted"). The third ROS item, "How am I doing with tasks (work, school, home)?" is more similar to the OQ items about social role (e.g., "I feel stressed at work/school" and "I find my work/school satisfying") than to the ORS item "Socially: (Work, School, Friendships)."
The ROS item was broadened to include home to accommodate those who work from home, are stay-at-home parents, or are retired or otherwise not employed but still have task functions in which they want to engage effectively to experience higher levels of well-being.

Outcome Rating Scale (ORS). The ORS (Miller et al., 2003) is an ultrabrief four-item well-being measure using a 100-mm visual analogue scale response format for each item, and is scored by summing the distances of the four marks from their respective zero points (item range: 0 to 10 cm; total score range: 0 to 40 cm). It takes less than 1 min to administer and is scored by hand using a ruler and calculator (computer-based administration and scoring are available through several publishers). Clients make a mark on each line corresponding to how they have been doing in the past week, with marks to the left indicating low levels and marks to the right indicating high levels.

Outcome Questionnaire-45.2 (OQ). The OQ consists of 45 items in a 5-point Likert-type response format (item range: 0 to 4; total score range: 0 to 180; Lambert et al., 2004). The OQ is a frequency-of-distress scale; therefore, higher scores indicate higher distress (i.e., greater frequency of negative experiences, lower frequency of positive experiences) rather than greater well-being. It takes approximately 5 to 10 min to administer and 3 to 6 min to score by hand. There are three subscales: Symptom Distress, Interpersonal Relations, and Social Role. Subscale items along with interpolated items are summed for each subtotal, and the three subtotals are summed for a total score.

Statistical Analyses, Assumptions, and Benchmarks

To standardize scores between instruments with different ranges and variances, T scores were calculated for the ROS, ORS, and OQ, based on available community-sample means and standard deviations.
Sources for the ROS were the Community1, Community2, and Community3 samples in this study; sources for the ORS were the Community2, Community3, Miller et al. (2003), and Bringhurst et al. (2006) samples; sources for the OQ were Community1, Community2, the first four samples listed in Table 1 of Lambert et al. (2004), Miller et al. (2003), and Bringhurst et al. (2006). Equivalence between community samples was tested via independent t tests (two-tailed). Bland-Altman 1-week repeatability (test-retest) plots of ROS and ORS T scores were generated for Community3, along with coefficients of repeatability (CR). The CR is conceptually similar to the reliable change index (RCI; see below) and is a measure of "the smallest possible change... that represents true/real change... [and] accounts for both random and systematic error" (Vaz et al., 2013, p. 6). CR was calculated using both the Bland and Altman (1986) and Vaz et al. (2013) methods. In addition to the repeatability plots, Bland-Altman agreement plots between Session-1 T scores on the ROS, ORS, and OQ were generated for the clinical sample.
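The T-score standardization described above is straightforward to reproduce. The following is an illustrative sketch (not the authors' code), using the averaged community norms reported later in the Results (ROS: M = 22.31, SD = 4.55; ORS: M = 29.60, SD = 6.80; OQ: M = 46.98, SD = 19.40):

```python
# Illustrative sketch of community-normed T-score conversion (M = 50, SD = 10).
# Norms below are the averaged community-sample values reported in the Results.
COMMUNITY_NORMS = {
    "ROS": (22.31, 4.55),
    "ORS": (29.60, 6.80),
    "OQ": (46.98, 19.40),
}

def to_t_score(raw, instrument, reverse=False):
    """Convert a raw total score to a T score.

    The OQ is reverse scored (higher raw scores = more distress), so its z
    score is negated (reverse=True) so that higher T scores always indicate
    greater well-being, as in the paper.
    """
    mean, sd = COMMUNITY_NORMS[instrument]
    z = (raw - mean) / sd
    if reverse:
        z = -z
    return 50 + 10 * z

# A ROS total equal to the community mean maps to T = 50:
print(to_t_score(22.31, "ROS"))  # prints 50.0
```

The `reverse` flag handles the directionality adjustment the authors describe for the OQ.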

Correlational analysis (including Steiger's Z test) between concordant and discordant items was used to explore discriminant validity. Internal consistency was assessed with Cronbach's alpha. Interitem correlation coefficients (Pearson's r) were calculated to explore the association between ROS and ORS items and the three OQ subscales. Sensitivity to change was assessed through t tests and effect size corrected for repeated measures (ES_RMC; Seidel, Miller, & Chow, 2014). Test-retest reliability (Time 1 to Time 2 in the Community3 sample, Pearson's r) was used to calculate RCIs (Jacobson & Truax, 1991). T-score conversions of RCIs were calculated by dividing each instrument's RCI by its respective raw-score standard deviation and multiplying by 10. Raw-score clinical cutoffs for the ROS, ORS, and OQ were established via Jacobson and Truax's (1991) criterion c, using the clinical sample means and standard deviations (N = 279) and the unweighted mean of all available nonclinical sample means and standard deviations (ROS: Community1, Community2, and Community3; ORS: Community2, Community3, Miller et al., 2003, and Bringhurst et al., 2006; OQ: Community1, Community2, Table 1 in the work of Lambert et al., 2004, Miller et al., 2003, and Bringhurst et al., 2006). A common T-score-based clinical cutoff was set after adjusting for the directionality of measures (i.e., higher T scores corresponding with greater well-being). Finally, previously published clinical cutoffs for the ORS and OQ (i.e., 25 and 63.5, respectively) were compared with the common T-score cutoff for the ROS, ORS, and OQ. Discordance between measures in categorizing members of the clinical sample (n = 279) as distressed or nondistressed was assessed at Time 1 using McNemar's test with Yates's correction for continuity (0.5). Instrument agreement also was assessed with Cohen's kappa (κ).
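The two Jacobson and Truax (1991) computations named above, the RCI critical value and the criterion c cutoff, can be sketched as follows. This is an illustration of the standard formulas with made-up inputs, not the study's data:

```python
import math

def rci_critical_value(sd, rxx, z=1.96):
    """Jacobson & Truax (1991) RCI critical value: the smallest raw-score
    change deemed reliable at p < .05 (two-tailed).

    sd: standard deviation of Time 1 scores; rxx: test-retest reliability.
    """
    se = sd * math.sqrt(1 - rxx)           # standard error of measurement
    s_diff = math.sqrt(2 * se ** 2)        # standard error of the difference
    return z * s_diff

def criterion_c_cutoff(m_clin, sd_clin, m_norm, sd_norm):
    """Jacobson & Truax criterion c: the score at which membership in the
    clinical versus nonclinical distribution is equally likely (weighted
    midpoint of the two means)."""
    return (sd_clin * m_norm + sd_norm * m_clin) / (sd_clin + sd_norm)
```

For example, with a hypothetical SD of 10 and test-retest reliability of .50, `rci_critical_value(10, 0.5)` returns 19.6 points.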
Results

T-score means and SDs of total scores for the clinical and community samples are shown in Table 1. T scores for the ROS, ORS, and OQ were based on averaged unweighted community-sample means (22.31, 29.60, and 46.98, respectively) and standard deviations (4.55, 6.80, and 19.40, respectively). The community samples differed from one another, with Community1 most distressed (though still about 1 SD better than the clinical sample on the ROS and about 0.5 SD better than the clinical sample on the OQ; see Table 1) and Community2 least distressed, expressed in significant differences (all ps < .01) on all possible independent t tests between the samples on the ROS, ORS, and OQ. Community1 and Community2 differed from one another by almost 1 SD on both the ROS (t = 7.173, p < .001, d = 0.91) and OQ (t = 5.716, p < .001, d = 0.83). Community1 and Community3 differed from one another by 0.4 SD on the ROS (t = 3.342, p = .002, d = 0.40). Community2 and Community3 differed from one another by about 0.5 SD on both the ROS (t = 4.116, p < .001, d = 0.56) and ORS (t = 4.162, p < .001, d = 0.57).

Table 1: T Score Means and Differences. (Columns: sample, n, and M and SD at Tests 1 through 3, with t and p, for the ROS, ORS, OQ, and between-instruments comparisons; numeric entries not preserved in this transcription.) Note. Paired-sample, two-tailed t tests were conducted between Time 1 and Time 3. The distressed subsample from Community3 included only participants who reported no current psychotherapy or medication. Between-instruments t tests were concurrent at Session 1. ROS = Rating of Outcome Scale; ORS = Outcome Rating Scale; OQ = Outcome Questionnaire 45.2; T1-T3 = cohort with three administrations.

Cronbach's alpha and test-retest reliabilities are reported for the ROS and ORS in Table 2. Interitem correlations are shown in Table 3. The ROS alpha levels were good, given only three items (alpha increases as a function of the number of items and item redundancy). The mean interitem correlation was .48 (range = .39 to .60) for the clinical sample and .61 (range = .54 to .65) for the community sample. The ROS and ORS had similar test-retest reliabilities.

Bland-Altman 1-week repeatability plots for ROS and ORS T scores in Community3 were assessed for bias, following the method suggested by Bland and Altman (1986). There was minimal bias between test and retest for either the ROS or the ORS (calculated by subtracting Time-2 T scores from Time-1 T scores and determining the mean difference: M_diff = 0.5; Table 2 and Figure 1). However, as expected, the limits of agreement (LoA) between test and retest were somewhat broad in this community sample, yielding average CRs for the ROS and ORS of 13.7 and 17.0 (i.e., 95% of retest scores falling within 1.37 and 1.70 SD of test scores), respectively (see Table 2). Bland-Altman agreement plots between clinical-sample T scores from the ROS, ORS, and OQ at Session 1 were broad as well, often exceeding 1 SD of difference between simultaneous scores on different instruments (see Figure 2), with the 95% LoA ranging from 1.5 to over 2 SD (see Table 2). Bias between the ROS and ORS was virtually nil, with 49% of the difference scores (i.e., ORS T score minus ROS T score) being negative (M_diff = 0.03). Between concurrent ROS and ORS administrations, 97.1% of the clinical sample was within 20 T-score points (2 SD) of difference, 84.6% was within 10 T-score points of difference, and 57.3% was within 5 points of difference.
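The Bland-Altman quantities reported here (bias, 95% limits of agreement, and the coefficient of repeatability) follow directly from the paired differences. The following is an illustrative sketch of the standard Bland and Altman (1986) formulas, not the authors' code:

```python
import math

def bland_altman(x, y):
    """Return (bias, 95% limits of agreement, coefficient of repeatability)
    for two paired score lists (e.g., Time-1 and Time-2 T scores).

    bias: mean of the paired differences.
    loa:  bias +/- 1.96 * SD of the differences (Bland & Altman, 1986).
    cr:   1.96 * sqrt(mean squared difference), the Bland & Altman form of
          the coefficient of repeatability.
    """
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    bias = sum(diffs) / n
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    loa = (bias - 1.96 * sd, bias + 1.96 * sd)
    cr = 1.96 * math.sqrt(sum(d ** 2 for d in diffs) / n)
    return bias, loa, cr
```

With real data one would also plot the differences against the pairwise means, as in Figures 1 and 2, to check that the differences do not vary with score level.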
Bland and Altman's (1986) method of bias adjustment was used on the OQ scores: the average M_diff of 3.61 from the between-instruments ROS-OQ and ORS-OQ T scores (see Table 2) was subtracted from the OQ T scores. After adjusting for bias, 39.4% of the sample was within 5 points of difference (M_diff = 0.15) between the OQ and ROS, and 39.4% of the sample was within 5 points of difference (M_diff = 0.11) between the OQ and ORS. The results of t tests and the LoA between measures at Session 1 are provided in Tables 1 and 2, respectively. Clinical-sample ROS and ORS scores were (as a group) quite similar to each other, but both instruments yielded significantly lower well-being T scores at Session 1 than the OQ did (via t test; see Table 1).

Table 2: Reliability, Agreement, and Change Indices for T Scores. (Columns: sample, n, α, r_xx, M_diff (bias)^a, SD_diff^a, ES_RMC^b, 95% LoA lower and upper bounds with 95% CIs, CR, and RCI; numeric entries not preserved in this transcription.) Note. The distressed subsample from Community3 included only participants who reported no current psychotherapy or medication. α = Cronbach's alpha; r_xx = test-retest correlation; SD_diff is between subjects; ES_RMC = repeated-measures-corrected effect size; LoA LB = limits of agreement, lower bound; LoA UB = limits of agreement, upper bound; CI = confidence interval; CR = coefficient of repeatability (formulas from Bland & Altman, 1986; Vaz et al., 2013); RCI = reliable change index; ROS = Rating of Outcome Scale; ORS = Outcome Rating Scale; OQ = Outcome Questionnaire 45.2; T1-T3 = cohort with three administrations. a = Time 1 to Time 2. b = Time 1 to Time 3. c = concurrent reliability.

Figure 1. Bland-Altman repeatability plots for the ROS and ORS (Community3 sample). Lines represent the 95% confidence interval of the limits of agreement. ROS = Rating of Outcome Scale; ORS = Outcome Rating Scale.

Table 3: Correlations Between Well-Being Components at Time 1. (Pearson's r between the OQ subscales and the ROS and ORS items.) Note. Community1 and Community2 combined (values in italics; n = 516) and Community2 (n = 99) sample correlations are above the diagonals; clinical sample (n = 279) correlations are below the diagonals. Corresponding items and subscales are in bold type. Pearson's r correlations between the OQ and the other instruments are negative due to reverse scoring of the OQ (higher OQ scores indicate distress rather than well-being). OQ = Outcome Questionnaire 45.2; SD = Symptom Distress subscale; IR = Interpersonal Relations subscale; SR = Social Role subscale; ROS = Rating of Outcome Scale; ORS = Outcome Rating Scale; ROS 1 = ROS item 1. Two-tailed, all significant at p < .01.

Discriminant validity for an ultrabrief scale of three items cannot be assessed through traditional methods (e.g., Campbell & Fiske, 1959). However, preliminary support for the ROS as a measure of psychological well-being distinct from a measure of physical well-being (a related but different construct) was provided by correlational analysis between the ROS total score and a selection of three OQ items in the clinical (n = 279) and Community1 (n = 413) samples. Two psychological well-being items from the OQ were reported by Lambert et al. (2004) as discriminating well between clients and controls in terms of change from Time 1 to Time 2 ("I feel fearful" and "I feel blue"). One physical well-being item ("I have sore muscles") was reported by Lambert et al. (2004) as not discriminating well between clients and controls. Correlations between the OQ items and the ROS total score were .39 and .53 for the concordant items ("I feel fearful" and "I feel blue") and .19 for the discordant item ("I have sore muscles") in the clinical sample.
Steiger's Z was 2.74 (df = 274, p < .01) between the correlation of the ROS with each OQ concordant item versus the correlation of the ROS with the OQ discordant item. Correlations between the OQ items and the ROS total score were .60 and .68 for the concordant items and .21 for the discordant item in the Community1 sample. Steiger's Z was 7.34 (df = 411, p < .01) between the correlation of the ROS with each OQ concordant item versus the correlation of the ROS with the OQ discordant item. These significant differences provide some support for the ROS as a measure of psychological well-being. To illustrate the association between elements of the well-being construct on the different measures, correlations between ROS and ORS items and OQ subscales for both the community and clinical samples are presented in Table 3. Correlations between similar constructs (e.g., ROS item 1 and OQ Symptom Distress) are in bold type, with the ROS and OQ items yielding correlation coefficients between .50 and .60 in the clinical sample and between .65 and .74 in the community sample. Correlations between less closely corresponding aspects of well-being (e.g., task-oriented and relationship-oriented items) were generally, but not consistently, lower.

The repeated-measures cohort (n = 73) of the clinical sample reported about 0.4 SD more distress on the ROS and ORS at intake (Session 1) than on the OQ. By the third session, these differences between measures had diminished. However, despite the convergence between the measures by Session 3, Figure 3 shows the effect of the Session-1 difference on repeated-measures-corrected pre-post estimates of change (ES_RMC) from Session 1 to Sessions 2 and 3. The Session 1 to Session 3 ES_RMC values for the ROS, ORS, and OQ were 0.75, 0.83, and 0.44, respectively (Figure 3 and Table 2).
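The Steiger's Z comparisons above test two dependent correlations that share one variable (the ROS total). The following sketches Steiger's (1980) Z1* statistic with a pooled estimate of the correlation between the two correlated correlations; it is written from the standard published formula and has not been verified against the authors' computations:

```python
import math

def steigers_z(r12, r13, r23, n):
    """Steiger's (1980) Z1* for comparing r12 vs. r13, two correlations that
    share variable 1, given r23 (the correlation between variables 2 and 3)
    and sample size n. Uses Fisher z transforms and a pooled covariance term.
    """
    z12, z13 = math.atanh(r12), math.atanh(r13)
    rbar = (r12 + r13) / 2
    # Pooled covariance of the two transformed correlations (psi / (1 - rbar^2)^2)
    psi = r23 * (1 - 2 * rbar ** 2) - 0.5 * rbar ** 2 * (1 - 2 * rbar ** 2 - r23 ** 2)
    c = psi / (1 - rbar ** 2) ** 2
    return (z12 - z13) * math.sqrt((n - 3) / (2 - 2 * c))
```

When the two correlations are equal, the statistic is zero; it grows with the gap between them and with n.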

Figure 2. Bland-Altman agreement plots for the ROS, ORS, and OQ (clinical sample). Lines represent the 95% confidence interval of the limits of agreement. ROS = Rating of Outcome Scale; ORS = Outcome Rating Scale; OQ = Outcome Questionnaire.

The results of paired t tests between Sessions 1 and 3 were similarly significant for the three measures (see Table 1). Figure 3 shows relatively little change on the ROS and ORS for Community3 from Time 1 to Times 2 and 3; however, the distressed (but untreated) subsample of Community3 showed considerable clinical change (ES_RMC = 0.48 and 0.59, respectively) by Time 3.

RCIs were calculated as critical values based on Jacobson and Truax's (1991) method. That is, the standard error of the difference (S_diff), based on test-retest reliability and the standard deviation of Time 1 scores averaged across community samples, was multiplied by 1.96 (i.e., p < .05, two-tailed). RCIs (i.e., critical values) for the ROS, ORS, and OQ were 6.6, 10.7, and 18.6 points, respectively; T-score RCIs were 14.5, 15.7, and 9.6. These newly calculated RCIs reduced the variability in how these instruments measured clinical change and also reduced the proportion of people classified as changed (see Figure 4).

Raw-score clinical cutoffs for the ROS, ORS, and OQ's distressed-range scores were calculated as 19.4, 25.2, and 56.6, respectively. At these cutoff scores, an average of 20.5% of the clinical sample scored in the nondistressed range on both of two instruments at Session 1 (21.5% on both the ORS and OQ, κ = .49; 19.4% on the ROS and OQ, κ = .44; and 20.8% on the ORS and ROS, κ = .55). As a cross-method check on the raw-score cutoff for the ROS, a receiver operating characteristic (ROC) analysis was undertaken using the clinical sample (N = 279) versus all community samples (N = 925). The area under the curve (AUC) was .75 (p < .001; 95% confidence interval [CI] [.72, .78]).
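The kappa coefficients above quantify chance-corrected agreement between two instruments' distressed/nondistressed classifications. A minimal sketch of Cohen's kappa, with toy data rather than the study's classifications:

```python
def cohens_kappa(a, b):
    """Cohen's kappa for two equal-length lists of categorical labels
    (e.g., 0 = nondistressed, 1 = distressed on two instruments).

    po: observed proportion of agreement; pe: agreement expected by chance
    from each rater's marginal label frequencies. Assumes pe < 1 (i.e., at
    least two labels occur).
    """
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n
    labels = set(a) | set(b)
    pe = sum((a.count(l) / n) * (b.count(l) / n) for l in labels)
    return (po - pe) / (1 - pe)
```

Agreement at exactly chance level yields kappa = 0, and perfect agreement yields kappa = 1, which is why values in the .44 to .55 range indicate only moderate classification agreement despite similar marginal proportions.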
The ROC cutoff for the ROS was 19.5 at the point of balance between sensitivity and specificity (.71 and .66, respectively). Using all available community samples for the ORS (n = 217) and OQ (n = 516), the raw scores yielded ROC-based cutoffs of 26.5 and 58.5, respectively. Jacobson and Truax's (1991) criterion c formula was applied to the T scores (with high scores corresponding to greater well-being), yielding cutoff values within 0.5 points of 44.0 for each measure. The T-score value of 44 categorized a similar proportion as nondistressed (namely, 19.7% of the clinical sample scoring in the nondistressed range on both of two instruments at Session 1: 20.4% on ORS and OQ; 19.7% on ROS and OQ; and 19.0% on ORS and ROS). In comparison, a T-score cutoff of 43 led to an average of 15.9% scoring as nondistressed on both of two instruments, and a T-score cutoff of 45 led to an average of 22.3% of the sample being characterized as nondistressed. The cutoff based on an approximately 20% nondistressed subsample thus established a common best-fit cutoff T score of 44 for clinical distress. As a cross-method check on the T-score cutoff for the ROS, a ROC analysis was performed using the clinical sample (n = 279) versus all community samples (n = 925), leading to the same results as the raw scores: The AUC was .75 (p < .001; 95% CI [.72, .78]), and the ROC cutoff for the ROS was 44 at the point of balance between sensitivity and specificity (.71 and .66, respectively). Using all available community samples for the ORS (n = 217) and OQ (n = 516) yielded ROC-based T-score cutoffs of 45.5 and 44.1, respectively. Using the T-score RCIs and clinical cutoff resulted in 70% agreement between the OQ and ORS and 71% agreement between the OQ and ROS in reliable change classifications. For clinically significant change classifications, the OQ and ORS agreed in 74% of cases, as did the OQ and ROS.
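The two Jacobson and Truax (1991) quantities used above, the reliable change critical value and the criterion c cutoff, reduce to a few lines of arithmetic. The sketch below is illustrative only: the sample statistics passed in are hypothetical placeholders, not the study's actual means, SDs, or reliabilities.

```python
import math

def rci_critical_value(sd_time1: float, r_test_retest: float, z: float = 1.96) -> float:
    """Jacobson & Truax (1991) reliable change critical value.

    The standard error of measurement is sd * sqrt(1 - r); the standard
    error of the difference (S_diff) is sqrt(2) times that. Multiplying
    by z = 1.96 gives the two-tailed p < .05 critical value.
    """
    s_e = sd_time1 * math.sqrt(1.0 - r_test_retest)
    s_diff = math.sqrt(2.0) * s_e
    return z * s_diff

def criterion_c(mean_clin: float, sd_clin: float,
                mean_comm: float, sd_comm: float) -> float:
    """Jacobson & Truax criterion c: the SD-weighted midpoint between the
    clinical and community (functional) score distributions."""
    return (sd_clin * mean_comm + sd_comm * mean_clin) / (sd_clin + sd_comm)

# Hypothetical illustration (values are NOT from the study):
print(round(rci_critical_value(sd_time1=8.0, r_test_retest=0.80), 2))        # 9.92
print(round(criterion_c(mean_clin=40.0, sd_clin=10.0,
                        mean_comm=50.0, sd_comm=5.0), 2))                    # 46.67
```

Note that, per the article's recommendation, the r plugged into the RCI here is the test-retest coefficient from an untreated community sample rather than Cronbach's alpha.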
In contrast, previously published RCIs and cutoffs resulted in 63% agreement between the OQ and ORS for both reliable change and clinically significant change classifications. Using the cutoff score of 44 for the clinical sample (n = 279), McNemar's test showed no discordance between the ROS and ORS T-score-based categorizations of distressed and nondistressed (df = 1, p = .95, .52). However, there was discordance from the OQ T scores on both the ROS (p = .02, .42) and the ORS (p = .01, .46), with both of the ultrabrief measures classifying more people as distressed than the OQ did.

Figure 3. Change in well-being T scores from Time 1 to Times 2 and 3 (clinical and Community3 samples). ROS = Rating of Outcome Scale; ORS = Outcome Rating Scale; OQ = Outcome Questionnaire.

SEIDEL, ANDREWS, OWEN, MILLER, AND BUCCINO

Figure 4. Effect of new RCIs and clinical cutoffs on change classifications in the clinical sample and distressed Community3 subsample (Time 1 to Time 3). ROS = Rating of Outcome Scale; ORS = Outcome Rating Scale; OQ = Outcome Questionnaire; RCI = reliable change index.

Discussion

This study addressed the equivalence of two ultrabrief measures of general well-being (both based on the OQ) for assessing clinical outcomes. Cronbach's alpha and test-retest reliabilities for the new ROS were good, especially for a three-item measure. Mean interitem correlations and ranges were also good (Clark & Watson, 1995). Bland-Altman analysis showed virtually no bias between the ROS and ORS. Parameters for acceptable agreement have not been established for method-comparison studies of well-being, and agreement is likely to be modest given the complexity and variability of a brief, general, psychological well-being construct. Using a rather stringent benchmark of 0.5 SD of difference as agreement for comparative purposes, the 57% agreement between the ultrabrief measures versus the 39% agreement between each of these measures and the OQ exemplifies the degree to which the measures vary (using a very broad benchmark of 2 SD yielded 96% agreement for all three comparisons).
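The Bland-Altman bias and limits of agreement discussed above come from a simple computation on paired scores. A minimal sketch follows; the data and variable names are illustrative, not drawn from the study's analysis.

```python
import statistics

def bland_altman(method_a, method_b, k=1.96):
    """Bland & Altman (1986): bias (mean of paired differences) and the
    95% limits of agreement (bias +/- 1.96 * SD of the differences)."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, (bias - k * sd, bias + k * sd)

# Hypothetical paired scores from two instruments:
bias, (lo, hi) = bland_altman([10, 12, 11, 13], [9, 11, 12, 12])
print(round(bias, 2), round(lo, 2), round(hi, 2))  # 0.5 -1.46 2.46
```

In a Bland-Altman plot, each pair's difference is plotted against the pair's mean, with horizontal lines drawn at the bias and at the two limits of agreement.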
The ROS's sensitivity to change was excellent in a clinical sample that was expected to change and appropriately low in a community sample that was not expected to change. However, both the ROS and ORS Session 1 scores showed more distress than did the OQ scores, indicating that these ultrabrief measures may be more sensitive to change because clients respond in a more distressed way (relative to community norms) prior to treatment. Correlations between the ROS items and OQ subscales compared favorably with those between the ORS and OQ. Discriminant validity for the ROS was explored by comparing the total score of this general well-being scale with two psychologically oriented items and one somatic item on the OQ that showed differential responses to psychotherapy in previous research. The ROS correlated better with the psychological items than with the somatic item. Together, these findings suggest good sensitivity, specificity, and construct validity. RCIs and clinical cutoffs were calculated and compared with previously published benchmarks and methods, and the new methods improved concordance between measures, as illustrated in Figure 4. Results from ROC analyses very closely replicated the Jacobson and Truax-based methods for the raw and T-score-based clinical cutoffs of 19.4 and 44 for the ROS, and showed good diagnostic accuracy based on the AUC. Using Jacobson and Truax's (1991) recommendation of the test-retest reliability coefficient (for a community sample not receiving a clinical intervention), together with an unweighted mean of independent community-sample standard deviations, resulted in a more stringent categorization of reliable change compared with previous methods (e.g., Lambert et al., 2004; Miller et al., 2004). The current method also reduced the variability between the different instruments in how they categorized change, in contrast to previously reported RCIs (which were based on Cronbach's alpha).
In contrast to an RCI of 5 calculated by Miller and Duncan (2004), we obtained a substantially greater raw-score RCI of 10.7 for the ORS. In comparison, Janse et al. (2014) found an RCI of 9 for the ORS in their Dutch sample (no details were provided about their method). Different sample variances and different RCI construction methods may both contribute to the variability in the RCI for the ORS among studies. We recommend Jacobson and Truax's original method based on test-retest reliability (rather than Cronbach's alpha) as the most appropriate basis for determining measurement error. In the context of judging the reliability of scores from an instrument purportedly measuring clinical change, it is most important to account for error in the form of the moment-by-moment fluctuations in well-being that everyone may experience from time to time, such that clinically important fluctuations are those beyond these background levels. Establishing reliability over a relatively short interval (1 week) in an untreated community sample provides the appropriate background level of change for estimating reliable clinical change from these scores. It should be noted that slightly less change was required on the ROS than on the ORS to meet the repeatability coefficient (CR) and RCI thresholds for reliable or meaningful change. Thus, while the ORS appeared to be more sensitive to change (as evidenced by greater effect sizes), the ROS provided a modest improvement in the signal-to-noise ratio between change and error. The new T-score-based clinical cutoff of 44 reduced the differences between scales in the categorization of clients as clinically significantly changed (CSC; i.e., initially distressed clients with change ≥ RCI who crossed the clinical cutoff), although substantial differences remained in CSC classifications between the OQ and the ultrabrief measures. It should be noted, in addition, that much of the gain in agreement in CSC was due to fewer respondents meeting the new RCI and cutoff thresholds.
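The effect sizes compared in this study are repeated-measures-corrected (ES_RMC; Seidel, Miller, & Chow, 2014). One common form of this correction, following Morris and DeShon (2002), divides the mean pre-post difference by the SD of the differences and rescales by the pre-post correlation so the result is on the original-metric scale. The sketch below uses that form with hypothetical score vectors; it may differ in detail from the study's exact computation.

```python
import math

def pearson_r(x, y):
    """Plain Pearson correlation (no external dependencies)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def es_rmc(pre, post):
    """Repeated-measures effect size rescaled to the original metric:
    d = (mean difference / SD of differences) * sqrt(2 * (1 - r)),
    where r is the pre-post correlation (Morris & DeShon, 2002)."""
    n = len(pre)
    diffs = [a - b for a, b in zip(pre, post)]
    m = sum(diffs) / n
    sd = math.sqrt(sum((d - m) ** 2 for d in diffs) / (n - 1))
    r = pearson_r(pre, post)
    return (m / sd) * math.sqrt(2.0 * (1.0 - r))

# Hypothetical pre/post scores (higher = more distress):
print(round(es_rmc([30, 28, 26, 24], [20, 22, 18, 16]), 2))  # 3.1
```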
Despite the improvements in agreement among the three measures using these new methods, significant discordance remained between the ultrabrief measures and the OQ in classifying people as distressed or not distressed at Session 1. Finally, the degree to which distressed community members showed significant improvement from test to retest deserves comment. While at first this might raise concern about the degree to which change in a clinical sample should be attributed to treatment (if community members can change so dramatically without it), the measures do not differentiate between people in terms of how long they have suffered prior to assessment. People generally do not seek treatment for transitory disturbances in their well-being, but rather when their efforts to improve it have not been effective. Therefore, the improvement of the distressed community subsample is not comparable to that which might be seen in an untreated or waitlisted clinical sample.

The current study is limited in several ways that need to be addressed in future research. The variation among independently sampled communities of adults shows the importance of recruiting a variety of independent samples, using different sampling methods, to better represent the community norms against which one wishes to benchmark well-being and change. Community samples are not the same as nonclinical samples: All communities contain a proportion of clinically distressed individuals who may or may not be in treatment, and normative samples may not collect data about current treatment (e.g., Lambert et al., 2004). In the current study, the community samples were questioned about current treatment; however, the handling of these data was not uniform across samples (i.e., Community1 excluded current treatment recipients, while the other two community samples included these participants). While highly practical, using only ultrabrief outcome instruments in clinical practice limits the information practitioners and clients can use to make informed clinical decisions and judge clinical change. One way to balance the desire for more detailed information with the feasibility of ultrabrief session-by-session measurement is to use longer instruments intermittently: for example, at every fifth session.
However, the apparently greater sensitivity of the simultaneously administered ultrabrief instruments relative to the OQ leaves open a different question: To what extent are different instruments that purport to measure general subjective well-being interchangeable? It is unclear, for example, whether the apparently greater sensitivity of the ultrabrief instruments to distress at intake was the result of the length of these instruments or the nature of their items (which, while based on the OQ subscales, differed from the longer OQ in the generality of the item wording, the greater range of their item-response formats, and the focus on degree of distress rather than frequency of experience). These questions clearly require a greater research focus on the cross-validation of instruments used to measure psychotherapy outcomes. Other limitations are the use of a single clinical sample for estimating the relative sensitivity to change among instruments, and the lack of diagnostic categorization of clients or standardized administration or exclusion of medication (though it should be noted that distress levels at intake were comparable with those reported in clinical samples from previous research; e.g., Anker, Duncan, & Sparks, 2009; Bauer, Lambert, & Nielsen, 2004; Miller et al., 2003; Okiishi et al., 2006; Reese et al., 2009). Further research in diverse settings will clarify the extent to which the current findings can be generalized to other client samples. A related limitation was the small number of therapists in the sample, which precluded nested analyses that might further clarify the relationship between specific instruments, therapist factors (such as the way individual therapists administer the instruments), and measured change. The first item of the ROS ("How have I felt inside; how have my gut feelings been?") implies both interoceptive and intuitive appraisals of experience.
These related processes have been shown to influence motivation, awareness, and interpersonal behavior (Dunn et al., 2010; Farb et al., 2015; Garfinkel & Critchley, 2013). Yet the wording of the item may not direct clients clearly enough to focus their attention on interoception. Further development of this item (particularly in the process of translation to other languages and cultures) is warranted. Finally, this study examined the degree of change over only three administrations. It is possible that differences between measures might attenuate or intensify over a longer course of treatment, and that different conclusions might be drawn if longer-term data were collected.

There has been increasing focus on greater accountability from clinicians in monitoring clients' treatment progress, and there is compelling evidence that formally monitoring progress can improve clinical outcomes. However, the length of most measures and the lack of a clear framework for making sense of outcome data are hurdles preventing greater utilization. For clinician-researchers who use PROMs in FIT or other forms of practice-based evidence, the ROS appears to be a good instrument to consider, and it adheres to guidelines set forth by the National Quality Forum: It measures important, face-valid aspects of well-being to which clients relate; it shows promising levels of reliability and validity; it is extremely feasible to administer; and it demonstrates appropriate responsiveness to clinical change without being overly reactive. By using rigorous but easy-to-calculate common benchmarks for interinstrument crosswalks, its data appear to be comparable (with some caveats) to two of the most broadly available outcome measures currently in use: the ORS and the OQ. More importantly, methods for assessing equivalence between instruments require continued exploration to increase the validity of score interpretations from ultrabrief PROMs.

References

Anker, M. G., Duncan, B. L., & Sparks, J. A. (2009). Using client feedback to improve couple therapy outcomes: A randomized clinical trial in a naturalistic setting. Journal of Consulting and Clinical Psychology, 77.

APA Presidential Task Force on Evidence-Based Practice. (2006). Evidence-based practice in psychology. American Psychologist, 61.

Australian Mental Health Outcomes and Classification Network. (2005). Adult national outcomes & casemix collection standard reports (1st ed., version 1.1). Brisbane, Australia: Author.

Barkham, M., Bewick, B. M., Mullin, T., Gilbody, S., Connell, J., Cahill, J., ... Evans, C. (2013). The CORE-10: A short measure of psychological distress for routine use in the psychological therapies. Counselling & Psychotherapy Research, 13.

Bauer, S., Lambert, M. J., & Nielsen, S. L. (2004). Clinical significance methods: A comparison of statistical techniques. Journal of Personality Assessment, 82.

Berinsky, A. J., Huber, G. A., & Lenz, G. S. (2012). Evaluating online labor markets for experimental research: Amazon.com's Mechanical Turk. Political Analysis, 20.

Bertolino, B., & Miller, S. D. (Eds.). (2013). ICCE manuals on feedback-informed treatment (Vols. 1-6). Chicago, IL: ICCE Press.

Bijur, P. E., Silver, W., & Gallagher, E. J. (2001). Reliability of the visual analog scale for measurement of acute pain. Academic Emergency Medicine, 8.

Bland, J. M., & Altman, D. G. (1986). Statistical methods for assessing agreement between two methods of clinical measurement. Lancet, 327.

Bringhurst, D. L., Watson, C. S., Miller, S. D., & Duncan, B. L. (2006). The reliability and validity of the Outcome Rating Scale: A replication study of a brief clinical measure. Journal of Brief Therapy, 5.

Campbell, A., & Hemsley, S. (2009). Outcome Rating Scale and Session Rating Scale in psychological practice: Clinical utility of ultra-brief measures. Clinical Psychologist, 13.

Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56.

Clark, L. A., & Watson, D. (1995). Constructing validity: Basic issues in objective scale development. Psychological Assessment, 7.

Connell, J., & Barkham, M. (2007). CORE-10 user manual (version 1.1). Rugby, UK: CORE System Trust and CORE Information Management Systems Ltd.

DeLoach, L. J., Higgins, M. S., Caplan, A. B., & Stiff, J. L. (1998). The visual analog scale in the immediate postoperative period: Intrasubject variability and correlation with a numeric scale. Anesthesia and Analgesia, 86.

Dunn, B. D., Galton, H. C., Morgan, R., Evans, D., Oliver, C., Meyer, M., ... Dalgleish, T. (2010). Listening to your heart: How interoception shapes emotion experience and intuitive decision making. Psychological Science, 21.

Farb, N., Daubenmier, J., Price, C. J., Gard, T., Kerr, C., Dunn, B. D., ... Mehling, W. E. (2015). Interoception, contemplative practice, and health. Frontiers in Psychology, 6.

Garfinkel, S. N., & Critchley, H. D. (2013). Interoception, emotion and brain: New insights link internal physiology to social behaviour. Commentary on: "Anterior insular cortex mediates bodily sensibility and social anxiety" by Terasawa et al. (2012). Social Cognitive and Affective Neuroscience, 8.

Harmon, S. C., Lambert, M. J., Smart, D. M., Hawkins, E., Nielsen, S. L., Slade, K., & Lutz, W. (2007). Enhancing outcome for potential treatment failures: Therapist-client feedback and clinical support tools. Psychotherapy Research, 17.

Jacobson, N. S., & Truax, P. (1991). Clinical significance: A statistical approach to defining meaningful change in psychotherapy research. Journal of Consulting and Clinical Psychology, 59.

Janse, P., Boezen-Hilberdink, L., van Dijk, M. K., Verbraak, M. J. P. M., & Hutschemaekers, G. J. M. (2014). Measuring feedback from clients: The psychometric properties of the Dutch Outcome Rating Scale and Session Rating Scale. European Journal of Psychological Assessment, 30.

Kopta, S. M., & Lowry, J. L. (2002). Psychometric evaluation of the Behavioral Health Questionnaire-20: A brief instrument for assessing global mental health and the three phases of psychotherapy outcome. Psychotherapy Research, 12.

Lambert, M. J., Morton, J. J., Hatfield, D., Harmon, C., Hamilton, S., & Reid, R. C. (2004). Administration and scoring manual for the Outcome Questionnaire. Salt Lake City, UT: OQ Measures.

Lima, E. P., Barreto, S. M., & Assunção, A. A. (2012). Factor structure, internal consistency and reliability of the Posttraumatic Stress Disorder Checklist (PCL): An exploratory study. Trends in Psychiatry and Psychotherapy, 34.

Miller, S. D., & Duncan, B. L. (2004). The Outcome and Session Rating Scales: Administration and scoring manual. Chicago, IL: Institute for the Study of Therapeutic Change.

Miller, S. D., Duncan, B. L., Brown, J., Sparks, J. A., & Claud, D. A. (2003). The Outcome Rating Scale: A preliminary study of the reliability, validity, and feasibility of a brief visual analog measure. Journal of Brief Therapy, 2.

Müller, M. C., Cross, N. C. P., Erben, P., Schenk, T., Hanfstein, B., Ernst, T., ... Hochhaus, A. (2009). Harmonization of molecular monitoring of CML therapy in Europe. Leukemia, 23.

National Quality Forum. (2013). Patient reported outcomes (PROs) in performance measurement. Washington, DC: Author. Retrieved March 16, 2015, from Patient-Reported_Outcomes_in_Performance_Measurement.aspx

Okiishi, J. C., Lambert, M. J., Eggett, D., Nielsen, L., Dayton, D. D., & Vermeersch, D. A. (2006). An analysis of therapist treatment effects: Toward providing feedback to individual therapists on their clients' psychotherapy outcome. Journal of Clinical Psychology, 62.

Reese, R. J., Norsworthy, L. A., & Rowlands, S. R. (2009). Does a continuous feedback system improve psychotherapy outcome? Psychotherapy: Theory, Research, Practice, Training, 46.

Richey, S., & Taylor, B. (2012, December 19). How representative are Amazon Mechanical Turk workers? Retrieved May 25, 2014, from

Seidel, J. A. (2011). Rating of Outcome and Session Experience Scales (ROSES) (version 2.0). Denver, CO: Author. Available online at coloradopsychology.com

Seidel, J. A., Miller, S. D., & Chow, D. L. (2014). Effect size calculations for the clinician: Methods and comparability. Psychotherapy Research, 24.

Swift, J. K., Greenberg, R. P., Whipple, J. L., & Kominiak, N. (2012). Practice recommendations for reducing premature termination in therapy. Professional Psychology: Research and Practice, 43.

Tarescavage, A. M., & Ben-Porath, Y. S. (2014). Psychotherapeutic outcomes measures: A critical review for practitioners. Journal of Clinical Psychology, 70.

Vaz, S., Falkmer, T., Passmore, A. E., Parsons, R., & Andreou, P. (2013). The case for using the repeatability coefficient when calculating test-retest reliability. PLoS ONE, 8(9).

Wuyts, F. L., De Bodt, M. S., & Van de Heyning, P. H. (1999). Is the reliability of a visual analog scale higher than an ordinal scale? An experiment with the GRBAS scale for the perceptual evaluation of dysphonia. Journal of Voice, 13.

(Appendix follows)


More information

Test review. Comprehensive Trail Making Test (CTMT) By Cecil R. Reynolds. Austin, Texas: PRO-ED, Inc., Test description

Test review. Comprehensive Trail Making Test (CTMT) By Cecil R. Reynolds. Austin, Texas: PRO-ED, Inc., Test description Archives of Clinical Neuropsychology 19 (2004) 703 708 Test review Comprehensive Trail Making Test (CTMT) By Cecil R. Reynolds. Austin, Texas: PRO-ED, Inc., 2002 1. Test description The Trail Making Test

More information

Comparability Study of Online and Paper and Pencil Tests Using Modified Internally and Externally Matched Criteria

Comparability Study of Online and Paper and Pencil Tests Using Modified Internally and Externally Matched Criteria Comparability Study of Online and Paper and Pencil Tests Using Modified Internally and Externally Matched Criteria Thakur Karkee Measurement Incorporated Dong-In Kim CTB/McGraw-Hill Kevin Fatica CTB/McGraw-Hill

More information

Responsiveness in psychotherapy research and practice. William B. Stiles Jyväskylä, Finland, February 2018

Responsiveness in psychotherapy research and practice. William B. Stiles Jyväskylä, Finland, February 2018 Responsiveness in psychotherapy research and practice William B. Stiles Jyväskylä, Finland, February 2018 Responsiveness Behavior influenced by emerging context (i.e. by new events),

More information

Validity and reliability of a 36-item problemrelated distress screening tool in a community sample of 319 cancer survivors

Validity and reliability of a 36-item problemrelated distress screening tool in a community sample of 319 cancer survivors Validity and reliability of a 36-item problemrelated distress screening tool in a community sample of 319 cancer survivors Melissa Miller 1, Joanne Buzaglo 1, Kasey Dougherty 1, Vicki Kennedy 1, Julie

More information

PÄIVI KARHU THE THEORY OF MEASUREMENT

PÄIVI KARHU THE THEORY OF MEASUREMENT PÄIVI KARHU THE THEORY OF MEASUREMENT AGENDA 1. Quality of Measurement a) Validity Definition and Types of validity Assessment of validity Threats of Validity b) Reliability True Score Theory Definition

More information

Commentary on Beutler et al. s Common, Specific, and Treatment Fit Variables in Psychotherapy Outcome

Commentary on Beutler et al. s Common, Specific, and Treatment Fit Variables in Psychotherapy Outcome Commentary on Beutler et al. s Common, Specific, and Treatment Fit Variables in Psychotherapy Outcome Kenneth N. Levy Pennsylvania State University In this commentary, the author highlights innovative

More information

TELEPSYCHOLOGY FOR THE PSYCHOLOGIST IN PRIVATE PRACTICE

TELEPSYCHOLOGY FOR THE PSYCHOLOGIST IN PRIVATE PRACTICE TELEPSYCHOLOGY FOR THE PSYCHOLOGIST IN PRIVATE PRACTICE Dr. Madalina Sucala Icahn School of Medicine at Mount Sinai Department of Oncological Sciences OUTLINE Video conferencing as a delivery method Assessment

More information

Multi-Dimensional Family Therapy. Full Service Partnership Outcomes Report

Multi-Dimensional Family Therapy. Full Service Partnership Outcomes Report MHSA Multi-Dimensional Family Therapy Full Service Partnership Outcomes Report TABLE OF CONTENTS Enrollment 5 Discontinuance 5 Demographics 6-7 Length of Stay 8 Outcomes 9-11 Youth Outcome Questionnaire

More information

DESIGN TYPE AND LEVEL OF EVIDENCE: Randomized controlled trial, Level I

DESIGN TYPE AND LEVEL OF EVIDENCE: Randomized controlled trial, Level I CRITICALLY APPRAISED PAPER (CAP) Hasan, A. A., Callaghan, P., & Lymn, J. S. (2015). Evaluation of the impact of a psychoeducational intervention for people diagnosed with schizophrenia and their primary

More information

Hubley Depression Scale for Older Adults (HDS-OA): Reliability, Validity, and a Comparison to the Geriatric Depression Scale

Hubley Depression Scale for Older Adults (HDS-OA): Reliability, Validity, and a Comparison to the Geriatric Depression Scale The University of British Columbia Hubley Depression Scale for Older Adults (HDS-OA): Reliability, Validity, and a Comparison to the Geriatric Depression Scale Sherrie L. Myers & Anita M. Hubley University

More information

Cognitive-Behavioral Assessment of Depression: Clinical Validation of the Automatic Thoughts Questionnaire

Cognitive-Behavioral Assessment of Depression: Clinical Validation of the Automatic Thoughts Questionnaire Journal of Consulting and Clinical Psychology 1983, Vol. 51, No. 5, 721-725 Copyright 1983 by the American Psychological Association, Inc. Cognitive-Behavioral Assessment of Depression: Clinical Validation

More information

Examining the Psychometric Properties of The McQuaig Occupational Test

Examining the Psychometric Properties of The McQuaig Occupational Test Examining the Psychometric Properties of The McQuaig Occupational Test Prepared for: The McQuaig Institute of Executive Development Ltd., Toronto, Canada Prepared by: Henryk Krajewski, Ph.D., Senior Consultant,

More information

Florida State University Libraries

Florida State University Libraries Florida State University Libraries 2016 The Effect of Brief Staff-Assisted Career Service Delivery on Drop-In Clients Debra S. Osborn, Seth W. Hayden, Gary W. Peterson and James P. Sampson Follow this

More information

The reliability and validity of the Index of Complexity, Outcome and Need for determining treatment need in Dutch orthodontic practice

The reliability and validity of the Index of Complexity, Outcome and Need for determining treatment need in Dutch orthodontic practice European Journal of Orthodontics 28 (2006) 58 64 doi:10.1093/ejo/cji085 Advance Access publication 8 November 2005 The Author 2005. Published by Oxford University Press on behalf of the European Orthodontics

More information

CRITICALLY APPRAISED PAPER (CAP) FOCUSED QUESTION

CRITICALLY APPRAISED PAPER (CAP) FOCUSED QUESTION CRITICALLY APPRAISED PAPER (CAP) FOCUSED QUESTION Does occupational therapy have a role in assisting adolescents and young adults with difficulties that arise due to psychological illnesses as a part of

More information

Confirmatory Factor Analysis of the BCSSE Scales

Confirmatory Factor Analysis of the BCSSE Scales Confirmatory Factor Analysis of the BCSSE Scales Justin Paulsen, ABD James Cole, PhD January 2019 Indiana University Center for Postsecondary Research 1900 East 10th Street, Suite 419 Bloomington, Indiana

More information

The Leeds Reliable Change Indicator

The Leeds Reliable Change Indicator The Leeds Reliable Change Indicator Simple Excel (tm) applications for the analysis of individual patient and group data Stephen Morley and Clare Dowzer University of Leeds Cite as: Morley, S., Dowzer,

More information

Development of a self-reported Chronic Respiratory Questionnaire (CRQ-SR)

Development of a self-reported Chronic Respiratory Questionnaire (CRQ-SR) 954 Department of Respiratory Medicine, University Hospitals of Leicester, Glenfield Hospital, Leicester LE3 9QP, UK J E A Williams S J Singh L Sewell M D L Morgan Department of Clinical Epidemiology and

More information

ACDI. An Inventory of Scientific Findings. (ACDI, ACDI-Corrections Version and ACDI-Corrections Version II) Provided by:

ACDI. An Inventory of Scientific Findings. (ACDI, ACDI-Corrections Version and ACDI-Corrections Version II) Provided by: + ACDI An Inventory of Scientific Findings (ACDI, ACDI-Corrections Version and ACDI-Corrections Version II) Provided by: Behavior Data Systems, Ltd. P.O. Box 44256 Phoenix, Arizona 85064-4256 Telephone:

More information

Improving clinicians' attitudes toward providing feedback on routine outcome assessments

Improving clinicians' attitudes toward providing feedback on routine outcome assessments University of Wollongong Research Online Faculty of Health and Behavioural Sciences - Papers (Archive) Faculty of Science, Medicine and Health 2009 Improving clinicians' attitudes toward providing feedback

More information

alternate-form reliability The degree to which two or more versions of the same test correlate with one another. In clinical studies in which a given function is going to be tested more than once over

More information

Feedback Informed Treatment: David Nylund, LCSW, PhD Alex Filippelli, BSW

Feedback Informed Treatment: David Nylund, LCSW, PhD Alex Filippelli, BSW Feedback Informed Treatment: David Nylund, LCSW, PhD Alex Filippelli, BSW www.centerforclinicalexcellence.com http://twitter.com/scott_dm http://www.linkedin.com/in/scottdmphd Worldwide Trends in Behavioral

More information

The Youth Experience Survey 2.0: Instrument Revisions and Validity Testing* David M. Hansen 1 University of Illinois, Urbana-Champaign

The Youth Experience Survey 2.0: Instrument Revisions and Validity Testing* David M. Hansen 1 University of Illinois, Urbana-Champaign The Youth Experience Survey 2.0: Instrument Revisions and Validity Testing* David M. Hansen 1 University of Illinois, Urbana-Champaign Reed Larson 2 University of Illinois, Urbana-Champaign February 28,

More information

PSYCHOMETRIC PROPERTIES OF CLINICAL PERFORMANCE RATINGS

PSYCHOMETRIC PROPERTIES OF CLINICAL PERFORMANCE RATINGS PSYCHOMETRIC PROPERTIES OF CLINICAL PERFORMANCE RATINGS A total of 7931 ratings of 482 third- and fourth-year medical students were gathered over twelve four-week periods. Ratings were made by multiple

More information

Cognitive Therapy for Suicide Prevention

Cognitive Therapy for Suicide Prevention This program description was created for SAMHSA s National Registry for Evidence-based Programs and Practices (NREPP). Please note that SAMHSA has discontinued the NREPP program and these program descriptions

More information

The Discovering Diversity Profile Research Report

The Discovering Diversity Profile Research Report The Discovering Diversity Profile Research Report The Discovering Diversity Profile Research Report Item Number: O-198 1994 by Inscape Publishing, Inc. All rights reserved. Copyright secured in the US

More information

Validity. Ch. 5: Validity. Griggs v. Duke Power - 2. Griggs v. Duke Power (1971)

Validity. Ch. 5: Validity. Griggs v. Duke Power - 2. Griggs v. Duke Power (1971) Ch. 5: Validity Validity History Griggs v. Duke Power Ricci vs. DeStefano Defining Validity Aspects of Validity Face Validity Content Validity Criterion Validity Construct Validity Reliability vs. Validity

More information

Dynamic Deconstructive Psychotherapy

Dynamic Deconstructive Psychotherapy This program description was created for SAMHSA s National Registry for Evidence-based Programs and Practices (NREPP). Please note that SAMHSA has discontinued the NREPP program and these program descriptions

More information

INTEGRATING REALISTIC RESEARCH INTO EVERY DAY PRACTICE

INTEGRATING REALISTIC RESEARCH INTO EVERY DAY PRACTICE INTEGRATING REALISTIC RESEARCH INTO EVERY DAY PRACTICE Professor Nigel Beail Consultant & Professional Lead for Psychological Services. South West Yorkshire Partnership NHS Foundation Trust & Clinical

More information

Convergent Validity of a Single Question with Multiple Classification Options for Depression Screening in Medical Settings

Convergent Validity of a Single Question with Multiple Classification Options for Depression Screening in Medical Settings DOI 10.7603/s40790-014-0001-8 Convergent Validity of a Single Question with Multiple Classification Options for Depression Screening in Medical Settings H. Edward Fouty, Hanny C. Sanchez, Daniel S. Weitzner,

More information

Gezinskenmerken: De constructie van de Vragenlijst Gezinskenmerken (VGK) Klijn, W.J.L.

Gezinskenmerken: De constructie van de Vragenlijst Gezinskenmerken (VGK) Klijn, W.J.L. UvA-DARE (Digital Academic Repository) Gezinskenmerken: De constructie van de Vragenlijst Gezinskenmerken (VGK) Klijn, W.J.L. Link to publication Citation for published version (APA): Klijn, W. J. L. (2013).

More information

Cognitive styles sex the brain, compete neurally, and quantify deficits in autism

Cognitive styles sex the brain, compete neurally, and quantify deficits in autism Cognitive styles sex the brain, compete neurally, and quantify deficits in autism Nigel Goldenfeld 1, Sally Wheelwright 2 and Simon Baron-Cohen 2 1 Department of Applied Mathematics and Theoretical Physics,

More information

Making a psychometric. Dr Benjamin Cowan- Lecture 9

Making a psychometric. Dr Benjamin Cowan- Lecture 9 Making a psychometric Dr Benjamin Cowan- Lecture 9 What this lecture will cover What is a questionnaire? Development of questionnaires Item development Scale options Scale reliability & validity Factor

More information

PTHP 7101 Research 1 Chapter Assignments

PTHP 7101 Research 1 Chapter Assignments PTHP 7101 Research 1 Chapter Assignments INSTRUCTIONS: Go over the questions/pointers pertaining to the chapters and turn in a hard copy of your answers at the beginning of class (on the day that it is

More information

Technical Specifications

Technical Specifications Technical Specifications In order to provide summary information across a set of exercises, all tests must employ some form of scoring models. The most familiar of these scoring models is the one typically

More information

Critical Thinking Assessment at MCC. How are we doing?

Critical Thinking Assessment at MCC. How are we doing? Critical Thinking Assessment at MCC How are we doing? Prepared by Maura McCool, M.S. Office of Research, Evaluation and Assessment Metropolitan Community Colleges Fall 2003 1 General Education Assessment

More information

Reliability and Validity of the Pediatric Quality of Life Inventory Generic Core Scales, Multidimensional Fatigue Scale, and Cancer Module

Reliability and Validity of the Pediatric Quality of Life Inventory Generic Core Scales, Multidimensional Fatigue Scale, and Cancer Module 2090 The PedsQL in Pediatric Cancer Reliability and Validity of the Pediatric Quality of Life Inventory Generic Core Scales, Multidimensional Fatigue Scale, and Cancer Module James W. Varni, Ph.D. 1,2

More information

MBA 605 Business Analytics Don Conant, PhD. GETTING TO THE STANDARD NORMAL DISTRIBUTION

MBA 605 Business Analytics Don Conant, PhD. GETTING TO THE STANDARD NORMAL DISTRIBUTION MBA 605 Business Analytics Don Conant, PhD. GETTING TO THE STANDARD NORMAL DISTRIBUTION Variables In the social sciences data are the observed and/or measured characteristics of individuals and groups

More information

Dose-Response Studies in Psychotherapy

Dose-Response Studies in Psychotherapy Dose-Response Studies in Psychotherapy Howard, K. I., Kopta, S. M., Krause, M. S., & Orlinsky, D. E. (1986). The dose-effect relationship in psychotherapy. The American Psychologist, 41, 159 164. Investigated

More information

The Bilevel Structure of the Outcome Questionnaire 45

The Bilevel Structure of the Outcome Questionnaire 45 Psychological Assessment 2010 American Psychological Association 2010, Vol. 22, No. 2, 350 355 1040-3590/10/$12.00 DOI: 10.1037/a0019187 The Bilevel Structure of the Outcome Questionnaire 45 Jamie L. Bludworth,

More information

DATA GATHERING METHOD

DATA GATHERING METHOD DATA GATHERING METHOD Dr. Sevil Hakimi Msm. PhD. THE NECESSITY OF INSTRUMENTS DEVELOPMENT Good researches in health sciences depends on good measurement. The foundation of all rigorous research designs

More information

ANXIETY A brief guide to the PROMIS Anxiety instruments:

ANXIETY A brief guide to the PROMIS Anxiety instruments: ANXIETY A brief guide to the PROMIS Anxiety instruments: ADULT PEDIATRIC PARENT PROXY PROMIS Pediatric Bank v1.0 Anxiety PROMIS Pediatric Short Form v1.0 - Anxiety 8a PROMIS Item Bank v1.0 Anxiety PROMIS

More information

Instrument equivalence across ethnic groups. Antonio Olmos (MHCD) Susan R. Hutchinson (UNC)

Instrument equivalence across ethnic groups. Antonio Olmos (MHCD) Susan R. Hutchinson (UNC) Instrument equivalence across ethnic groups Antonio Olmos (MHCD) Susan R. Hutchinson (UNC) Overview Instrument Equivalence Measurement Invariance Invariance in Reliability Scores Factorial Invariance Item

More information

Gambling Decision making Assessment Validity

Gambling Decision making Assessment Validity J Gambl Stud (2010) 26:639 644 DOI 10.1007/s10899-010-9189-x ORIGINAL PAPER Comparing the Utility of a Modified Diagnostic Interview for Gambling Severity (DIGS) with the South Oaks Gambling Screen (SOGS)

More information

A Coding System to Measure Elements of Shared Decision Making During Psychiatric Visits

A Coding System to Measure Elements of Shared Decision Making During Psychiatric Visits Measuring Shared Decision Making -- 1 A Coding System to Measure Elements of Shared Decision Making During Psychiatric Visits Michelle P. Salyers, Ph.D. 1481 W. 10 th Street Indianapolis, IN 46202 mpsalyer@iupui.edu

More information

Reliability and validity of the International Spinal Cord Injury Basic Pain Data Set items as self-report measures

Reliability and validity of the International Spinal Cord Injury Basic Pain Data Set items as self-report measures (2010) 48, 230 238 & 2010 International Society All rights reserved 1362-4393/10 $32.00 www.nature.com/sc ORIGINAL ARTICLE Reliability and validity of the International Injury Basic Pain Data Set items

More information

Confirmatory Factor Analysis of Preschool Child Behavior Checklist (CBCL) (1.5 5 yrs.) among Canadian children

Confirmatory Factor Analysis of Preschool Child Behavior Checklist (CBCL) (1.5 5 yrs.) among Canadian children Confirmatory Factor Analysis of Preschool Child Behavior Checklist (CBCL) (1.5 5 yrs.) among Canadian children Dr. KAMALPREET RAKHRA MD MPH PhD(Candidate) No conflict of interest Child Behavioural Check

More information

Prospective Validation of Clinically Important Changes in Pain Severity Measured on a Visual Analog Scale

Prospective Validation of Clinically Important Changes in Pain Severity Measured on a Visual Analog Scale ORIGINAL CONTRIBUTION Prospective Validation of Clinically Important Changes in Pain Severity Measured on a Visual Analog Scale From the Department of Emergency Medicine, Albert Einstein College of Medicine,

More information

Practice-Based Research for the Psychotherapist: Research Instruments & Strategies Robert Elliott University of Strathclyde

Practice-Based Research for the Psychotherapist: Research Instruments & Strategies Robert Elliott University of Strathclyde Practice-Based Research for the Psychotherapist: Research Instruments & Strategies Robert Elliott University of Strathclyde Part 1: Outcome Monitoring: CORE Outcome Measure 1 A Psychological Ruler Example

More information

An International Study of the Reliability and Validity of Leadership/Impact (L/I)

An International Study of the Reliability and Validity of Leadership/Impact (L/I) An International Study of the Reliability and Validity of Leadership/Impact (L/I) Janet L. Szumal, Ph.D. Human Synergistics/Center for Applied Research, Inc. Contents Introduction...3 Overview of L/I...5

More information

Chapter 2. Test Development Procedures: Psychometric Study and Analytical Plan

Chapter 2. Test Development Procedures: Psychometric Study and Analytical Plan Chapter 2 Test Development Procedures: Psychometric Study and Analytical Plan Introduction The Peabody Treatment Progress Battery (PTPB) is designed to provide clinically relevant measures of key mental

More information

School orientation and mobility specialists School psychologists School social workers Speech language pathologists

School orientation and mobility specialists School psychologists School social workers Speech language pathologists 2013-14 Pilot Report Senate Bill 10-191, passed in 2010, restructured the way all licensed personnel in schools are supported and evaluated in Colorado. The ultimate goal is ensuring college and career

More information

Bowen, Alana (2011) The role of disclosure and resilience in response to stress and trauma. PhD thesis, James Cook University.

Bowen, Alana (2011) The role of disclosure and resilience in response to stress and trauma. PhD thesis, James Cook University. ResearchOnline@JCU This file is part of the following reference: Bowen, Alana (2011) The role of disclosure and resilience in response to stress and trauma. PhD thesis, James Cook University. Access to

More information

LANGUAGE TEST RELIABILITY On defining reliability Sources of unreliability Methods of estimating reliability Standard error of measurement Factors

LANGUAGE TEST RELIABILITY On defining reliability Sources of unreliability Methods of estimating reliability Standard error of measurement Factors LANGUAGE TEST RELIABILITY On defining reliability Sources of unreliability Methods of estimating reliability Standard error of measurement Factors affecting reliability ON DEFINING RELIABILITY Non-technical

More information

The Bengali Adaptation of Edinburgh Postnatal Depression Scale

The Bengali Adaptation of Edinburgh Postnatal Depression Scale IOSR Journal of Nursing and Health Science (IOSR-JNHS) e-issn: 2320 1959.p- ISSN: 2320 1940 Volume 4, Issue 2 Ver. I (Mar.-Apr. 2015), PP 12-16 www.iosrjournals.org The Bengali Adaptation of Edinburgh

More information

Administering and Scoring the CSQ Scales

Administering and Scoring the CSQ Scales CSQ Scales 2012 All Rights Reserved Administering and Scoring the CSQ Scales Inquiries: Web: info@csqscales.com www.csqscales.com Copyright of the CSQ Scales Copyright:, Tamalpais Matrix Systems, 35 Miller

More information

11-3. Learning Objectives

11-3. Learning Objectives 11-1 Measurement Learning Objectives 11-3 Understand... The distinction between measuring objects, properties, and indicants of properties. The similarities and differences between the four scale types

More information

Interpreting change on the WAIS-III/WMS-III in clinical samples

Interpreting change on the WAIS-III/WMS-III in clinical samples Archives of Clinical Neuropsychology 16 (2001) 183±191 Interpreting change on the WAIS-III/WMS-III in clinical samples Grant L. Iverson* Department of Psychiatry, University of British Columbia, 2255 Wesbrook

More information

Table S1. Search terms applied to electronic databases. The African Journal Archive African Journals Online. depression OR distress

Table S1. Search terms applied to electronic databases. The African Journal Archive African Journals Online. depression OR distress Supplemental Digital Content to accompany: [authors]. Reliability and validity of depression assessment among persons with HIV in sub-saharan Africa: systematic review and metaanalysis. J Acquir Immune

More information

Validity. Ch. 5: Validity. Griggs v. Duke Power - 2. Griggs v. Duke Power (1971)

Validity. Ch. 5: Validity. Griggs v. Duke Power - 2. Griggs v. Duke Power (1971) Ch. 5: Validity Validity History Griggs v. Duke Power Ricci vs. DeStefano Defining Validity Aspects of Validity Face Validity Content Validity Criterion Validity Construct Validity Reliability vs. Validity

More information

Assessing the Validity and Reliability of the Teacher Keys Effectiveness. System (TKES) and the Leader Keys Effectiveness System (LKES)

Assessing the Validity and Reliability of the Teacher Keys Effectiveness. System (TKES) and the Leader Keys Effectiveness System (LKES) Assessing the Validity and Reliability of the Teacher Keys Effectiveness System (TKES) and the Leader Keys Effectiveness System (LKES) of the Georgia Department of Education Submitted by The Georgia Center

More information

Questionnaire on Anticipated Discrimination (QUAD)(1): is a self-complete measure comprising 14 items

Questionnaire on Anticipated Discrimination (QUAD)(1): is a self-complete measure comprising 14 items Online Supplement Data Supplement for Clement et al. (10.1176/appi.ps.201300448) Details of additional measures included in the analysis Questionnaire on Anticipated Discrimination (QUAD)(1): is a self-complete

More information

Using Measurement-Based Care to Enhance Substance Abuse and Mental Health Treatment

Using Measurement-Based Care to Enhance Substance Abuse and Mental Health Treatment Using Measurement-Based Care to Enhance Substance Abuse and Mental Health Treatment TONMOY SHARMA MBBS, MSC CHIEF EXECUTIVE OFFICER SOVEREIGN HEALTH GROUP The Problem 5 to 10% of adults and 14 to 25% of

More information

The moderating impact of temporal separation on the association between intention and physical activity: a meta-analysis

The moderating impact of temporal separation on the association between intention and physical activity: a meta-analysis PSYCHOLOGY, HEALTH & MEDICINE, 2016 VOL. 21, NO. 5, 625 631 http://dx.doi.org/10.1080/13548506.2015.1080371 The moderating impact of temporal separation on the association between intention and physical

More information

CHAPTER 2 CRITERION VALIDITY OF AN ATTENTION- DEFICIT/HYPERACTIVITY DISORDER (ADHD) SCREENING LIST FOR SCREENING ADHD IN OLDER ADULTS AGED YEARS

CHAPTER 2 CRITERION VALIDITY OF AN ATTENTION- DEFICIT/HYPERACTIVITY DISORDER (ADHD) SCREENING LIST FOR SCREENING ADHD IN OLDER ADULTS AGED YEARS CHAPTER 2 CRITERION VALIDITY OF AN ATTENTION- DEFICIT/HYPERACTIVITY DISORDER (ADHD) SCREENING LIST FOR SCREENING ADHD IN OLDER ADULTS AGED 60 94 YEARS AM. J. GERIATR. PSYCHIATRY. 2013;21(7):631 635 DOI:

More information