Interpreting change on the WAIS-III/WMS-III in clinical samples


Archives of Clinical Neuropsychology 16 (2001) 183–191

Grant L. Iverson*
Department of Psychiatry, University of British Columbia, 2255 Wesbrook Mall, Vancouver, BC, V6T 2A1, Canada
Neuropsychiatry Units, Riverview Hospital, Canada

Accepted 19 February 2000

Abstract

Clinicians should note that there is considerable variability in the reliabilities of the index and subtest scores derived from the third editions of the Wechsler Adult Intelligence Scale (WAIS-III) and the Wechsler Memory Scale (WMS-III). The purpose of this article is to review these reliabilities and to illustrate how they can be used to interpret change in patients' performances from test to retest. The WAIS-III IQ and Index scores are consistently the most reliable scores, in terms of both internal consistency and test–retest reliability. The most internally consistent WAIS-III subtests are Vocabulary, Information, Digit Span, Matrix Reasoning, and Arithmetic. Information and Vocabulary have the highest test–retest reliability. On the WMS-III, the Auditory Immediate Index, Immediate Memory Index, Auditory Delayed Index, and General Memory Index are the most reliable, in terms of both internal consistency and test–retest reliability. The Logical Memory I and Verbal Paired Associates I subtests are the most reliable. Data from three clinical groups (i.e., Alzheimer's disease, chronic alcohol abuse, and schizophrenia) were extracted from the Technical Manual [Psychological Corporation (1997). WAIS-III/WMS-III Technical Manual. San Antonio: Harcourt Brace] for the purpose of calculating reliable change estimates. A table of confidence intervals for test–retest measurement error is provided to help the clinician determine if patients have reliably improved or deteriorated on follow-up testing.

© 2001 National Academy of Neuropsychology. Published by Elsevier Science Ltd.

Keywords: WAIS-III; WMS-III; Clinical samples

* Department of Psychiatry, University of British Columbia, 2255 Wesbrook Mall, Vancouver, B.C., Canada V6T 2A1. Tel.: +1-604-822-7588; fax: +1-604-822-7756.
0887-6177/01/$ – see front matter © 2001 National Academy of Neuropsychology. PII: S0887-6177(00)00060-3

At this point in time, based on available data, rehabilitation psychologists and neuropsychologists should have the most confidence in the index scores derived from the third editions of the Wechsler Adult Intelligence Scale (WAIS-III) and the Wechsler Memory Scale (WMS-III). With some notable exceptions, many of the subtests, especially those from the WMS-III, have reliabilities that limit their clinical usefulness as independent measures of specific cognitive abilities. This is a common problem; many tests used in clinical neuropsychology suffer from low reliability.

Reviewing test reliabilities should not be considered esoteric, or be relegated to academic discussions of psychometric theory. Rather, these reliabilities have two very practical implications for our day-to-day clinical evaluations. First, do the items that comprise a test systematically measure a unified cognitive ability (e.g., expressive vocabulary)? Second, to what degree can we measure these abilities over time, for the purpose of determining improvement or decline? These fundamental issues are very closely tied to test reliabilities.

The purpose of this article is to discuss some of the practical clinical implications of WAIS-III/WMS-III scale reliabilities. In Section 1, a review of the subtest and index score internal consistency and test–retest reliabilities is provided. Detailed information for interpreting change on the IQ and Index scores in clinical samples is contained in Section 2.

1. Internal consistency and test–retest reliability

There are numerous reliability tables in the WAIS/WMS Technical Manual (Psychological Corporation, 1997). These tables were studied, and some of the most relevant information for graduate students and busy clinicians was distilled.

First and foremost, the WAIS-III IQ and Index scores are consistently the most reliable scores, in terms of both internal consistency and test–retest reliability. The most internally consistent WAIS-III subtests are Vocabulary, Information, Digit Span, Matrix Reasoning, and Arithmetic. Similarities and Block Design also have quite high internal consistency. Information has the highest test–retest reliability, followed by Vocabulary.

On the WMS-III, the Auditory Immediate, Immediate Memory, and General Memory Indexes are the most internally consistent index scores. The Verbal Paired Associates I and the Logical Memory I are the most internally consistent of the primary subtest scores. The uncorrected test–retest reliabilities of the primary subtest scores, with the exception of Verbal Paired Associates I in older adults, all range from 0.58 to 0.79. The following index scores have uncorrected test–retest reliabilities greater than 0.80: Auditory Immediate, Immediate Memory, Auditory Delayed Memory, and General Memory. The test–retest reliability of the Working Memory Index in older adults is 0.80.

The Index and subtest scores were sorted into two groups, "most reliable" and "least reliable". To be classified as "most reliable," the score had to have adequate internal consistency (0.85–0.99) and adequate test–retest reliability (0.75–0.99). To be classified as "least reliable," the score had to have low internal consistency (<0.80) and low test–retest reliability (<0.70), or a test–retest reliability coefficient below 0.60. The classification results are presented in Table 1.
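The two classification rules just stated can be written down as a short decision function. The following Python sketch is only an illustration: the function name, the "unclassified" label for scores that meet neither rule, and the example coefficients are hypothetical and are not taken from the Technical Manual. Passing `None` for internal consistency is an assumed way of handling measures for which no such coefficient is calculated. The sketch also ignores the additional requirement that a majority of the manual's age groups within a band meet the criteria.

```python
from typing import Optional


def classify_reliability(internal_consistency: Optional[float],
                         test_retest: float) -> str:
    """Apply the 'most reliable' / 'least reliable' criteria to one score.

    internal_consistency may be None when no such coefficient exists
    (an assumption of this sketch); the decision then rests on the
    test-retest coefficient alone.
    """
    ic = internal_consistency
    tr = test_retest

    # Least reliable: low internal consistency (<0.80) together with low
    # test-retest reliability (<0.70), or a test-retest value below 0.60.
    if tr < 0.60 or (ic is not None and ic < 0.80 and tr < 0.70):
        return "least reliable"

    # Most reliable: internal consistency 0.85-0.99 (when available)
    # and test-retest reliability 0.75-0.99.
    ic_adequate = ic is None or 0.85 <= ic <= 0.99
    if ic_adequate and 0.75 <= tr <= 0.99:
        return "most reliable"

    # Hypothetical label for scores that satisfy neither rule.
    return "unclassified"


if __name__ == "__main__":
    # Illustrative coefficients only, not values from the manual.
    print(classify_reliability(0.93, 0.91))  # meets both 'most reliable' criteria
    print(classify_reliability(0.70, 0.65))  # low on both coefficients
    print(classify_reliability(0.82, 0.72))  # falls between the two rules
```

Because the "least reliable" rule is checked first and requires a test–retest value below 0.70, no score can satisfy both rules at once.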

The information presented in the preceding text and in Table 1 has four clear implications for day-to-day clinical practice. First, the WAIS-III Index scores and four of the WMS-III Indexes (Auditory Immediate, Immediate Memory, Auditory Delayed, and General Memory) are the most reliable, so the clinician should have the greatest confidence in the precision of these scores. Second, if the psychologist wanted to choose certain subtests on the WAIS-III to describe individually in a report, he or she may decide to select those with high internal consistency, such as Vocabulary, Similarities, Information, Digit Span, Block Design, and Matrix Reasoning (or those from the "most reliable" category). In contrast, the primary subtest scores on the WMS-III do not have high reliability, with the exception of Logical Memory I and Verbal Paired Associates I. Therefore, the clinician should have less confidence in discussing these individual subtest results. Some of the WMS-III supplementary scores have high "decision-consistency" reliabilities, such as Mental Control, Information and Orientation, Faces Percent Retention, Family Pictures Percent Retention, Verbal Paired Associates II Percent Retention (in subjects aged 16–54), Word Lists II Percent Retention (in the elderly), and Visual Reproduction Discrimination. The decision-consistency reliability indicates the consistency of scaled score agreements between test and retest. These reliabilities can increase the clinician's confidence in reporting these individual scores. Third, the clinician should have the least confidence in reporting individual test scores if they fall in the "least reliable" category. Finally, it is important to realize that these reliability estimates are useful for determining whether a person has improved, remained the same, or deteriorated on follow-up testing.

2. Interpreting change on the WAIS-III/WMS-III

Data provided in the Technical Manual (Psychological Corporation, 1997) can be used to interpret change in several clinical samples on the WAIS-III/WMS-III. Some clinicians use clinical judgment to interpret change, whereas others use psychometric data, such as standard errors of measurement (SEMs), combined with clinical judgment. In general, incorporating SEM information for the purpose of estimating change is preferred over clinical judgment alone. However, a limitation of SEMs is that they are used to provide a confidence band around a score at a single point in time. That is, they are most useful for interpreting single test scores. The standard error of the difference (i.e., S_diff) is more appropriate for creating a confidence band relevant to two scores. The S_diff formula includes the SEM from time 1 and time 2. Therefore, it is the standard error of the difference that provides the clinician with an estimate of possible measurement error relating to test–retest scores. The purpose of this section is to provide the clinician with tables to assist with determinations of improvement or decline on the WAIS-III and WMS-III in three clinical samples.

3. Reliable change

A reliable change methodology (e.g., Jacobson & Truax, 1991; Chelune et al., 1993) can be used for assessing whether a retest change in a given variable is reliable and meaningful.

Table 1
WAIS-III/WMS-III reliabilities

Adequate internal consistency (0.85–0.99)
  WAIS-III ages 16–29: VC, AR, DS, IN, BD, MR, VCI, POI, WMI, PSI
  WAIS-III ages 30–54: VC, SI, AR, DS, IN, BD, MR, VCI, POI, WMI, PSI
  WAIS-III ages 55–74: VC, SI, AR, DS, IN, CO, PC, BD, MR, VCI, POI, WMI, PSI
  WAIS-III ages 75–89: VC, SI, AR, DS, IN, LN, MR, VCI, POI, WMI, PSI
  WMS-III ages 16–54: LM I, VPA I, AII, IMI, ADI, GMI, WMI
  WMS-III ages 55–89: LM I, VPA I, AII, IMI, ADI, GMI, WMI

Low internal consistency (<0.80)
  WAIS-III ages 16–29: LN, PA, OA
  WAIS-III ages 30–54: PA, OA
  WAIS-III ages 55–74: LN,
  WAIS-III ages 75–89: BD, PA, OA
  WMS-III ages 16–54: Faces I, LN, LM II, ARDTS, ARDI, LMTH I, WL I, VR I, SPSF, SPSB, LMTH II, WL II Recall, WL II Recog., VR II, VR II Recog., Copy
  WMS-III ages 55–89: Faces I, FP I, SPS, Faces II, ARDTS, ARDI, LMTH I, SPSF, SPSB, LMTH II, Copy

Adequate test–retest reliability (0.75–0.99)
  WAIS-III ages 16–29: VC, AR, DS, IN, CD, BD, VCI, POI, WMI, PSI
  WAIS-III ages 30–54: VC, SI, AR, DS, IN, CO, PC, CD, BD, SS, OA, VCI, POI, WMI, PSI
  WAIS-III ages 55–74: VC, SI, AR, DS, IN, CO, LN, PC, CD, BD, MR, SS, OA, VCI, POI, WMI, PSI
  WAIS-III ages 75–89: VC, SI, AR, IN, CO, PC, CD, BD, SS, VCI, POI, WMI, PSI
  WMS-III ages 16–54: LM I, VPA I, LM II, AII, IMI, ADI, GMI
  WMS-III ages 55–89: LM I, VPA I, LN, VPA II, AII, VII, IMI, ADI, VDI, GMI, WMI

Low test–retest reliability (<0.70)
  WAIS-III ages 16–29: CO, LN, PA, SS, OA
  WAIS-III ages 30–54: MR
  WAIS-III ages 55–74: PA
  WAIS-III ages 75–89: DS, OA
  WMS-III ages 16–54: Faces I, FP I, LN, SPS, Faces II, FP II, ARDTS, ARDI,
  WMS-III ages 55–89: Faces I, SPS, Faces II,

Most reliable test scores
  WAIS-III ages 16–29: VC, AR, DS, IN, CD, BD, VCI, POI, WMI, PSI
  WAIS-III ages 30–54: VC, SI, AR, DS, IN, CD, BD, SS, VCI, POI, WMI, PSI
  WAIS-III ages 55–74: VC, SI, AR, DS, IN, CO, PC, CD, BD, MR, SS, VCI, POI, WMI, PSI
  WAIS-III ages 75–89: VC, SI, AR, IN, CD, SS, VCI, POI, WMI, PSI
  WMS-III ages 16–54: LM I, VPA I, AII, IMI, ADI, GMI
  WMS-III ages 55–89: LM I, VPA I, AII, IMI, ADI, GMI, WMI

Least reliable test scores
  WAIS-III ages 16–29: LN, PA, SS, OA
  WAIS-III ages 30–54: None
  WAIS-III ages 55–74: PA
  WAIS-III ages 75–89: DS, OA
  WMS-III ages 16–54: Faces I, LN, Faces II, ARDTS, ARDI, LMTH I, WL I, VR I, SPSF, SPSB, LMTH II, WL II, WLRecog, VR II, VRRecog, Copy
  WMS-III ages 55–89: Faces I, SPS, Faces II, LMTH, SPSF, SPSB, LMTH II, Copy

Criteria for "most reliable": adequate internal consistency (0.85–0.99) and adequate test–retest reliability (0.75–0.99). Criteria for "least reliable": low internal consistency (<0.80) and low test–retest reliability (<0.70), or a test–retest reliability coefficient below 0.60. The majority of the manual-based age groups contained within these larger age bands (e.g., 2/3 or 3/4) must have met criteria to be included in each category.
Vocabulary = VC, Similarities = SI, Arithmetic = AR, Digit Span = DS, Information = IN, Comprehension = CO, Letter–Number Sequencing = LN, Picture Completion = PC, Coding = CD, Block Design = BD, Matrix Reasoning = MR, Picture Arrangement = PA, Symbol Search = SS, Object Assembly = OA, Verbal Comprehension Index = VCI, Perceptual Organization Index = POI, Working Memory Index = WMI, Processing Speed Index = PSI.
Vocabulary, Arithmetic, and all WAIS-III Index and IQ scores meet criteria for "most reliable" for all age groups. Coding and Symbol Search meet criteria on the basis of their test–retest reliabilities only, since internal consistency coefficients are not calculated for these measures.

This method provides an estimate of the probability that a given difference score would not be obtained by chance; that is, the score would not be due to measurement error. Essentially, a confidence interval can be formed around a score that reflects the reliability of the test. The primary measure of interest is the S_diff.
The S_diff is derived from the SEM, which in turn is derived from the test–retest reliability of the instrument (r_xx) and the standard deviation (SD) of the population of interest. The confidence band is formed by multiplying the S_diff by a value from the z-distribution. Multiplying by a value of 1.64, for example, results in a change score in either direction that would be unlikely to occur by chance (p < 0.05 in each tail). The formulas for calculating reliable change are presented in Table 2.

4. Reliable change in three clinical groups

Data from three clinical groups (i.e., Alzheimer's disease, chronic alcohol abuse, and schizophrenia) were extracted from the WAIS/WMS Technical Manual (Psychological Corporation, 1997) for the purpose of calculating reliable change estimates. The SEMs and S_diff values for all the IQ and Index scores are presented in Table 3. Rounding to two decimal places occurred at each step in calculating these values. Since none of the clinical groups were tested twice, the test–retest correlations from the normal subjects in the standardization sample were used to calculate the SEMs. Similarly, no retest SEM could be calculated, so the formula for the estimated S_diff was used (see Table 2).
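The chain described above (test–retest reliability and SD give the SEM, the SEM gives the estimated S_diff, and the S_diff times a z value gives the confidence band) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the author's own computation: the function names are hypothetical, the exact points at which rounding to two decimals is applied are a guess at "rounding at each step", and the reliability of 0.85 is an illustrative value chosen because, with the SD of 11.0 reported for the Alzheimer's sample on the Auditory Immediate Index, it reproduces the tabled SEM. It is not quoted from the Technical Manual.

```python
import math

Z_80 = 1.28  # two-tailed 0.80 confidence band
Z_90 = 1.64  # two-tailed 0.90 confidence band (p < 0.05 in each tail)


def r2(x: float) -> float:
    """Round to two decimals (assumed to happen at every step)."""
    return round(x, 2)


def sem(sd: float, r_xx: float) -> float:
    """Standard error of measurement: SD * sqrt(1 - r_xx)."""
    return r2(sd * r2(math.sqrt(1.0 - r_xx)))


def s_diff(sem1: float, sem2: float) -> float:
    """Standard error of the difference from the time 1 and time 2 SEMs."""
    return r2(math.sqrt(r2(sem1 ** 2) + r2(sem2 ** 2)))


def estimated_s_diff(sem1: float) -> float:
    """Estimated S_diff when only the time 1 SEM is available."""
    return r2(math.sqrt(2.0 * r2(sem1 ** 2)))


def confidence_band(s_diff_value: float, z: float) -> float:
    """Size of change (in either direction) needed to exceed measurement error."""
    return r2(s_diff_value * z)


def is_reliable_change(score1: float, score2: float, band: float) -> bool:
    """True when the absolute change exceeds the confidence band."""
    return abs(score2 - score1) > band


if __name__ == "__main__":
    # SD = 11.0 is reported in the article; r_xx = 0.85 is an assumption.
    sem_ai = sem(11.0, 0.85)
    sd_ai = estimated_s_diff(sem_ai)
    print(sem_ai, sd_ai, confidence_band(sd_ai, Z_80), confidence_band(sd_ai, Z_90))

    # Starting from a tabled SEM (Alzheimer's group, Auditory Delayed Index).
    band_80 = confidence_band(estimated_s_diff(3.84), Z_80)
    print(band_80, is_reliable_change(81, 73, band_80))
```

Under these assumptions the first call chain gives an SEM of 4.29, an estimated S_diff of 6.07, and bands of 7.77 and 9.95, which agree with the Auditory Immediate entries for the Alzheimer's group in Tables 3 and 4. Multiplying the Table 3 S_diff values by 1.28 and 1.64 reproduces most Table 4 entries; a few printed cells do not match this product (for example, for chronic alcohol abuse at 0.80, Verbal Comprehension is printed as 4.44 where the product is 4.54, and Perceptual Organization as 6.05 where the product is 9.38), which may reflect typesetting or extraction errors.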

Table 2
Formulas for calculating reliable change

SEM
  \[ \mathrm{SEM} = SD\,\sqrt{1 - r} \]
  SD = standard deviation of the comparison sample
  r = test–retest reliability of the comparison sample

Standard error of the difference (S_diff)
  \[ S_{\mathrm{diff}} = \sqrt{\mathrm{SEM}_1^{2} + \mathrm{SEM}_2^{2}} \]

Estimated S_diff (a)
  \[ S_{\mathrm{diff}} = \sqrt{2\,\mathrm{SEM}_1^{2}} \]

(a) This formula often has been used in the literature on reliable change. Technically, it is incorrect; it represents an "estimated" standard error of the difference because the SEM for time 1 is counted twice instead of using the SEM for time 2.

It should be obvious from the formulas in Table 2 that the size of the S_diff is related to the SD and the test–retest coefficient. Therefore, larger SDs and smaller correlations will result in larger SEMs and S_diff values. The Auditory Immediate Index, for example, has an S_diff of 6.07 for Alzheimer's patients and 8.82 for persons with schizophrenia (Table 3). The 90% confidence band for these two groups is 10 points and 15 points, respectively. This is because the sample SD for the patients with Alzheimer's was only 11.0, compared to 15.6 for persons with schizophrenia. Less variability in the sample resulted in lower reliable change scores.

Table 3
SEM and standard error of difference scores on the WAIS-III/WMS-III

                              Alzheimer's    Chronic alcohol abuse   Schizophrenia
                                SEM   S_diff        SEM   S_diff        SEM   S_diff
WAIS-III
  Verbal IQ                    2.23     3.15       2.79     3.94       3.34     4.72
  Performance IQ               3.96     5.60       5.08     7.18       4.52     6.39
  Full Scale IQ                2.62     3.70       2.70     3.82       3.55     5.02
  Verbal Comprehension         2.40     3.39       2.51     3.55       3.61     5.10
  Perceptual Organization      4.09     5.78       5.18     7.33       5.14     7.27
  Working Memory               5.54     7.83       3.94     5.57       4.83     6.83
  Processing Speed             4.75     6.72       4.50     6.36       4.25     6.01
WMS-III
  Auditory Immediate           4.29     6.07       6.16     8.71       6.24     8.82
  Visual Immediate             5.34     7.55       7.40    10.47       7.29    10.31
  Immediate Memory             4.56     6.45       6.60     9.33       6.28     8.88
  Auditory Delayed             3.84     5.43       6.51     9.21       6.68     9.45
  Visual Delayed               3.97     5.61       6.89     9.74       7.55    10.68
  Auditory Recognition         4.64     6.56       8.19    11.58       9.39    13.28
  General Memory               3.12     4.41       5.11     7.23       5.69     8.05
  Working Memory               7.61    10.76       5.12     7.24       7.65    10.82

The number of patients per group was as follows: Alzheimer's = 35, chronic alcohol abuse = 28, and schizophrenia = 42.
The following uncorrected stability coefficients were used to calculate the SEMs: Alzheimer's group, WAIS-III age group 55–74 and WMS-III age group 55–89; chronic alcohol abuse, WAIS-III age group 30–54 and WMS-III age group 16–54; and schizophrenia, WAIS-III age group 30–54 and WMS-III age group 16–54.

Table 4
Confidence intervals for measurement error in three clinical groups

                              Alzheimer's    Chronic alcohol abuse   Schizophrenia
                               0.80     0.90       0.80     0.90       0.80     0.90
WAIS-III
  Verbal IQ                    4.03     5.17       5.04     6.46       6.04     7.74
  Performance IQ               7.17     9.18       9.19    11.78       8.18    10.48
  Full Scale IQ                4.74     6.07       4.89     6.26       6.43     8.23
  Verbal Comprehension         4.34     5.56       4.44     5.82       6.53     8.36
  Perceptual Organization      7.40     9.48       6.05    12.02       9.31    11.92
  Working Memory              10.02    12.84       7.13     9.13       8.74    11.20
  Processing Speed             8.60    11.02       8.14    10.43       7.69     9.86
WMS-III
  Auditory Immediate           7.77     9.95      11.15    14.28      11.29    14.46
  Visual Immediate             9.66    12.38      13.40    17.17      13.20    16.91
  Immediate Memory             8.26    10.58      11.94    15.30      11.37    14.56
  Auditory Delayed             6.95     8.91      11.79    15.10      12.10    14.50
  Visual Delayed               7.18     9.20      12.47    15.97      13.67    17.52
  Auditory Recog. Delayed      8.40    10.76      14.82    18.99      17.00    21.78
  General Memory               5.64     7.23       9.25    11.86      10.30    13.20
  Working Memory              13.77    17.65       9.27    11.87      13.85    17.25

The following three examples illustrate the clinical use of Table 4. First, a 67-year-old man with suspected mild Alzheimer's disease is tested twice with the WMS-III at a 14-month interval. His Auditory Delayed Index score dropped from 81 to 73. By looking at Table 4, the clinician can conclude that this 8-point decline is not due to measurement error, because it exceeds the tabled value of 6.95 (0.80 confidence, two-tailed). Second, a 42-year-old woman with chronic alcoholism completes the WMS-III after 1 month and 12 months of abstinence. Her General Memory Index improves from 80 to 88.
The clinician cannot be confident that her change in performance is "real" (i.e., not due to measurement error), because the 8-point gain falls below the tabled value of 9.25 for the 0.80 confidence interval. Third, a 37-year-old man with schizophrenia completes the WAIS-III shortly after admission to a psychiatric facility. His Working Memory and Processing Speed Index scores were 81 and 77, respectively. His psychiatric condition is stabilized and 12 months post discharge (15-month test–retest interval), he obtains a Working Memory Index score of 91 and a Processing Speed Index score of 86. The clinician can be confident that these 10- and 9-point changes are not due to measurement error, because they exceed the tabled values of 8.74 and 7.69 (0.80 confidence interval, two-tailed) [1]. The purpose of Table 4 is to provide preliminary psychometric data that can be used to assist the psychologist's clinical decision regarding whether a specific patient has improved, declined, or remained stable on follow-up testing. These data are meant to supplement, rather than replace, clinical judgment.

[1] It is entirely possible that future research will alter the interpretation of change in this example. Specifically, this patient may be showing some influence of practice and regression to the mean on his retest scores. As this information becomes available, the values in Table 4 can be adjusted accordingly.

5. Discussion

Interpreting change on psychological tests is a careful and deliberate process that is intricately related to the psychometric properties of a given test for a given population. The data provided in these tables are a crude estimate of the reliable change difference scores for the WAIS-III and WMS-III. There are six factors that limit the accuracy and usefulness of the data provided in Table 4. First, the sample sizes are very small. Second, the clinical subjects were not tested twice. Therefore, there were no reliability coefficients for calculating the SEMs. Instead, the reliability coefficients for the normal population were combined with the SDs from the clinical populations to calculate these SEMs. These test–retest correlations may be considerably higher than those that would be obtained from clinical samples. Third, since the subjects were not tested twice, the SEM from time 1 was used in the formula for the S_diff twice. This should be considered an "estimated S_diff," not the true S_diff (Iverson, 1998). Fourth, the values in Table 4 were not corrected for practice effects because true practice effects for these clinical groups, at a reasonable interval (such as 1 year), are unknown. The clinician must use his or her judgment in this matter. Fifth, there is no accounting for possible regression to the mean in these values. If it were known that retest scores in certain clinical groups were influenced by regression to the mean (Speer, 1992), the low initial scores would require greater change and the high initial scores would require lesser change to be considered reliable.
Sixth, the values in Table 4 are based on obtained scores, not estimated true scores. The confidence intervals would be more accurate if they were applied to estimated true scores.

The aforementioned problems could not be overcome, given the limitations of applying this method to the data presented in the Technical Manual. Most of these problems can be addressed only if the raw data from both test and retest are examined. The clinician can accommodate some of these problems with future research data. For example, if it becomes clear that there is a reliable and consistent practice effect of three points on the Auditory Immediate Index of the WMS-III in persons with schizophrenia who are retested at a 1-year interval, then the clinician can add these three points to the values presented in Table 4. Or, more preferably, tables giving the distribution of test–retest change scores could be consulted.

References

Chelune, G. J., Naugle, R. I., Luders, H., Sedlak, J., & Awad, I. A. (1993). Individual change after epilepsy surgery: practice effects and base-rate information. Neuropsychology 7, 41–52.
Iverson, G. L. (1998). Interpretation of Mini-Mental State Examination scores in community-dwelling elderly and geriatric neuropsychiatry patients. Int J Geriatr Psychiatry 13, 661–666.
Jacobson, N. S., & Truax, P. (1991). Clinical significance: a statistical approach to defining meaningful change in psychotherapy research. J Consult Clin Psychol 59, 12–19.
Psychological Corporation (1997). WAIS-III/WMS-III Technical Manual. San Antonio: Harcourt Brace.
Speer, D. C. (1992). Clinically significant change: Jacobson and Truax (1991) revisited. J Consult Clin Psychol 60, 402–408.