The Repeatable Battery for the Assessment of Neuropsychological Status Effort Scale

Archives of Clinical Neuropsychology 27 (2012) 190–195

Julia Novitski 1,2, Shelly Steele 2, Stella Karantzoulis 3, Christopher Randolph 2,4,*

1 Department of Psychology, Rosalind Franklin University of Medicine and Science, North Chicago, IL, USA
2 Chicago Neuropsychological Services, Chicago, IL, USA
3 Department of Neurology, New York University Medical Center, New York, NY, USA
4 Department of Neurology, Loyola University Medical Center, Maywood, IL, USA

*Corresponding author at: Chicago Neuropsychological Services, 1 East Erie, Suite 353, Chicago, IL 60611, USA. Tel.: +1-708-216-3539; fax: +1-312-262-4880. E-mail address: crandol@lumc.edu (C. Randolph).

Accepted 28 December 2011

Abstract

The measurement of effort is now considered to be an important component of neuropsychological assessment. In addition to stand-alone measures, built-in, or embedded, measures of effort have been derived for a limited number of standard neurocognitive tests. The Repeatable Battery for the Assessment of Neuropsychological Status (RBANS) is a widely used brief battery, employed as a core diagnostic tool in dementia and as a neurocognitive screening battery or tracking/outcome measure in a variety of other disorders. An effort index (EI) for the RBANS has been published previously (Silverberg, N. D., Wertheimer, J. C., & Fichtenberg, N. L. 2007. An EI for the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS). Clinical Neuropsychology, 21 (5), 841–854), but it has been reported to result in high false positive rates when applied to patients with true amnesia (e.g., Alzheimer's disease). We created a new effort scale (ES) for the RBANS based on the observation of patterns of free recall and recognition performance in amnesia versus inadequate effort.
The RBANS ES was validated on a sample of patients with amnestic disorders and a sample of mild traumatic brain injury participants who failed a separate measure of effort. The sensitivity and specificity of the new ES were compared with those of the previously published EI. Receiver-operating characteristic analyses demonstrated much better sensitivity and specificity of the ES, with a marked reduction in false positive errors. Application and limitations of the RBANS ES, including indications for its use, are discussed.

Keywords: Malingering/symptom validity testing; Assessment; Test construction

Introduction

Effort testing has become recognized as an integral part of neuropsychological evaluations (Bush et al., 2005). An assessment of effort helps identify individuals who may be feigning or exaggerating impairments for the purpose of some type of gain, as well as individuals who are performing below their true ability level for other reasons (e.g., somatoform disorders, inadequate investment in the testing process, etc.). In either case, evidence of inadequate effort during testing essentially invalidates the utility of the neurocognitive test results in measuring impairment. The only solid conclusion that can be drawn from the results of standardized tests given during an examination within which effort tests were failed is that the patient was capable of performing at least as well as the observed performance.

Although stand-alone measures of effort are widely utilized in neuropsychological evaluations, they are time-consuming and usually do not provide any additional independent information on neurocognitive status. Another approach to assessing effort involves the use of measures that are embedded in existing standardized tests.
As with virtually all measures of effort that are based on cognitive tasks, embedded measures typically involve the use of subtests that appear to a layperson to be sensitive to brain injury/disease, but in fact are relatively resistant to central nervous system dysfunction (see Bianchini, Mathias, & Greve, 2001, for a review). Evidence of disproportionately poor performance on such measures can be used as an index of inadequate effort. Two commonly used paradigms to explore effort are digit span and recognition memory. Poor performance on digit span testing has been validated as an index of poor effort in a number of studies (Axelrod, Fichtenberg, Millis, & Wertheimer, 2006; Bernard, 1990; Greiffenstein, Baker, & Gola, 1994; Mathias, Greve, Bianchini, Houston, & Crouch, 2002). Forced-choice recognition testing has also been validated as sensitive to inadequate effort and has been used in stand-alone measures of effort such as the Word Memory Test (WMT; Green, 2003) and the Test of Memory Malingering (TOMM; Tombaugh, 1996). Embedded measures of effort have also been derived from recognition testing using the California Verbal Learning Test (Delis, Kramer, Kaplan, & Ober, 1987; Wolfe et al., 2010) and the Wechsler Memory Scale, versions 3 and 4 (Iverson, Franzen, & McCracken, 1994; Wechsler, 1997, 2009).

Silverberg, Wertheimer, and Fichtenberg (2007) relied upon this literature in constructing an effort index (EI) for the RBANS (Randolph, 1998). The RBANS is a widely used brief neurocognitive battery that is composed of 12 subtests, including a Digit Span subtest (forward digits only) and a yes/no List Recognition subtest. The EI was calculated by creating a composite, weighted score from the performance on the Digit Span and List Recognition subtests of the RBANS. Silverberg and colleagues examined the distributions of performance of patients with various types of neurological disorders referred for outpatient neuropsychological evaluations and created cutoff scores for suspicious effort on the basis of those distributions. Importantly, individuals with dementia were largely excluded from this sample, as they were viewed as inappropriate for symptom validity testing.

© The Author 2012. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com. doi:10.1093/arclin/acr119. Advance Access publication on 25 January 2012.
The newly created EI was then validated on five separate samples, including a sample of patients with mild-to-moderate traumatic brain injury (TBI) who passed effort testing, a sample of patients with minimal head trauma who failed independent effort testing, and a group of simulated malingerers. The EI demonstrated good sensitivity and specificity in terms of classification rates and was proposed for use as an embedded measure of effort for the RBANS.

Subsequent studies of the Silverberg et al. EI in patients with dementia, however, have reported fairly high false positive rates in that population. Hook, Marquine, and Hoelzle (2009) calculated the RBANS EI scores for a sample of 44 clinically referred, non-litigating patients aged 60 and older. The mean Mini-Mental State Examination (Folstein, Folstein, & McHugh, 1975) score for this group was 25, with a range of 16–30, and the mean RBANS Total Scale Index score was 68. In this sample, 31% of patients obtained an EI above the suggested cutoff, and overall cognitive ability was correlated with EI scores. This finding suggests that patients with true amnesia, most likely secondary to dementia in this sample, would potentially be classified as putting forth inadequate effort. Barker, Horner, and Bachman (2010) subsequently examined the utility of the RBANS EI in detecting inadequate effort in a geriatric sample, where probable good and suspect effort were independently established via the use of the TOMM and clinical consensus from a multidisciplinary team. The RBANS EI demonstrated modest sensitivity using this approach, and the authors cautioned regarding the use of the EI in patients with mild dementia, and against using it as an embedded measure of effort in patients with moderate to severe dementia. A recent study by Armistead-Jehle and Hansen (2011) examined the base rates of poor effort in a military sample and compared the performance on the RBANS EI to stand-alone effort measures.
A total of 85 active military members were evaluated with a neuropsychological battery, which included the TOMM, the Medical Symptom Validity Test (MSVT; Green, 2004), a non-verbal version of the MSVT (NV-MSVT; Green, 2008), and the RBANS. Participants included those with a history of mild TBI (mTBI)/concussion and mental health conditions such as depression and post-traumatic stress disorder. Failure rates were 20% for the MSVT, 15% for the NV-MSVT, and 11% for the TOMM. The failure rate on the RBANS EI was 14% or 7%, depending upon the cutoff used. The authors concluded that the RBANS EI demonstrated high specificity for poor effort, but only modest sensitivity, and that the use of an additional measure of effort would be warranted to avoid false negatives.

The present study was undertaken to determine whether a different embedded effort scale (ES) for the RBANS could be constructed to improve the discrimination between inadequate effort and true amnesia. Based on our clinical observations and extensive experience with the use of the RBANS in Alzheimer's disease and other amnestic disorders, we hypothesized that in true amnesia, free recall performance would be likely to decline to zero, or fairly close to zero, before recognition started to fall away from the performance ceiling (Hodges, Salmon, & Butters, 1993; Welsh, Butters, Hughes, Mohs, & Heyman, 1991; Welsh, Butters, Hughes, Mohs, & Heyman, 1992). In addition, simple working memory as measured by the Digit Span subtest would still be expected to remain relatively intact in this population. We therefore decided to develop a revised ES based on a recognition measure adjusted relative to free recall performance. Recognition performance relative to recall is combined with the raw score on Digit Span, such that lower combined scores (disproportionately poor recognition relative to recall, poor Digit Span, or both) would be expected to reflect poor effort.
The newly created RBANS ES is derived from the following calculation:

\[
\text{RBANS ES} = \bigl(\text{List Recognition} - (\text{List Recall} + \text{Story Recall} + \text{Figure Recall})\bigr) + \text{Digit Span}
\]
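The calculation can be sketched in Python as follows, treating the ES as List Recognition minus the summed free recall scores, plus Digit Span. This is only an illustrative sketch: the class name, field names, and the two example profiles are hypothetical and are not taken from the study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RbansRawScores:
    """Raw RBANS subtest scores used by the ES (field names are illustrative)."""
    list_recognition: int
    list_recall: int
    story_recall: int
    figure_recall: int
    digit_span: int


def effort_scale(scores: RbansRawScores) -> int:
    """Return the RBANS ES: recognition adjusted for free recall, plus Digit Span."""
    free_recall_total = (
        scores.list_recall + scores.story_recall + scores.figure_recall
    )
    return (scores.list_recognition - free_recall_total) + scores.digit_span


# Hypothetical profiles, invented for illustration only.
# Near-floor recall with relatively preserved recognition and Digit Span:
amnestic_like = RbansRawScores(
    list_recognition=17, list_recall=0, story_recall=1,
    figure_recall=2, digit_span=10,
)
# Recognition and Digit Span that are poor relative to recall:
poor_effort_like = RbansRawScores(
    list_recognition=12, list_recall=4, story_recall=6,
    figure_recall=8, digit_span=6,
)

print(effort_scale(amnestic_like))     # 24
print(effort_scale(poor_effort_like))  # 0
```

In the first profile the free recall total is small, so the ES stays high; in the second, recall exceeds what the recognition score would suggest, which pulls the ES down.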

It is important to note that we consider this to be applicable only in cases where there appears to be evidence of poor performance on List Recognition and/or Digit Span subtests. The reason for this caveat is that in the normal population, a total free recall raw score (the sum of List Recall, Story Recall, and Figure Recall) on the RBANS is typically much higher than the List Recognition raw score, which is constrained by a ceiling effect. Applied indiscriminately, the ES would therefore produce a high false positive rate among cognitively intact individuals. To guide decision-making in this regard, the frequency distributions for the List Recognition and Digit Span subtests of the RBANS in the standardization sample were examined. We hypothesized that the new ES would be more effective in distinguishing between inadequate effort and true amnesia.

Methods

Participants

Existing data on the RBANS for a group of 25 mTBI patients (mean age = 49.00, SD = 10.11) and 69 clinical subjects with a diagnosis of amnestic MCI (N = 15) or probable Alzheimer's disease (N = 54), with an overall mean age of 80.61 (SD = 6.33), were used in the analyses. The mean RBANS Total Scale score for the mTBI group was 71.92 (SD = 17.24), and the mean RBANS Total Scale score for the amnestic sample was 64.58 (SD = 12.89). Patients were referred for clinical neuropsychological evaluations to two urban, academically affiliated neuropsychology practices. Diagnostic criteria employed in the diagnosis of mTBI included: (a) an alteration of mental status as a result of a blow to the head, (b) loss of consciousness of <5 min (most cases had no loss of consciousness), and (c) no evidence of brain damage on neuroimaging. All mTBI patients were tested no earlier than 6 months following head injury.
The mTBI patients performed below the standard cutoff scores on a free-standing measure of effort (Green's WMT; Green, 2003), suggesting questionable effort. The WMT was not administered to any of the patients in the amnestic sample, but there was no clinical indication of possible somatoform tendencies or secondary gain for any of these individuals based on the clinical judgment of an experienced clinical neuropsychologist (CR). Data from the RBANS standardization sample (N = 540) were analyzed to determine the frequency distributions of performance on the Digit Span and List Recognition subtests in the normal population.

Analyses and Results

Frequency Distributions of Digit Span and List Recognition Scores

As mentioned previously, frequency distributions for the List Recognition and Digit Span subtests of the RBANS in the standardization sample were examined in order to aid in the selection of appropriate cutoff scores related to questionable effort (Table 1). In the RBANS standardization sample, raw scores of <9 on the Digit Span subtest were observed in only 23% of the sample, and raw scores of <8 were observed in only 7% of individuals. Raw scores of <9 were observed in 36% of the amnestic sample and raw scores of <8 were observed in 19% of the sample. For the mTBI sample, raw scores of <9 were observed in 52% of the sample and raw scores of <8 in 32% of the sample. For List Recognition, only 14% of the normal standardization sample obtained scores of ≤18, and only 7% had a score of ≤17. In the amnestic sample, 90% obtained scores of ≤18, and 83% had scores of ≤17. In the mTBI sample, 72% had scores of ≤18, and 56% had scores of ≤17. These results suggest that the mTBI sample performed somewhat worse than the amnestic sample on Digit Span, but somewhat better on List Recognition. A composite score was made by summing Digit Span and List Recognition subtests.
Only 17% of the normal standardization sample obtained a score of ≤27, whereas 78% of the combined amnestic and mTBI sample obtained scores in this range. This suggests that inadequate effort and/or impairment on the RBANS should be suspected with Digit Span scores of <9, List Recognition scores of <19, or combined Digit Span + List Recognition scores of <28. For the RBANS standardization sample, only 15.1% of subjects had a combined Digit Span + List Recognition score of <28 and an ES score of <12.

Table 1. Frequency distributions of Digit Span and List Recognition scores

Score                   Amnestic sample (%)   mTBI (%)   Standardization sample (%)
Digit Span <9           36                    52         23
Digit Span <8           19                    32         7
List Recognition ≤18    90                    72         14
List Recognition ≤17    83                    56         7
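The screening criteria stated above (Digit Span <9, List Recognition <19, or a combined score <28) can be expressed as a simple check that decides whether calculating the ES is warranted at all. The sketch below is illustrative only; the function and constant names are hypothetical, and the thresholds are the ones quoted in the text.

```python
from typing import Dict

# Thresholds quoted in the text; a raw score strictly below each value is flagged.
DIGIT_SPAN_CUTOFF = 9
LIST_RECOGNITION_CUTOFF = 19
COMBINED_CUTOFF = 28


def screening_flags(digit_span: int, list_recognition: int) -> Dict[str, bool]:
    """Report which of the three screening criteria a pair of raw scores meets."""
    return {
        "low_digit_span": digit_span < DIGIT_SPAN_CUTOFF,
        "low_list_recognition": list_recognition < LIST_RECOGNITION_CUTOFF,
        "low_combined": (digit_span + list_recognition) < COMBINED_CUTOFF,
    }


def warrants_es(digit_span: int, list_recognition: int) -> bool:
    """True if any screening criterion is met, i.e. the ES may be worth computing."""
    return any(screening_flags(digit_span, list_recognition).values())


print(warrants_es(12, 20))  # False: no criterion met
print(warrants_es(8, 20))   # True: Digit Span below 9
print(warrants_es(9, 18))   # True: List Recognition below 19 (and combined below 28)
```

Keeping the three flags separate, rather than returning a single boolean, makes it easier to report which criterion prompted the calculation.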

Receiver-Operating Characteristics Curves

The new RBANS ES was calculated as described above, and a cutoff score was established on the basis of a review of the distributions. All statistics were performed using SPSS software. Scores of <12 on the ES were considered to be reflective of inadequate effort. The Silverberg EI was calculated using the procedures outlined by Silverberg and colleagues (2007). The ES and EI were then subjected to receiver-operating characteristic analyses to examine the relative efficacy of each measure in separating the mTBI and amnestic samples. The Silverberg EI produced an area under the curve (AUC) of 0.608, whereas the RBANS ES performed substantially better, with an AUC of 0.908 (Fig. 1). Frequency distributions were calculated based on the respective procedures for the EI and RBANS ES, and results were plotted for the amnestic and mTBI samples (Fig. 2). The new RBANS ES demonstrated much better discriminability between the two participant groups when compared with the Silverberg et al. EI.

Discussion

The use of embedded indices of effort has the advantage of providing relevant information within the context of a routine neuropsychological evaluation and can augment stand-alone measures of effort to provide a more thorough screen of this important component in the interpretation of neurocognitive test data. An embedded EI was derived for the RBANS by Silverberg and colleagues (2007), and this has been demonstrated to be useful in discriminating poor effort from adequate effort in certain clinical populations. The Silverberg EI has been criticized, however, for a high false positive rate in clinical populations characterized by true amnesia (e.g., Alzheimer-type dementia; Barker et al., 2010; Hook et al., 2009).
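As an illustration of the receiver-operating characteristic comparison reported in the Results, the AUC for a score on which lower values indicate poor effort can be computed as the proportion of (poor effort, amnestic) pairs in which the poor-effort case has the lower score, with ties counted as one half. The sketch below is not the SPSS procedure used in the study; the function names are hypothetical and the scores are invented purely to show the arithmetic.

```python
from typing import Sequence, Tuple


def auc_lower_is_positive(positive: Sequence[float],
                          negative: Sequence[float]) -> float:
    """AUC as the probability that a positive case scores lower than a negative one.

    'positive' holds scores for the poor-effort group, 'negative' for the
    comparison group. Ties contribute one half.
    """
    wins = 0.0
    for p in positive:
        for n in negative:
            if p < n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive) * len(negative))


def sensitivity_specificity(positive: Sequence[float],
                            negative: Sequence[float],
                            cutoff: float) -> Tuple[float, float]:
    """Classify scores strictly below the cutoff as poor effort."""
    sensitivity = sum(s < cutoff for s in positive) / len(positive)
    specificity = sum(s >= cutoff for s in negative) / len(negative)
    return sensitivity, specificity


# Invented ES values, for illustration only (not study data).
poor_effort_es = [2, 5, 8, 11, 14]
amnestic_es = [10, 13, 16, 19, 22]

print(auc_lower_is_positive(poor_effort_es, amnestic_es))        # 0.88
print(sensitivity_specificity(poor_effort_es, amnestic_es, 12))  # (0.8, 0.8)
```

The pairwise formulation is quadratic in sample size, which is acceptable for samples of the size described here; a rank-based computation would give the same value more efficiently.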
We created a new ES for the RBANS by taking advantage of the observation that free recall performance typically declines to near-floor levels in amnestic disorders before recognition performance on this scale begins to fall away from the performance ceiling and that this pattern is typically not observed in patients with presumably intact memory who are exhibiting evidence of poor effort on testing. This newly derived ES was compared with the Silverberg EI in discriminating between a sample of mTBI patients who exhibited evidence of poor effort based on an independent measure (WMT) and a sample of patients with significant amnestic impairment (MCI and probable Alzheimer's disease). The new ES demonstrated markedly improved discriminability, and the distributions of scores on the new ES clearly demonstrated much less overlap between groups. Therefore, when the clinical issue is distinguishing between what may be a true amnestic disorder and poor effort, the new ES would appear to be a superior measure.

Fig. 1. Receiver-operating characteristic curves for the RBANS EI and RBANS ES.

Fig. 2. Frequency distributions for the RBANS EI and RBANS ES.

Because effort testing is considered an integral part of neuropsychological evaluations, an embedded ES would be especially helpful in working with older adults, for whom shorter batteries are typically utilized. It is important to note, however, that the new ES will produce high false positive rates in the normal population and that the application of the ES should be limited to cases where there is evidence of possible impairment or poor effort on List Recognition and/or Digit Span subtests. In reviewing data from the RBANS standardization sample, List Recognition scores of <19 or Digit Span scores of <9 occur at relatively low rates, and scores below these levels should be considered justification for calculating the ES. Fewer than 17% of the normal standardization sample had a combined (List Recognition + Digit Span) score of <28, so this would be a reasonable initial approach to screening for possible poor effort that might warrant calculating the ES.

One of the limitations of this study is the lack of effort testing in the amnestic sample. Although there was no clinical indication of possible secondary gain or somatoform tendencies in these participants, scores on independent effort measures would have provided further support. Another limitation is the use of retrospective data. Future validation studies should aim to utilize stand-alone effort measures to verify effort in all research samples. Additionally, it may be useful for future studies to compare older adults with true amnesia with older adults who failed stand-alone effort measures. Additional validation work is necessary to more firmly establish the clinical utility of the newly derived RBANS ES, and, as with any measure of effort, the ES should be considered in the context of clinical history, presentation, and pattern of performance across other measures. Specifically, it would be helpful to validate the ES in conjunction with stand-alone measures of effort.
Once the calculation of an ES score is triggered by unusually low performance on one or both of the List Recognition and Digit Span subtests, ES scores of <12 should be considered suspicious for poor effort. Additional measures of effort should be examined under most circumstances in order to clarify the finding.

References

Armistead-Jehle, P., & Hansen, C. L. (2011). Comparison of the Repeatable Battery for the Assessment of Neuropsychological Status Effort Index and stand-alone symptom validity tests in a military sample. Archives of Clinical Neuropsychology, 26 (7), 592–601.
Axelrod, B. N., Fichtenberg, N. L., Millis, S. R., & Wertheimer, J. C. (2006). Detecting incomplete effort with Digit Span from the Wechsler Adult Intelligence Scale-Third Edition. Clinical Neuropsychology, 20 (3), 513–523.

Barker, M. D., Horner, M. D., & Bachman, D. L. (2010). Embedded indices of effort in the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS) in a geriatric sample. Clinical Neuropsychology, 24 (6), 1064–1077.
Bernard, L. C. (1990). Prospects for faking believable memory deficits on neuropsychological tests and the use of incentives in simulation research. Journal of Clinical and Experimental Neuropsychology, 12 (5), 715–728.
Bianchini, K. J., Mathias, C. W., & Greve, K. W. (2001). Symptom validity testing: A critical review. The Clinical Neuropsychologist, 15 (1), 19–45.
Bush, S. S., Ruff, R. M., Troster, A. I., Barth, J. T., Koffler, S. P., Pliskin, N. H., et al. (2005). Symptom validity assessment: Practice issues and medical necessity: NAN Policy & Planning Committee. Archives of Clinical Neuropsychology, 20 (4), 419–426.
Delis, D. C., Kramer, J. H., Kaplan, E., & Ober, B. A. (1987). The California Verbal Learning Test. San Antonio, TX: The Psychological Corporation.
Folstein, M. F., Folstein, S. E., & McHugh, P. R. (1975). Mini-mental state: A practical method for grading the cognitive state of patients for the clinician. Journal of Psychiatric Research, 12 (3), 189–198.
Green, P. (2003). Green's Word Memory Test for Microsoft Windows. Edmonton, Canada: Green's Publishing.
Green, P. (2004). Green's Medical Symptom Validity Test (MSVT) for Microsoft Windows: User's manual. Edmonton, Canada: Green's Publishing.
Green, P. (2008). Manual for the Nonverbal Medical Symptom Validity Test. Edmonton, Canada: Green's Publishing.
Greiffenstein, M. F., Baker, W. J., & Gola, T. (1994). Validation of malingered amnesia measures with a large clinical sample. Psychological Assessment, 6 (3), 218–240.
Hodges, J. R., Salmon, D. P., & Butters, N. (1993). Recognition and naming of famous faces in Alzheimer's disease: A cognitive analysis. Neuropsychologia, 31 (8), 775–788.
Hook, J. N., Marquine, M. J., & Hoelzle, J. B. (2009). Repeatable Battery for the Assessment of Neuropsychological Status Effort Index performance in a medically ill geriatric sample. Archives of Clinical Neuropsychology, 24 (3), 231–235.
Iverson, G. L., Franzen, M. D., & McCracken, L. M. (1994). Application of a forced-choice memory procedure designed to detect experimental malingering. Archives of Clinical Neuropsychology, 9 (5), 437–450.
Mathias, C. W., Greve, K. W., Bianchini, K. J., Houston, R. J., & Crouch, J. A. (2002). Detecting malingered neurocognitive dysfunction using the Reliable Digit Span in traumatic brain injury. Assessment, 9 (3), 301–308.
Randolph, C. (1998). Repeatable Battery for the Assessment of Neuropsychological Status (RBANS). San Antonio, TX: Harcourt, The Psychological Corporation.
Silverberg, N. D., Wertheimer, J. C., & Fichtenberg, N. L. (2007). An effort index for the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS). Clinical Neuropsychology, 21 (5), 841–854.
Tombaugh, T. N. (1996). The Test of Memory Malingering. Toronto, Canada: Multi-Health Systems.
Wechsler, D. (1997). Wechsler Memory Scale (WMS-III) administration and scoring manual. San Antonio, TX: The Psychological Corporation.
Wechsler, D. (2009). Wechsler Memory Scale (WMS-IV) technical and interpretive manual. San Antonio, TX: Pearson.
Welsh, K., Butters, N., Hughes, J., Mohs, R., & Heyman, A. (1991). Detection of abnormal memory decline in mild cases of Alzheimer's disease using CERAD neuropsychological measures. Archives of Neurology, 48 (3), 278–281.
Welsh, K. A., Butters, N., Hughes, J. P., Mohs, R. C., & Heyman, A. (1992). Detection and staging of dementia in Alzheimer's disease. Use of the neuropsychological measures developed for the Consortium to Establish a Registry for Alzheimer's Disease. Archives of Neurology, 49 (5), 448–452.
Wolfe, P. L., Millis, S. R., Hanks, R., Fichtenberg, N., Larrabee, G. L., & Sweet, J. J. (2010). Effort indicators within the California Verbal Learning Test-II (CVLT-II). Clinical Neuropsychology, 24 (1), 153–168.