Cutoff Scores for MMPI-2 and MMPI-2-RF Cognitive-Somatic Validity Scales for Psychometrically Defined Malingering Groups in a Military Sample


Archives of Clinical Neuropsychology 31 (2016) 786–801

Alvin Jones*
Womack Army Medical Center, NC, USA

*Corresponding author at: Department of Behavioral Health, Womack Army Medical Center, 2817 Riley Road, Bldg 4-2817, Ft Bragg, NC 29310-7301. Tel.: 910-978-3202. E-mail address: alrobj@googlemail.com (A. Jones).

Accepted 2 May 2016

Abstract

Objective: This research examined cutoff scores for MMPI-2 and MMPI-2-RF validity scales specifically developed to assess non-credible reporting of cognitive and/or somatic symptoms. The validity scales examined included the Response Bias Scale (RBS), the Symptom Validity Scales (FBS, FBS-r), the Infrequent Somatic Responses scale (Fs), and the Henry-Heilbronner Indexes (HHI, HHI-r).

Method: Cutoffs were developed by comparing a psychometrically defined non-malingering group with three psychometrically defined malingering groups (probable, probable to definite, and definite malingering) and a group that combined all malingering groups. The participants in this research were drawn from a military sample consisting largely of patients with traumatic brain injury (mostly mild traumatic brain injury).

Results: Cutoffs with specificities of at least 0.90 are provided. Sensitivities, predictive values, and likelihood ratios are also provided.

Conclusions: RBS had the largest mean effect size (d) when the malingering groups were compared to the non-malingering group (d range = 1.23–1.58).
Keywords: MMPI-2; MMPI-2-RF; Malingering; Military; Symptom validity

This research was approved through the Institutional Review Board at Womack Army Medical Center. The views expressed herein are those of the author and do not reflect the official policy of the Department of the Army, Department of Defense, or the U.S. Government. Published by Oxford University Press 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US. doi:10.1093/arclin/acw035 Advance Access publication on 12 July 2016

Introduction

There is a body of evidence (Nelson, Sweet, & Heilbronner, 2007; Gervais, Ben-Porath, Wygant, & Green, 2008; Whitney, Davis, Shepard, & Herman, 2008; Gervais, Ben-Porath, Wygant, Green, & Sellbom, 2010; Nelson, Hoelzle, Sweet, Arbisi, & Demakis, 2010; Jones & Ingram, 2011; Youngjohn, Wershba, Stevenson, Sturgeon, & Thomas, 2011; Peck et al., 2013) demonstrating that Symptom Validity Tests (SVTs) specifically designed to assess the validity or credibility of self-reported cognitive and/or somatic symptoms perform better in a variety of clinical and forensic settings than the validity scales that assess over-reporting of emotional distress or psychopathology on the Minnesota Multiphasic Personality Inventory-2 (MMPI-2; Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 1989) and the related but restructured set of scales on the MMPI-2-RF (Ben-Porath & Tellegen, 2008). Results of a principal component analysis in a military sample by Jones and Ingram (2011) also indicate that the Cognitive-Somatic Validity Scales and the validity scales related to psychopathology load on distinctly different components. Because of the differences in content and the apparent superiority of the cognitive-somatic scales in the context of neuropsychological evaluations, additional research on their application in a military sample is indicated. Cutoff scores have clinical utility and have been established in a variety of settings for the MMPI-2 and MMPI-2-RF Cognitive-Somatic SVTs (C-S SVTs), but no research has specifically addressed cutoffs to predict malingering for these scales in a military sample. The purpose of this research is to examine cutoffs that may be useful for diagnosing psychometrically determined malingering at three levels (Probable [PM], Probable to Definite [PDM], and Definite [DM]) and in a group that combines all malingering groups (CM).

The scales that assess non-credible reporting of psychopathology or emotional distress include the MMPI-2 Infrequency (F) scale, the Back F (FB) scale, and the Infrequency Psychopathology (FP) scale, along with the corresponding MMPI-2-RF scales (F-r and FP-r). The C-S SVTs on the MMPI-2 include the Fake Bad Scale (FBS; Lees-Haley, English, & Glenn, 1991), now named the Symptom Validity Scale, and the Response Bias Scale (RBS; Gervais, Ben-Porath, Wygant, & Green, 2007). The MMPI-2-RF revised version of the MMPI-2 FBS (FBS-r) retains 30 of the original 43 FBS items. RBS remains intact on the MMPI-2-RF. The MMPI-2-RF also includes a scale specifically designed to assess over-reporting of somatic complaints, the Infrequent Somatic Responses scale (Fs). Another C-S SVT, the Henry-Heilbronner Index (HHI; Henry, Heilbronner, Mittenberg, & Enders, 2006), was developed for the MMPI-2, and a revised but shorter version, the HHI-r (11 vs. 15 items), was recently developed for the MMPI-2-RF (Henry, Heilbronner, Algina, & Kaya, 2013). Neither the HHI nor the HHI-r has been adopted by the publisher of the MMPI-2 and MMPI-2-RF. Jones and Ingram (2011) demonstrated that in a military sample FBS, RBS, FBS-r, and HHI performed better than the MMPI-2 F-family of scales assessing psychopathology in predicting performance credibility on neurocognitive tests. Because HHI performed the best in many respects in this sample, it is included in the current research.

FBS was the first C-S SVT adopted by the publisher of the MMPI-2.
The items on this scale were selected rationally on a content basis using unpublished frequency counts of malingerers' MMPI test responses and observations of personal injury malingerers (Lees-Haley, English, & Glenn, 1991; p. 204). Items were selected that were thought to assess simultaneously exaggerated post-injury distress and under-reporting of pre-incident personality problems. A meta-analysis of FBS was completed in 2006 (Nelson, Sweet, & Demakis, 2006), and it was updated in 2010 (Nelson, Hoelzle, Sweet, Arbisi, & Demakis, 2010). The authors of this meta-analysis concluded that FBS differentiated groups as well as, and at times better than, other MMPI-2 validity scales, including all of the F-family scales that assess psychopathology. This included TBI patients.

A review of cutoff scores for FBS in multiple settings and groups (TBI, nontraumatic brain disease, psychiatric patients, and other groups) by Greiffenstein, Fox, and Lees-Haley (2007) concluded that a cutoff of 23 (specificity = 0.90) justifies concern about the validity of self-reported symptoms. A specificity of 0.90 has been considered a minimally acceptable level to establish cutoff scores (Victor, Boone, Serpa, Buehler, & Ziegler, 2009; Larrabee, 2012a) and is used throughout the current research to examine cutoff scores. Greiffenstein and coworkers indicated that one should be mindful of moderating variables, such as head injury severity, when using FBS cutoff scores; however, they indicate that in cases of head injury with negative radiologic findings, as is the case in the sample used for the current research, cutoff scores of 23–24 are grounds for suspecting exaggeration. A cutoff of 23 for men and 26 for women suggests possible malingering for the MMPI-2 (Ben-Porath, Graham, & Tellegen, 2009), which corresponds to a T-score of 80 for both cutoffs.
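The 0.90 specificity criterion described above can be illustrated with a short sketch. It assumes that a raw score at or above the cutoff counts as a positive (non-credible) result and that the lowest cutoff meeting the criterion is the one of interest; the function names and the idea of scanning observed scores are illustrative choices, not the procedure reported in this article.

```python
from typing import Dict, List, Optional


def classification_stats(
    malingering: List[int], non_malingering: List[int], cutoff: int
) -> Dict[str, float]:
    """Sensitivity, specificity, and positive likelihood ratio at one cutoff.

    Assumption: a raw score >= cutoff is treated as a positive result.
    """
    true_pos = sum(score >= cutoff for score in malingering)
    false_pos = sum(score >= cutoff for score in non_malingering)
    true_neg = len(non_malingering) - false_pos
    sensitivity = true_pos / len(malingering)
    specificity = true_neg / len(non_malingering)
    # LR+ is undefined when there are no false positives; report infinity.
    lr_positive = (
        sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": lr_positive,
    }


def lowest_cutoff_meeting_specificity(
    malingering: List[int],
    non_malingering: List[int],
    min_specificity: float = 0.90,
) -> Optional[int]:
    """Return the lowest observed score whose specificity meets the minimum.

    Specificity cannot decrease as the cutoff rises, so the first candidate
    that qualifies is the lowest one. Returns None if no observed score
    qualifies.
    """
    for cutoff in sorted(set(malingering) | set(non_malingering)):
        stats = classification_stats(malingering, non_malingering, cutoff)
        if stats["specificity"] >= min_specificity:
            return cutoff
    return None
```

With hypothetical raw scores for a non-malingering and a malingering group, `lowest_cutoff_meeting_specificity` returns the first score at which no more than 10% of the non-malingering group would be flagged, and `classification_stats` reports the sensitivity obtained at that cutoff.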
Greiffenstein and coworkers found that a raw score of 30 or greater never or rarely produced false-positive errors, which corresponds to an MMPI-2 T-score of about 100 for men and 90 for women.

Research subsequent to that of Greiffenstein, Fox, and Lees-Haley (2007) found that slightly higher cutoffs might be needed to meet recommended minimal levels of specificity. However, this research may not have used well-differentiated groups with respect to possible malingering status, and FBS may not be as sensitive or specific to a diagnosis of malingering in this case. For example, Tsushima, Geling, and Fabrigas (2011) compared a sample of predominantly mTBI litigating and/or compensation-seeking patients with non-litigating patients, primarily with emotional or behavioral problems, health conditions, or both, that were suspected to be influenced by psychological factors. No validity tests were used to classify patients into either group. Tsushima and coworkers found that an FBS cutoff of 25 produced a specificity of 0.91. Dionysus, Denney, and Halfaker (2011) also found that a cutoff of 25 was necessary to obtain a specificity of at least 0.90 in a sample of litigating or disability-seeking head-injured patients who failed at least one performance validity test (PVT) and a group of head-injured litigating patients who failed no PVTs. PVTs are designed to assess the credibility of performance on neurocognitive tests and not self-reported symptoms, as do SVTs (Larrabee, 2012a). The possibility of exaggeration still exists in the latter group in the research by Dionysus and coworkers given that they were litigating. Peck et al. (2013) examined three groups: a valid TBI group, a psychogenic non-epileptic seizure group, and an invalid TBI group, which was classed as PM based on Slick, Sherman, and Iverson (1999) criteria. They found that an FBS cutoff of 27 produced a false-positive classification rate of 7% with ≤1 failure on PVTs in their valid TBI group.
However, not only did some in this presumed valid group fail a PVT, the authors indicate that participants in the presumed valid group had substantial external incentives. Thus, the cutoff established in this research might have been somewhat high because there were possible malingerers in this presumed valid group. Research (Fox, 2011) suggests that failure on even a single PVT can invalidate the expected brain-behavior relationships that underlie neurocognitive test interpretations, and research by Proto et al. (2014) concluded that failure on even one PVT should raise concerns about performance validity, especially in individuals with mTBI. Thus, failure on one PVT may raise questions about possible malingering or at least non-credible performance on neuropsychological tests for other reasons, such as a Somatic Symptom Disorder (American Psychiatric Association, 2013), perhaps in the form of a Cogniform Disorder (Delis & Wetter, 2007).

The revision of FBS on the MMPI-2-RF (FBS-r) correlates in the upper .90s with scores on the MMPI-2 version of the scale in samples that included large numbers of individuals who failed SVTs (Tellegen & Ben-Porath, 2008). Jones and Ingram (2011) found a correlation of 0.95 between FBS and FBS-r in a military sample. With respect to cutoffs, Ben-Porath and Tellegen (2011) indicate that a T-score of 80–99 (raw score range 17–23) suggests possible over-reporting of memory complaints, and a T-score ≥100 (raw score = 24) suggests likely over-reporting of memory complaints and limits the interpretability of the MMPI-2-RF Cognitive Complaints scale (COG). Tarescavage, Wygant, Gervais, and Ben-Porath (2013) found that the same T-score (≥100) was necessary to differentiate an incentive-only group from a probable to definite Malingered Neurocognitive Disorder (MND) group based on Slick, Sherman, and Iverson (1999) criteria in a non-head-injured sample. This is slightly higher than the cutoff of 21 that Schroeder et al. (2012) found in differentiating a litigating group that failed Slick and coworkers' criteria from a group of mostly mTBI patients who passed these criteria.

RBS was initially developed by Gervais (2005) but later modified. The items for both the original and modified versions are on both the MMPI-2 and MMPI-2-RF. RBS was developed using an empirical keying methodology to detect symptom complaints associated with cognitive response bias and over-reporting in forensic neuropsychological and disability assessment settings. Regression analysis was used to identify MMPI-2 items that predicted failure on PVTs. RBS consists of 28 MMPI-2 items that discriminated between non-head-injured disability claimants who passed or failed tests designed to detect effort on cognitive tests (Gervais, Ben-Porath, Wygant, & Green, 2007).
The initial validation research for the revised RBS (Gervais, Ben-Porath, Wygant, & Green, 2007) involved mostly non-head-injured disability claimants. This research suggested that RBS was more accurate in detecting inadequate effort on tests of cognitive functioning than the MMPI-2 F-family scales that assess psychopathology in this type of forensic disability assessment setting. This initial research also found that cutoffs of 16 and 17 produced specificities of 0.89 and 0.95, respectively, in classifying failure on the Medical Symptom Validity Test (MSVT; Green, 2004), the Word Memory Test (WMT; Green, 2003), or both.

Three studies have examined cutoffs using criteria developed by Slick, Sherman, and Iverson (1999). Peck et al. (2013) found that a cutoff of 16 was adequate in their valid TBI group, which is the same as the cutoff that Schroeder et al. (2012) found comparing their TBI patients who met or failed Slick and coworkers' criteria. Tarescavage, Wygant, Gervais, and Ben-Porath (2013) found a T-score of 100 (raw score 17) comparing their sample of non-head-injured incentive-only and PM/DM groups.

In samples using military veterans, similar or just slightly higher cutoffs have been found. Whitney, Davis, Shepard, and Herman (2008) found that a cutoff of 17 was adequate (specificity = 0.92) in predicting failure on the TOMM in a mixed (TBI, neurologic disorders, psychiatric disorders) Veterans Administration clinical sample. Whitney (2013), using a larger sample from her practice, found that a similar cutoff of 18 predicted failure on the TOMM and MSVT at specificities of 0.93 and 0.94, respectively. Young, Kearns, and Roper (2011) found that a cutoff of 19 was needed to obtain acceptable specificity (0.91) in distinguishing pass-fail status on the WMT in a sample of military veterans in recent, current, or upcoming compensation evaluations.

Other research using non-military samples has found slightly lower cutoff scores. Wygant et al.
(2010) found that a cutoff of 15 (T = 90; specificity = 0.91) was adequate in classifying disability claimants who failed either the WMT or TOMM vs. a group of claimants who passed both PVTs. Dionysus, Denney, and Halfaker (2011) found that similar RBS cutoffs of 14 or 15 both produced a specificity of 0.93, but the cutoff of 14 had better sensitivity. Using criterion groups based on litigation status (not validity tests), Tsushima, Geling, and Fabrigas (2011) found that a cutoff of 13 was necessary to obtain a specificity of at least 0.90. The results of their receiver operating characteristic and area under the curve analysis indicated that RBS outperformed F, FP, FBS, and HHI in identifying the litigating patients.

The MMPI-2-RF technical manual (Tellegen & Ben-Porath, 2008) indicates that Fs consists of 16 items that were endorsed by 25% or less of women and men in several large samples of medical patients and a chronic pain patient sample. The total sample included over 55,000 patients. It is described as a scale that assesses over-reporting of somatic symptoms (Ben-Porath & Tellegen, 2008). However, it has been shown to be as sensitive to memory complaints as FBS-r (Gervais, Ben-Porath, Wygant, Green, & Sellbom, 2010). The technical manual suggests that T-scores in the range of 80–99 (raw score range 5–7) indicate possible over-reporting of somatic problems, and that scores ≥100 (raw score ≥8) indicate over-reporting of somatic problems and possible invalidity of scores on the MMPI-2-RF Somatic Scales. Tarescavage, Wygant, Gervais, and Ben-Porath (2013) found that a T-score of 100 (raw score 8) had acceptable specificity when comparing their incentive-only and PM/DM groups. Schroeder et al. (2012) found that a slightly lower raw score of 6 was sufficient to differentiate their TBI patients.

HHI is a subset of items empirically derived from FBS and the pseudoneurologic scale (PNS) developed by Shaw and Matthews (1965).
The items on the HHI were derived by comparing personal injury litigants and disability claimants (79% with mTBI; 5% with moderate-to-severe TBI) who met Slick, Sherman, and Iverson (1999) criteria for PM with non-litigating mTBI (85%) and moderate-to-severe (15%) head-injured controls (Henry, Heilbronner, Mittenberg, & Enders, 2006). As a result of their research comparing the FBS, PNS, and HHI, Henry and coworkers concluded that HHI, which they termed a pseudosomatic factor, was a purer measure of somatic malingering than the FBS and PNS. They found that a cutoff of 8 had an acceptable classification accuracy of 0.86 with a specificity of 0.89. A cutoff of 9 produced a specificity of 0.95, and a score ≥13 was associated with a positive predictive value (PPV) of 100%.

Subsequent HHI research suggests that higher cutoffs may be needed to obtain acceptable specificity. Dionysus, Denney, and Halfaker (2011), also using Slick, Sherman, and Iverson (1999) criteria and a sample of head-injured patients, found that a cutoff of 12 (specificity = 0.93) was needed to differentiate a group of probable to definite malingerers from a group of head-injured patients who passed validity tests and did not meet Slick and coworkers' criteria for malingering. Other research using less stringent criteria to form groups has found similar or slightly higher cutoffs. Tsushima, Geling, and Fabrigas (2011) found that a cutoff of 11 was needed to differentiate litigating from clinical patients at a specificity of 0.90. Young, Kearns, and Roper (2011) found that a cutoff of 14 was necessary to differentiate a group of compensation-seeking vs. non-compensation-seeking military veterans at a specificity of 0.85. They found a nonsignificant relationship between HHI and the WMT and did not calculate cutoff scores for HHI. The relatively higher cutoff in differentiating the compensation-seeking and non-compensation-seeking patients may be because more participants in the non-compensation-seeking group (53.4%) failed the WMT than did participants in the compensation-seeking group (46.5%; Heilbronner & Henry, 2013). In addition, almost all individuals in a Veterans Administration sample, as in an active duty military sample, have potential incentive to malinger. Thus, the rate of failure on the WMT in both groups, as well as the potential incentive to malinger, could have, as Heilbronner and Henry point out, artificially inflated the HHI cutoffs required to obtain acceptable specificity.
However, Whitney (2013) found that the same cutoff of 14 was necessary to obtain specificities of 0.95 and 0.96 to differentiate those passing or failing the Test of Memory Malingering (TOMM; Tombaugh, 1996) and MSVT (Green, 2004). Earlier research by Whitney, Davis, Shepard, and Herman (2008) using a smaller sample in the same setting found that a similar cutoff of 13 was needed to differentiate those who passed or failed the TOMM (specificity = 0.92).

Henry, Heilbronner, Algina, and Kaya (2013) found that a cutoff of 7 for HHI-r produced a sensitivity of 0.69 and a specificity of 0.93 in differentiating personal injury and disability litigants vs. nonlitigating head-injured patients. This initial research suggests the HHI-r may have some promise.

Materials and Methods

Participants

The sample used in this research is the same as that used by Jones (2013a, 2013b), and the methodology in the current research parallels that used in the previous research. All participants were active duty military members who were consecutive referrals to the author and were evaluated in the brain injury medicine or neuropsychology services at two army medical centers. The participants completed a neuropsychological evaluation that included the MMPI-2 and at least one PVT. Participants with TRIN or VRIN ≥80 were excluded. A cannot say scale (CNS) > 18 was also used as an exclusion criterion; however, no one had a CNS score >17. The initial sample consisted of 495 participants; after exclusion criteria for the MMPI-2 were applied, the final sample consisted of 462 participants. After the non-malingering (NM) and three malingering groups (PM, PDM, and DM) were formed (groups are described in the following), the final sample consisted of 300 participants. There were 145 participants in the NM group and 155 in the three malingering groups. The sample was 82.7% men; the mean age was 31.6 years (SD = 9.0), and the mean education level was 13.1 years (SD = 2.0).
The ethnic distribution was 72.6% Caucasian, 16.7% Afro-American, 10.3% Hispanic, and 0.4% Asian or other. There were no significant differences in age or education or in the distribution of gender or ethnicity across the comparison groups.

The majority of the participants were evaluated for closed head injuries, blast exposure, and heat injuries, or some combination of these injuries. Because of the retrospective nature of the data, the exact distribution of the severity and nature of head injury was not available for the final sample. However, data were available for about two-thirds of the initial sample but unfortunately were not linked to individual participants. About 90% of the sample experienced closed head, blast, or heat injuries or some combination of the three. Approximately 75% of these injuries were estimated to be mild, 19% were moderate, and 6% were severe. About 10% of the sample was evaluated for brain disease (e.g., multiple sclerosis, epilepsy, Huntington's disease, etc.). Criteria used to judge the severity of TBI were based in large part on the Department of Veterans Affairs/Department of Defense 40 consensus-based classification of closed TBI severity (Department of Veteran Affairs, 2009). However, information for each criterion was not always available (e.g., Glasgow Coma Scale ratings and brain-imaging studies at time of injury), so the severity ratings were primarily based on criteria related to length of loss of consciousness, length of alteration of consciousness (e.g., feeling dazed, disoriented, confused, or difficulty mentally tracking events), and length of post-traumatic amnesia.

PVT Cutoff Scores

Five PVTs were used for this research to establish comparison groups. The PVTs included two embedded and three freestanding tests. The freestanding tests included the Victoria Symptom Validity Test (VSVT; Slick, Hopp, Strauss, & Thompson, 1997), the TOMM (Tombaugh, 1996), and the WMT (Green, 2003). The embedded PVTs included the Effort Index (EI) for the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS; Randolph, 1998) and Reliable Digit Span (RDS; Babikian, Boone, Lu, & Arnold, 2006). Greater detail concerning the characteristics of the validity tests, such as the nature of the stimuli and administration procedures, is not presented here in the interest of maintaining test security and deterring coaching.

The standalone PVTs were administered and scored by computer using standard instructions. Cutoffs for determining failure on the WMT were based on the test manual, i.e., ≤82.5% on IR, DR, or CNS. The RBANS EI was calculated by procedures described by Silverberg, Wertheimer, and Fichtenberg (2007). The cutoff for determining failure on the RBANS EI was based on the research by Armistead-Jehle and Hansen (2011) for a military sample. The cutoff used (EI ≥1) had the highest sensitivity while maintaining a very low false-positive rate. A cutoff of 7 was used for RDS (Babikian et al.; Greiffenstein, Baker, & Gola, 1994; Jasinski, Berry, Shandera, & Clark, 2011). For the TOMM, the cutoffs used were 43 for Trial 1 and 49 for the other two trials. The cutoffs all had PPVs of 0.90 or greater for a base rate of 0.40 for the PM, PDM, and DM groups used in the research by Jones (2013a). That research suggested a base rate of 0.41 for PM, based on failure of two or more validity tests, in the military sample it used. The use of these nonstandard cutoffs for the TOMM is consistent with the findings of Stenclik, Miele, Silk-Eglit, Lynch, and McCaffrey (2013).
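The PPVs quoted at a base rate of 0.40 follow from Bayes' rule applied to a test's sensitivity and specificity. A minimal sketch is given below; the function name and the example values are hypothetical and are not taken from Jones (2013a).

```python
from typing import Tuple


def predictive_values(
    sensitivity: float, specificity: float, base_rate: float
) -> Tuple[float, float]:
    """Return (PPV, NPV) for a test applied at a given base rate.

    All three inputs are proportions between 0 and 1.
    """
    for value in (sensitivity, specificity, base_rate):
        if not 0.0 <= value <= 1.0:
            raise ValueError("inputs must be proportions between 0 and 1")

    # Expected proportions of the four outcomes in the tested population.
    true_pos = sensitivity * base_rate
    false_neg = (1 - sensitivity) * base_rate
    true_neg = specificity * (1 - base_rate)
    false_pos = (1 - specificity) * (1 - base_rate)

    # Guard against division by zero when no one tests positive or negative.
    ppv = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else float("nan")
    npv = true_neg / (true_neg + false_neg) if (true_neg + false_neg) else float("nan")
    return ppv, npv


# Hypothetical example: sensitivity 0.60, specificity 0.96, base rate 0.40.
ppv, npv = predictive_values(0.60, 0.96, 0.40)
print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

Because PPV depends on the base rate, the same cutoff yields a lower PPV in a setting where malingering is less common, which is why the base rate is stated alongside each PPV.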
They concluded that their research supported the use of a cutoff of 39 for Trial 1 and a cutoff of <49 for Trial 2 and the Retention Trial in a sample of mTBI patients. Cutoffs used for the VSVT in the current research were 20, 18, and 41 for the Easy, Hard, and Total scores, respectively. Research by Jones (2013b) indicated that these cutoffs for the Hard and Total scores produced PPVs of at least 0.90 at a base rate of 0.40 for the PM and PDM groups. The cutoff for the Easy items for the PM group had a PPV of 0.88. Cutoffs of <18 and <41 for the Hard and Total scores, respectively, were found to have the best classification accuracy for mTBI patients in the recent research by Silk-Eglit, Lynch, and McCaffrey (2016). In general, the research cited earlier by Stenclik and coworkers and Silk-Eglit and coworkers, as well as the cutoffs established in the Jones research on the TOMM and VSVT in a predominantly mTBI sample and used in the current research, supports the use of nonstandard cutoffs. These cutoffs are also consistent with cutoffs established in other research, as reviewed in the Jones research, using a variety of other samples (e.g., Grote et al., 2000; Greve, Bianchini, & Doane, 2006; Greve et al., 2006; Macciocchi, Seel, Alderson, & Godsall, 2006; Loring, Larrabee, Lee, & Meador, 2007; Greve, Etherton, Ord, Bianchini, & Curtis, 2009).

Comparison Groups

Four groups were used to establish cutoff scores for the MMPI-2 and MMPI-2-RF C-S SVTs. The PM group was based on failure of exactly two PVTs. The PDM group was based on failure of three or more PVTs, and the DM group was composed of participants whose performance on the VSVT was significantly below chance. All participants who performed significantly below chance on the WMT or TOMM also scored significantly below chance on the VSVT; three participants scored below chance on the WMT and four on the TOMM. The fourth malingering group included all participants in the PM, PDM, and DM groups (combined malingering group).
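What "significantly below chance" means on a two-alternative forced-choice test can be illustrated with a one-tailed binomial calculation. The following is a minimal sketch and not the scoring procedure used in this research; the function names, the 48-item test length, and the 0.05 alpha level are illustrative assumptions.

```python
from math import comb


def below_chance_probability(n_correct: int, n_items: int) -> float:
    """One-tailed probability of getting n_correct or fewer items right
    when every item is a pure guess between two alternatives (p = 0.5)."""
    favorable = sum(comb(n_items, k) for k in range(n_correct + 1))
    return favorable / 2 ** n_items


def is_significantly_below_chance(n_correct: int, n_items: int,
                                  alpha: float = 0.05) -> bool:
    """True when a score this low would be unlikely under random guessing.
    The alpha level is an assumption made for this illustration."""
    return below_chance_probability(n_correct, n_items) < alpha


# Hypothetical 48-item two-alternative test (assumed length)
for score in (17, 24):
    print(score, is_significantly_below_chance(score, 48))
```

A score near half the items correct is what guessing produces; only scores well under that level suggest that correct answers were actively avoided.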
The group thought not to be malingering (NM group) failed no PVTs administered to them and were administered at least two PVTs. The formation of the PM, PDM, and DM groups was based in large part on the simplification and refinements suggested by Larrabee, Greiffenstein, Greve, and Bianchini (2007) and Larrabee (2008) of the Slick, Sherman, and Iverson (1999) criteria that have been widely used to diagnose MND. The Slick and coworkers criteria use a multidimensional and multimethod approach including evidence from neuropsychological testing and evidence from self-report. Larrabee and coworkers concluded that the Slick and coworkers criteria related to evidence from neuropsychological testing could be modified and simplified to allow multiple psychometric findings to define different levels of malingering regardless of the other Slick and coworkers criteria. Larrabee (2008) stated that failure on two independent PVTs provides strong evidence for a diagnosis of PM (posttest probability = .90+), and failure on three PVTs provides very strong evidence of probable, if not definite, malingering. Larrabee and coworkers state that failure on three well-validated validity indicators appears to be associated with 100% probability of malingering and is statistically equivalent to definite MND (p. 357). However, they also indicate that although this is associated with 100% probability of malingering, scores at this level are not conceptually equivalent to the active avoidance of correct answers associated with significantly worse-than-chance performance on two-alternative forced-choice testing, which would indicate conscious intent and DM. Of the 462 participants available, after MMPI-2 exclusion criteria were applied, 395 completed at least two validity tests or failed the VSVT significantly below chance. Of these 395 participants, 155 failed at least two PVTs (or performed below chance on the VSVT) and could be used to form the three malingering groups. The PM group was composed of 83 participants, and for this group, 62 of 71 failed the VSVT, 68 of 73 failed the TOMM, 10 of 50 failed the EI, 16 of 23 failed RDS, and 10 of 12 failed the WMT. Of the 44 individuals who met criteria for inclusion in the PDM group, 42 of 43 failed the VSVT, 43 of 43 failed the TOMM, 37 of 41 failed the EI, 10 of 12 failed RDS, and 10 of 10 failed the WMT. The DM group was composed of 28 participants who failed the VSVT at significantly below-chance levels. The NM group was composed of 145 participants; 100 were administered the TOMM, 119 the VSVT, 56 RDS, and 12 the WMT, and the RBANS EI was scored for 78 participants.

Data Analysis

The data analysis proceeded in four steps. First, it was necessary to establish that the validity measures were independent (i.e., not redundant) to ensure the validity of the composition of the three groups composed of those thought to be malingering. Larrabee (2008) and Nelson et al. (2003) used correlational analysis across the full range of scores to establish validity test independence. However, it can be argued that for the current research the primary concern is whether there is an association between passing or failing the validity tests within a comparison group, not the association between the full range of scores on each validity test. This is important because, e.g., the PM group was composed of participants who failed exactly two validity tests in any combination. If the two failed tests were redundant, then this would be equivalent to failing one test. However, when nonredundant tests are used, there is greater certainty that the individuals who were placed in the PM group were correctly classified, i.e., failed two nonredundant validity tests. This reasoning also applies to the PDM group.
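The grouping rule that this reasoning relies on, as described under Comparison Groups, can be summarized as a small decision function. This is a hedged sketch: the function name and arguments are hypothetical, and giving the below-chance criterion precedence over the failure counts is an assumption inferred from the group sizes (83 + 44 + 28 = 155), not a rule stated explicitly in the text.

```python
from typing import Optional


def assign_group(n_administered: int, n_failed: int,
                 vsvt_below_chance: bool) -> Optional[str]:
    """Return the comparison group label, or None if no group applies.

    n_administered: number of PVTs given to the participant
    n_failed: number of those PVTs failed at the chosen cutoffs
    vsvt_below_chance: VSVT performance significantly below chance
    """
    if vsvt_below_chance:          # assumed to take precedence over counts
        return "DM"                # definite malingering
    if n_failed >= 3:
        return "PDM"               # probable to definite malingering
    if n_failed == 2:
        return "PM"                # probable malingering
    if n_failed == 0 and n_administered >= 2:
        return "NM"                # not malingering
    return None                    # e.g., one failure, or fewer than two PVTs given


print(assign_group(n_administered=4, n_failed=2, vsvt_below_chance=False))
```

Participants with exactly one failed PVT fall into none of the groups under this rule, which matches the description of the NM group as having failed no PVTs.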
To establish independence for the PVTs used in this research, a chi-square analysis was completed to assess the association between the pass–fail status for each of the five PVTs used to establish the malingering comparison groups. Of the 462 participants in the total sample, 153 failed two or more PVTs and could be used for chi-square analysis. Two of the 155 participants in the comparison groups failed only the VSVT below chance, i.e., they did not fail two PVTs. The second analysis involved comparing the means of the three comparison groups in terms of standardized units. This allowed for comparison with other research reporting effect sizes. The third analysis involved an examination of classification accuracy in terms of sensitivity and specificity for a range of cutoffs for the C-S SVTs. Statistics were calculated for cutoffs with specificities ≥.90, and calculation was terminated when specificities reached 1.0 or when the maximum raw score for the scale was reached. The sensitivity of a test is the proportion of people with a condition of interest (COI; in this research, malingering) who have a positive result (true positives). The specificity of a test is the proportion of people without the COI who have a negative result (1 − specificity = false-positive rate). Sensitivity and specificity inform us of the accuracy of a test and not the probability that someone has the COI. The final analyses for the current research included calculation of PPVs, negative predictive values (NPVs), and the likelihood ratios for positive (LR+) and negative (LR−) test results. PVs and LRs are important in establishing post-test probabilities of having the COI. The PPV of a test is defined as the proportion of people with a positive test result who actually have the COI. The NPV of a test is the proportion of people with a negative test result who do not have the COI.
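The definitions of PPV and NPV above can be turned into a short calculation for any assumed base rate using Bayes' theorem. This is a minimal sketch with hypothetical function names, not the formulae implementation used by the author; the worked values are the rounded sensitivity (0.71) and specificity (0.93) reported for the RBS cutoff of 15 in the definite malingering group, so results may differ slightly from tabled values that were computed from unrounded figures.

```python
def ppv(sensitivity: float, specificity: float, base_rate: float) -> float:
    """Proportion of positive results that are true positives."""
    true_pos = sensitivity * base_rate
    false_pos = (1 - specificity) * (1 - base_rate)
    return true_pos / (true_pos + false_pos)


def npv(sensitivity: float, specificity: float, base_rate: float) -> float:
    """Proportion of negative results that are true negatives."""
    true_neg = specificity * (1 - base_rate)
    false_neg = (1 - sensitivity) * base_rate
    return true_neg / (true_neg + false_neg)


# Rounded accuracy values taken from the text; base rates 0.10 to 0.50
for base_rate in (0.10, 0.20, 0.30, 0.40, 0.50):
    print(base_rate,
          round(ppv(0.71, 0.93, base_rate), 2),
          round(npv(0.71, 0.93, base_rate), 2))
```

Holding sensitivity and specificity fixed, the PPV rises and the NPV falls as the assumed base rate increases, which is why predictive values are reported for a range of base rates.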
PVs are dependent on the prevalence or base rate of the COI (see Crawford, Garthwaite, and Betkowska (2009) for formulae for calculations of PVs). PVs for base rates ranging from 0.10 to 0.50 are provided in the current research and provide information about the probability of malingering at a given base rate. LR+ is defined as the probability of an individual with a COI having a positive test divided by the probability of an individual without the COI having a positive test (true positives/false positives). For example, if 80% of patients with the COI have a positive test (true positive) and only 6% of those who do not have the COI have a positive test (false positive), then the LR+ for the ability of the test to detect the COI is approximately 13 (80%/6%). This indicates that a person with the COI is 13 times more likely to have a positive test than a person who does not have the COI. If the probability of having a positive test were the same in those with and without the COI, then the LR+ would be 1. This would indicate the test is not useful in differentiating the two groups. Individuals with a COI should be much more likely to have an abnormal test result than individuals without the COI. The likelihood ratio for a negative test (LR−) is defined as the probability of an individual with a COI having a negative test divided by the probability of an individual without the COI having a negative test ([1 − sensitivity]/specificity). An LR− greater than 1 indicates that a negative test is more likely to occur in people with the COI than in people without the COI. An LR− less than 1 means that a negative test is less likely to occur in people with the COI than in people without the COI (Grimes & Schulz, 2005; Bowden & Loring, 2009). An example of the calculation of post-test probabilities of a COI can be found in Jones (2013b), and online calculators are readily available to simplify the process.
The Fagan (1975) nomogram, which is also readily available online or in textbooks (e.g., Sackett, Haynes, Guyatt, & Tugwell, 1991), can also be used to provide a fast and easy estimate of a post-test probability.
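The arithmetic that a nomogram or online calculator performs can be written out directly: convert the pre-test probability (base rate) to odds, multiply by the likelihood ratio, and convert back to a probability. The sketch below uses hypothetical function names and is offered only as an illustration of that standard odds-based conversion.

```python
def probability_to_odds(probability: float) -> float:
    return probability / (1 - probability)


def odds_to_probability(odds: float) -> float:
    return odds / (1 + odds)


def post_test_probability(pre_test_probability: float,
                          likelihood_ratio: float) -> float:
    """Post-test probability of the COI given a test result with the
    supplied likelihood ratio (LR+ for a positive, LR- for a negative)."""
    pre_odds = probability_to_odds(pre_test_probability)
    return odds_to_probability(pre_odds * likelihood_ratio)


# Base rate of 0.40 and the LR+ of about 13 from the example in the text
print(round(post_test_probability(0.40, 13.0), 2))
```

A likelihood ratio of 1 leaves the pre-test probability unchanged, which is another way of seeing why such a test does not help differentiate the groups.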

Table 1. Chi-square analysis for validity test independence: two or more failed performance validity tests

| PVT pair | χ² | ϕ | p (a) | N |
|---|---|---|---|---|
| TOMM and VSVT | 0.347 | 0.05 | .556 | 128 |
| TOMM and EI | 0.750 | 0.09 | .387 | 99 |
| TOMM and WMT | 0.186 | 0.10 | .666 | 20 |
| TOMM and RDS | (b) | (b) | (b) | 20 |
| VSVT and EI | 2.37 | 0.15 | .124 | 107 |
| VSVT and WMT | 1.13 | 0.22 | .289 | 24 |
| VSVT and RDS | 0.134 | 0.09 | .714 | 15 |
| EI and WMT | 0.046 | 0.05 | .830 | 21 |
| EI and RDS | 0.444 | 0.33 | .505 | 4 |
| WMT and RDS | 0.133 | 0.58 | .248 | 4 |

Note: VSVT = Victoria Symptom Validity Test, EI = Repeatable Battery for the Assessment of Neuropsychological Status Effort Index, WMT = Word Memory Test, RDS = Reliable Digit Span, TOMM = Test of Memory Malingering.
(a) The probability levels are based on χ². There is no difference in the pattern of significant and nonsignificant results when compared to Fisher's Exact Test.
(b) Values could not be computed because failure on the TOMM is a constant, i.e., of the 20 participants who met inclusion criteria for this analysis, 14 failed and 6 passed RDS and all participants failed the TOMM.

Table 2. Chi-square analysis for validity test independence: one or more failed performance validity tests

| PVT pair | χ² | ϕ | p (a) | N |
|---|---|---|---|---|
| TOMM and VSVT | 0.387 | 0.05 | .534 | 180 |
| TOMM and EI | 4.18 | 0.17 | .041 | 138 |
| TOMM and WMT | 0.538 | 0.14 | .463 | 26 |
| TOMM and RDS | 7.70 | 0.37 | .006 | 57 |
| VSVT and EI | 6.98 | 0.22 | .008 | 151 |
| VSVT and WMT | 0.534 | 0.12 | .465 | 36 |
| VSVT and RDS | 0.201 | 0.08 | .654 | 30 |
| EI and WMT | 0.411 | 0.12 | .521 | 28 |
| EI and RDS | 0.900 | 0.32 | .343 | 9 |
| WMT and RDS | 1.74 | 0.47 | .187 | 8 |

Note: VSVT = Victoria Symptom Validity Test, EI = Repeatable Battery for the Assessment of Neuropsychological Status Effort Index, WMT = Word Memory Test, RDS = Reliable Digit Span, TOMM = Test of Memory Malingering.
(a) The probability levels are based on χ². There is no difference in the pattern of significant and nonsignificant results when compared to Fisher's Exact Test.

Results
The results of the chi-square analysis using failure on two or more PVTs to assess redundancy (Table 1) indicate that there are no statistically significant associations between passing or failing the PVTs used in this research for those who failed PVTs. However, some of the analyses had very small cell sizes or empty cells. To address this problem, an additional analysis (Table 2) was completed using participants who failed one or more PVTs rather than the two or more PVTs used in the initial analysis. Although some associations were statistically significant in this second analysis, its results also indicate that PVT redundancy is not a problem. Where there are statistically significant associations (TOMM and EI, VSVT and EI, TOMM and RDS), the amount of shared variance was minimal (3%, 5%, and 14%, respectively). Hinkle, Wiersma, and Jurs (2003) suggest that correlations in the range of 0.00–0.30 (0–9% shared variance) indicate little if any correlation between variables and correlations in the range of 0.30–0.50 (9–25% shared variance) indicate low correlations. Based on the two chi-square analyses, it appears that the relationship between the PVTs is minimal, and the formation of the malingering groups is valid, i.e., not based on redundant PVTs. The effect size analysis (Table 3) indicates that there were large effect sizes, based on Cohen's (1988) general recommendations for classifying standardized effect sizes, for all C-S SVTs across all malingering groups. RBS performed best overall based on the mean effect size for the PM, PDM, and DM groups. Based on Ferguson's (2009) criteria for moderate effect size (practically significant = 0.41, moderate = 1.15; strong = 2.70), only RBS had at least moderate effect sizes for all malingering groups. All scales had moderate effect sizes for the PDM and DM groups (range, 1.15–2.01) based on Ferguson's criteria.
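Two of the quantities discussed above can be reproduced from the tabled summary values: the phi coefficient (and its square, the shared variance) from a 2 × 2 chi-square, and Cohen's d from group means and standard deviations. The sketch below uses hypothetical function names, and the pooled-standard-deviation form of d is an assumption, since the exact formula used for Table 3 is not stated.

```python
from math import sqrt


def phi_from_chi_square(chi_square: float, n: int) -> float:
    """Phi coefficient for a 2 x 2 table: sqrt(chi-square / N)."""
    return sqrt(chi_square / n)


def cohens_d(mean_1: float, sd_1: float, n_1: int,
             mean_2: float, sd_2: float, n_2: int) -> float:
    """Standardized mean difference using the pooled standard deviation
    (assumed form; other variants of d exist)."""
    pooled_variance = (((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2)
                       / (n_1 + n_2 - 2))
    return (mean_1 - mean_2) / sqrt(pooled_variance)


# TOMM and RDS in Table 2: chi-square = 7.70, N = 57
phi = phi_from_chi_square(7.70, 57)
print(round(phi, 2), f"{phi ** 2:.0%} shared variance")

# RBS, PM group vs. NM group, using the means and SDs in Table 3
print(round(cohens_d(13.35, 3.87, 83, 8.59, 4.08, 145), 2))
```

Because the inputs are rounded table entries, results can differ in the last decimal place from values computed on the raw data.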
The results for cutoff scores for the C-S SVTs with a minimum specificity of 0.90 are provided in Tables 4–9. Calculation of cutoffs was terminated when specificities reached 1.0 or the maximum score on the scale was reached. These tables provide statistics related to test accuracy, PVs, and LRs. It should be noted that a value of ∞ appears in the LR+ column for some cutoffs. This is because ∞, an undefined value, results when a zero enters into the calculation of the LR+. For example, for a cutoff of 21 for RBS in the DM group (Table 4), LR+ = Sensitivity/(1 − Specificity), which results in LR+ = 0.14/(1 − 1) or LR+ = 0.14/0. It should also be noted that the calculation of LRs was based on sensitivities and specificities to five decimal places, not the rounded values provided in the tables. Sensitivities and specificities were calculated at three decimal places. Sensitivities, specificities, and PPVs will be discussed here, and LRs will be explored in the Discussion section. PPVs discussed in this section will be based primarily on a base rate of 0.40, which is similar to estimates of malingering found in litigating TBI patients and in military and other samples (Mittenberg, Patton, Canyock, & Condit, 2002; Larrabee, 2007; Larrabee, Millis, & Meyers, 2009; Jones, 2013b).
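The zero-denominator case described above can be made explicit in a small calculation. This is an illustrative sketch with a hypothetical function name; representing the undefined ratio as infinity is a presentational choice, and the inputs are rounded values, whereas the tabled LRs were computed from unrounded sensitivities and specificities.

```python
import math


def likelihood_ratios(sensitivity: float, specificity: float):
    """Return (LR+, LR-) for a cutoff.

    LR+ = sensitivity / (1 - specificity)
    LR- = (1 - sensitivity) / specificity
    A zero denominator is reported as infinity instead of raising an error.
    """
    false_positive_rate = 1 - specificity
    lr_positive = (math.inf if false_positive_rate == 0
                   else sensitivity / false_positive_rate)
    lr_negative = (math.inf if specificity == 0
                   else (1 - sensitivity) / specificity)
    return lr_positive, lr_negative


# Sensitivity 0.14 with perfect specificity, as in the example in the text
print(likelihood_ratios(0.14, 1.0))
```

With perfect specificity there are no false positives, so any positive result is treated as conclusive and the LR+ has no finite value, while the LR− remains well defined.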

Table 3. Descriptive statistics and effect sizes for the probable malingering (PM), probable to definite malingering (PDM), definite malingering (DM), and combined groups vs. the non-malingering (NM) group for the MMPI-2 and MMPI-2-RF Cognitive-Somatic Validity scales

| Scale | NM (N = 145) M (SD) | PM (N = 83) M (SD) | PDM (N = 44) M (SD) | DM (N = 28) M (SD) | Combined (a) M (SD) | d: PM | d: PDM | d: DM | d: Combined | Mean d (b) |
|---|---|---|---|---|---|---|---|---|---|---|
| RBS | 8.59 (4.08) | 13.35 (3.87) | 14.80 (4.03) | 16.61 (3.39) | 14.35 (4.01) | 1.19 | 1.53 | 2.01 | 1.42 | 1.58 |
| FBS | 16.24 (5.38) | 21.69 (5.38) | 24.25 (6.59) | 24.89 (5.83) | 22.99 (5.96) | 1.01 | 1.41 | 1.59 | 1.19 | 1.34 |
| FBS-r | 10.48 (4.01) | 14.60 (3.99) | 16.93 (4.97) | 17.50 (4.02) | 15.79 (4.46) | 1.03 | 1.52 | 1.75 | 1.25 | 1.43 |
| Fs | 2.10 (2.04) | 4.06 (2.67) | 4.73 (2.99) | 5.86 (3.04) | 4.57 (2.90) | 0.85 | 1.15 | 1.69 | 0.98 | 1.23 |
| HHI | 5.52 (3.68) | 9.35 (3.27) | 11.09 (3.10) | 11.93 (1.82) | 10.31 (3.18) | 1.08 | 1.57 | 1.86 | 1.40 | 1.50 |
| HHI-r | 3.91 (2.76) | 6.75 (2.48) | 8.11 (2.24) | 8.71 (1.51) | 7.49 (2.40) | 1.07 | 1.59 | 1.84 | 1.39 | 1.50 |

Note: Effect sizes are Cohen's d for NM vs. each malingering group. RBS = Response Bias Scale, FBS = MMPI-2 Symptom Validity Scale, FBS-r = MMPI-2-RF Symptom Validity Scale, Fs = Infrequent Somatic Responses Scale, HHI = MMPI-2 Henry Heilbronner Index, HHI-r = MMPI-2-RF Henry Heilbronner Index.
(a) This group is composed of all participants in the PM, PDM, and DM groups (N = 155).
(b) Mean effect size for the PM, PDM, and DM groups.

Table 4. Sensitivity, specificity, positive/negative predictive value (PPV/NPV), and likelihood ratios for cutoff scores for the Response Bias Scale (RBS)

| Group | Cutoff (a) | Sensitivity (95% CI) | Specificity (95% CI) | PPV 0.10 | PPV 0.20 | PPV 0.30 | PPV 0.40 | PPV 0.50 | NPV 0.10 | NPV 0.20 | NPV 0.30 | NPV 0.40 | NPV 0.50 | LR+ (95% CI) | LR− (95% CI) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DM | RBS ≥15 | 0.71 (0.51–0.86) | 0.93 (0.87–0.96) | 0.54 | 0.72 | 0.82 | 0.87 | 0.91 | 0.97 | 0.93 | 0.88 | 0.83 | 0.77 | 10.4 (5.4–19.7) | 0.31 (0.17–0.55) |
| DM | RBS ≥16 | 0.57 (0.37–0.75) | 0.94 (0.89–0.97) | 0.54 | 0.72 | 0.82 | 0.87 | 0.91 | 0.95 | 0.90 | 0.84 | 0.77 | 0.69 | 10.4 (4.9–21.8) | 0.45 (0.30–0.70) |
| DM | RBS ≥17 | 0.43 (0.25–0.63) | 0.96 (0.91–0.98) | 0.54 | 0.72 | 0.82 | 0.87 | 0.91 | 0.94 | 0.87 | 0.80 | 0.72 | 0.63 | 10.4 (4.2–25.3) | 0.60 (0.43–0.82) |
| DM | RBS ≥18 | 0.39 (0.22–0.59) | 0.96 (0.91–0.98) | 0.51 | 0.70 | 0.80 | 0.86 | 0.90 | 0.93 | 0.86 | 0.79 | 0.70 | 0.61 | 9.5 (3.8–23.6) | 0.63 (0.47–0.85) |
| DM | RBS ≥19 | 0.25 (0.11–0.45) | 0.97 (0.93–0.99) | 0.50 | 0.69 | 0.80 | 0.86 | 0.90 | 0.92 | 0.84 | 0.75 | 0.66 | 0.56 | 9.1 (2.8–28.9) | 0.77 (0.62–0.96) |
| DM | RBS ≥20 | 0.25 (0.11–0.45) | 0.98 (0.93–0.99) | 0.57 | 0.75 | 0.84 | 0.89 | 0.92 | 0.92 | 0.84 | 0.75 | 0.66 | 0.57 | 12.1 (3.3–43.9) | 0.77 (0.62–0.95) |
| DM | RBS ≥21 | 0.14 (0.05–0.34) | 1.0 (0.97–1.0) | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.91 | 0.82 | 0.73 | 0.64 | 0.54 | ∞ | 0.86 (0.74–1.0) |
| PDM | RBS ≥15 | 0.55 (0.39–0.69) | 0.93 (0.87–0.96) | 0.47 | 0.66 | 0.77 | 0.84 | 0.89 | 0.95 | 0.89 | 0.83 | 0.75 | 0.67 | 7.9 (4.1–15.2) | 0.49 (0.35–0.68) |
| PDM | RBS ≥16 | 0.50 (0.35–0.65) | 0.94 (0.89–0.97) | 0.50 | 0.69 | 0.80 | 0.86 | 0.90 | 0.94 | 0.88 | 0.82 | 0.74 | 0.65 | 9.0 (4.3–18.9) | 0.53 (0.39–0.71) |
| PDM | RBS ≥17 | 0.34 (0.21–0.50) | 0.96 (0.91–0.98) | 0.48 | 0.67 | 0.78 | 0.85 | 0.89 | 0.93 | 0.85 | 0.77 | 0.69 | 0.59 | 8.2 (3.4–20.0) | 0.69 (0.56–0.85) |
| PDM | RBS ≥18 | 0.23 (0.12–0.38) | 0.96 (0.91–0.98) | 0.38 | 0.58 | 0.70 | 0.79 | 0.85 | 0.92 | 0.83 | 0.74 | 0.65 | 0.55 | 5.5 (2.1–14.3) | 0.81 (0.69–0.95) |
| PDM | RBS ≥19 | 0.14 (0.06–0.28) | 0.97 (0.93–0.99) | 0.35 | 0.55 | 0.68 | 0.77 | 0.83 | 0.91 | 0.82 | 0.72 | 0.63 | 0.53 | 4.9 (1.5–16.7) | 0.89 (0.79–1.0) |
| PDM | RBS ≥20 | 0.09 (0.03–0.23) | 0.98 (0.94–0.99) | 0.33 | 0.52 | 0.65 | 0.75 | 0.81 | 0.91 | 0.81 | 0.72 | 0.62 | 0.52 | 4.4 (1.0–18.9) | 0.93 (0.85–1.0) |
| PDM | RBS ≥21 | 0.07 (0.02–0.20) | 1.0 (0.97–1.0) | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.91 | 0.81 | 0.71 | 0.62 | 0.52 | ∞ | 0.93 (0.86–1.0) |
| PM | RBS ≥15 | 0.43 (0.33–0.55) | 0.93 (0.87–0.96) | 0.41 | 0.61 | 0.73 | 0.81 | 0.86 | 0.94 | 0.87 | 0.79 | 0.71 | 0.62 | 6.3 (3.3–12.0) | 0.61 (0.50–0.73) |
| PM | RBS ≥16 | 0.33 (0.23–0.44) | 0.94 (0.89–0.97) | 0.40 | 0.60 | 0.72 | 0.80 | 0.85 | 0.93 | 0.85 | 0.77 | 0.68 | 0.58 | 5.9 (2.8–12.4) | 0.71 (0.61–0.83) |
| PM | RBS ≥17 | 0.20 (0.13–0.31) | 0.96 (0.91–0.98) | 0.35 | 0.55 | 0.68 | 0.77 | 0.83 | 0.92 | 0.83 | 0.74 | 0.64 | 0.55 | 4.9 (2.0–12.1) | 0.83 (0.74–0.93) |
| PM | RBS ≥18 | 0.14 (0.08–0.24) | 0.96 (0.91–0.98) | 0.28 | 0.47 | 0.60 | 0.70 | 0.78 | 0.91 | 0.82 | 0.72 | 0.63 | 0.53 | 3.5 (1.4–9.0) | 0.89 (0.82–0.98) |
| PM | RBS ≥19 | 0.08 (0.04–0.17) | 0.97 (0.93–0.99) | 0.25 | 0.43 | 0.57 | 0.67 | 0.75 | 0.91 | 0.81 | 0.71 | 0.61 | 0.52 | 3.1 (0.9–10.1) | 0.94 (0.88–1.0) |
| PM | RBS ≥20 | 0.04 (0.01–0.11) | 0.98 (0.94–0.99) | 0.16 | 0.30 | 0.43 | 0.54 | 0.64 | 0.90 | 0.80 | 0.70 | 0.60 | 0.50 | 1.7 (0.36–8.5) | 0.98 (0.94–1.0) |
| PM | RBS ≥21 | 0.01 (0.00–0.07) | 1.0 (0.97–1.0) | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.90 | 0.80 | 0.70 | 0.60 | 0.50 | ∞ | 0.98 (0.96–1.0) |
| Combined | RBS ≥15 | 0.52 (0.43–0.60) | 0.93 (0.87–0.96) | 0.45 | 0.65 | 0.76 | 0.83 | 0.88 | 0.95 | 0.88 | 0.82 | 0.74 | 0.66 | 7.5 (4.0–13.9) | 0.52 (0.44–0.61) |
| Combined | RBS ≥16 | 0.42 (0.34–0.50) | 0.94 (0.89–0.97) | 0.45 | 0.65 | 0.76 | 0.83 | 0.88 | 0.94 | 0.87 | 0.79 | 0.71 | 0.62 | 7.6 (3.8–15.3) | 0.61 (0.54–0.70) |
| Combined | RBS ≥17 | 0.28 (0.22–0.36) | 0.96 (0.91–0.98) | 0.43 | 0.63 | 0.75 | 0.82 | 0.87 | 0.92 | 0.84 | 0.76 | 0.67 | 0.57 | 6.9 (3.0–15.6) | 0.75 (0.68–0.83) |
| Combined | RBS ≥18 | 0.21 (0.15–0.29) | 0.96 (0.91–0.98) | 0.37 | 0.57 | 0.69 | 0.78 | 0.84 | 0.92 | 0.83 | 0.74 | 0.65 | 0.55 | 5.1 (2.2–11.9) | 0.82 (0.76–0.89) |
| Combined | RBS ≥19 | 0.13 (0.08–0.19) | 0.97 (0.93–0.99) | 0.34 | 0.54 | 0.66 | 0.75 | 0.82 | 0.91 | 0.82 | 0.72 | 0.63 | 0.53 | 4.7 (1.6–13.4) | 0.90 (0.84–0.95) |
| Combined | RBS ≥20 | 0.09 (0.05–0.15) | 0.98 (0.94–0.99) | 0.32 | 0.52 | 0.65 | 0.74 | 0.81 | 0.91 | 0.81 | 0.72 | 0.62 | 0.52 | 4.4 (1.3–14.9) | 0.93 (0.88–0.98) |
| Combined | RBS ≥21 | 0.05 (0.02–0.10) | 1.0 (0.97–1.0) | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.90 | 0.81 | 0.71 | 0.61 | 0.51 | ∞ | 0.95 (0.91–0.98) |

Note: PPV and NPV columns are headed by the base rate to which they apply. DM = definite malingering, PDM = probable to definite malingering, PM = probable malingering, Combined = combined malingering groups.
(a) T-score equivalents for raw scores: 15 = 92, 16 = 97, 17 = 101, 18 = 105, 19 = 109, 20 = 114, 21 = 118.
Table 5. Sensitivity, specificity, positive/negative predictive value (PPV/NPV), and likelihood ratios for cutoff scores for the MMPI-2 Symptom Validity Scale (FBS)

| Group | Cutoff | Sensitivity (95% CI) | Specificity (95% CI) | PPV 0.10 | PPV 0.20 | PPV 0.30 | PPV 0.40 | PPV 0.50 | NPV 0.10 | NPV 0.20 | NPV 0.30 | NPV 0.40 | NPV 0.50 | LR+ (95% CI) | LR− (95% CI) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DM | FBS ≥25 | 0.43 (0.25–0.63) | 0.92 (0.87–0.96) | 0.39 | 0.59 | 0.71 | 0.79 | 0.85 | 0.94 | 0.87 | 0.79 | 0.71 | 0.62 | 5.6 (2.8–11.5) | 0.62 (0.45–0.85) |
| DM | FBS ≥26 | 0.39 (0.22–0.59) | 0.97 (0.92–0.99) | 0.56 | 0.74 | 0.83 | 0.88 | 0.92 | 0.93 | 0.86 | 0.79 | 0.70 | 0.61 | 11.4 (4.3–30.3) | 0.63 (0.47–0.85) |
| DM | FBS ≥27/28 | 0.32 (0.17–0.52) | 0.99 (0.95–1.0) | 0.72 | 0.85 | 0.91 | 0.94 | 0.96 | 0.93 | 0.85 | 0.77 | 0.69 | 0.59 | 23.3 (5.3–102.1) | 0.69 (0.53–0.89) |
| DM | FBS ≥29 | 0.32 (0.17–0.52) | 0.99 (0.96–1.0) | 0.84 | 0.92 | 0.95 | 0.97 | 0.98 | 0.93 | 0.85 | 0.77 | 0.69 | 0.59 | 46.6 (6.1–353.4) | 0.68 (0.53–0.88) |
| DM | FBS ≥30 | 0.18 (0.07–0.38) | 0.99 (0.96–1.0) | 0.74 | 0.87 | 0.92 | 0.95 | 0.96 | 0.92 | 0.83 | 0.74 | 0.64 | 0.55 | 25.9 (3.1–213.3) | 0.83 (0.70–0.98) |
| DM | FBS ≥31 | 0.18 (0.07–0.38) | 1.0 (0.97–1.0) | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.92 | 0.83 | 0.74 | 0.65 | 0.55 | ∞ | 0.82 (0.69–0.98) |
| PDM | FBS ≥25 | 0.52 (0.37–0.67) | 0.92 (0.87–0.96) | 0.43 | 0.63 | 0.75 | 0.82 | 0.87 | 0.95 | 0.89 | 0.82 | 0.74 | 0.66 | 6.9 (3.7–13.0) | 0.52 (0.38–0.70) |
| PDM | FBS ≥26 | 0.50 (0.35–0.65) | 0.97 (0.92–0.99) | 0.62 | 0.78 | 0.86 | 0.91 | 0.94 | 0.95 | 0.89 | 0.82 | 0.74 | 0.66 | 14.5 (5.8–36.0) | 0.52 (0.39–0.70) |
| PDM | FBS ≥27 | 0.41 (0.27–0.57) | 0.99 (0.95–1.0) | 0.77 | 0.88 | 0.93 | 0.95 | 0.97 | 0.94 | 0.87 | 0.80 | 0.71 | 0.63 | 29.7 (7.2–122.9) | 0.60 (0.47–0.77) |
| PDM | FBS ≥28 | 0.34 (0.21–0.50) | 0.99 (0.95–1.0) | 0.73 | 0.86 | 0.91 | 0.94 | 0.96 | 0.93 | 0.86 | 0.78 | 0.69 | 0.60 | 24.7 (5.9–103.9) | 0.67 (0.54–0.83) |
| PDM | FBS ≥29 | 0.27 (0.15–0.43) | 0.99 (0.96–1.0) | 0.81 | 0.91 | 0.94 | 0.96 | 0.98 | 0.92 | 0.85 | 0.76 | 0.67 | 0.58 | 39.5 (5.3–295.7) | 0.73 (0.61–0.88) |
| PDM | FBS ≥30 | 0.23 (0.12–0.38) | 0.99 (0.96–1.0) | 0.79 | 0.89 | 0.93 | 0.96 | 0.97 | 0.92 | 0.84 | 0.75 | 0.66 | 0.56 | 33.0 (4.3–250.4) | 0.78 (0.66–0.91) |
| PDM | FBS ≥31 | 0.20 (0.10–0.36) | 1.0 (0.97–1.0) | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.92 | 0.83 | 0.75 | 0.65 | 0.56 | ∞ | 0.80 (0.68–0.92) |
| PM | FBS ≥25 | 0.27 (0.18–0.38) | 0.92 (0.87–0.96) | 0.28 | 0.47 | 0.60 | 0.70 | 0.78 | 0.92 | 0.83 | 0.75 | 0.65 | 0.56 | 3.5 (1.8–6.8) | 0.80 (0.70–0.91) |
| PM | FBS ≥26 | 0.23 (0.15–0.34) | 0.97 (0.92–0.99) | 0.42 | 0.62 | 0.74 | 0.82 | 0.87 | 0.92 | 0.83 | 0.75 | 0.65 | 0.56 | 6.6 (2.6–17.1) | 0.80 (0.71–0.90) |
| PM | FBS ≥27 | 0.23 (0.15–0.34) | 0.99 (0.95–1.0) | 0.65 | 0.81 | 0.88 | 0.92 | 0.94 | 0.92 | 0.84 | 0.75 | 0.66 | 0.56 | 16.6 (4.0–69.5) | 0.78 (0.70–0.88) |
| PM | FBS ≥28 | 0.17 (0.10–0.27) | 0.99 (0.95–1.0) | 0.58 | 0.75 | 0.84 | 0.89 | 0.92 | 0.91 | 0.83 | 0.73 | 0.64 | 0.54 | 12.2 (2.8–52.5) | 0.84 (0.76–0.93) |
| PM | FBS ≥29 | 0.16 (0.09–0.26) | 0.99 (0.96–1.0) | 0.72 | 0.85 | 0.91 | 0.94 | 0.96 | 0.91 | 0.82 | 0.73 | 0.64 | 0.54 | 27.7 (3.0–170.5) | 0.85 (0.77–0.93) |
| PM | FBS ≥30 | 0.08 (0.04–0.17) | 0.99 (0.96–1.0) | 0.58 | 0.75 | 0.84 | 0.89 | 0.92 | 0.91 | 0.81 | 0.72 | 0.62 | 0.52 | 12.2 (1.5–97.7) | 0.92 (0.86–0.98) |
| PM | FBS ≥31 | 0.04 (0.01–0.11) | 1.0 (0.97–1.0) | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.90 | 0.81 | 0.71 | 0.61 | 0.51 | ∞ | 0.96 (0.92–1.0) |
| Combined | FBS ≥25 | 0.37 (0.29–0.45) | 0.92 (0.87–0.96) | 0.35 | 0.55 | 0.67 | 0.76 | 0.83 | 0.93 | 0.85 | 0.77 | 0.69 | 0.59 | 4.8 (2.6–8.9) | 0.68 (0.61–0.77) |
| Combined | FBS ≥26 | 0.34 (0.26–0.42) | 0.97 (0.92–0.99) | 0.52 | 0.71 | 0.81 | 0.87 | 0.91 | 0.93 | 0.85 | 0.77 | 0.69 | 0.59 | 9.7 (4.0–23.7) | 0.69 (0.62–0.77) |
| Combined | FBS ≥27 | 0.30 (0.23–0.38) | 0.99 (0.95–1.0) | 0.70 | 0.84 | 0.90 | 0.93 | 0.95 | 0.93 | 0.85 | 0.77 | 0.68 | 0.58 | 21.5 (5.3–87.0) | 0.71 (0.64–0.79) |
| Combined | FBS ≥28 | 0.25 (0.18–0.32) | 0.99 (0.95–1.0) | 0.66 | 0.81 | 0.88 | 0.92 | 0.95 | 0.92 | 0.84 | 0.75 | 0.66 | 0.57 | 17.8 (4.4–72.3) | 0.77 (0.70–0.84) |
| Combined | FBS ≥29 | 0.22 (0.16–0.29) | 0.99 (0.96–1.0) | 0.78 | 0.89 | 0.93 | 0.95 | 0.97 | 0.92 | 0.84 | 0.75 | 0.66 | 0.56 | 31.8 (4.4–229.4) | 0.79 (0.72–0.85) |
| Combined | FBS ≥30 | 0.14 (0.09–0.21) | 0.99 (0.96–1.0) | 0.69 | 0.84 | 0.90 | 0.93 | 0.95 | 0.91 | 0.82 | 0.73 | 0.63 | 0.54 | 20.6 (2.8–150.7) | 0.86 (0.81–0.92) |
| Combined | FBS ≥31 | 0.11 (0.07–0.17) | 1.0 (0.97–1.0) | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 0.91 | 0.82 | 0.72 | 0.63 | 0.53 | ∞ | 0.89 (0.84–0.94) |

Note: PPV and NPV columns are headed by the base rate to which they apply. DM = definite malingering, PDM = probable to definite malingering, PM = probable malingering, Combined = combined malingering groups.

With respect to cutoff scores based on specificities, an RBS cutoff of 15 produced a specificity of 0.93 for all malingering groups, and a score of 21 resulted in perfect specificity for all groups. An RBS cutoff of 15 had the highest sensitivity (0.71) for the DM group, and this was the highest sensitivity of all the C-S SVTs examined in this research. None of the RBS cutoffs resulted in a PPV of at least 0.90 at a base rate of 0.40, except for a cutoff of greater than or equal to 21. The PPV at a cutoff of greater than or equal to 21 was 1.0 for all malingering groups and at all base rates.
This was obtained at a specificity of 1.0, which will of course result in a PPV of 1.0 and an infinitely large LR+. An FBS cutoff of 25 resulted in a specificity of 0.92 across all groups, and a cutoff of 31 resulted in perfect specificity for all groups. An FBS cutoff of 25 produced the highest sensitivity (0.52), which was found for the PDM group. PPVs of at least 0.92 were found for scores ≥27 across all groups at a 0.40 base rate, except for the PDM group, where there was a PPV of 0.91 for a cutoff of 26. For FBS-r, the cutoffs with a specificity of at least 0.90 ranged from 17 (0.92) to 21 (1.0), with the highest sensitivity (0.59) for the PDM group (cutoff 17). A cutoff of 20 for the DM group and a cutoff of 18 for the PDM group had PPVs of 0.90. No other cutoffs had a PPV of at least 0.90, except for a cutoff of 21, which had a specificity of 1.0 for all groups, resulting in perfect PPVs. The range of cutoff scores for Fs was 6 to 10, with specificities greater than or equal to 0.94, and the highest sensitivity (0.46) was for the DM group. PPVs greater than or equal to 0.90 were found for cutoffs ≥7 for the DM group and ≥8 for the PDM group (except for a cutoff of 9 for the PDM group; PPV = 0.81). A cutoff of 10 resulted in perfect specificities for all groups. For HHI, cutoffs ranged from 12 to 15 (15 is the maximum score for HHI), with specificities across all groups greater than or equal to 0.94. A cutoff of 15 had specificities of 1.0 for all groups. The highest sensitivity (0.61) occurred for a cutoff of 12 for the DM group. Cutoffs ≥14 produced PPVs ≥0.90. The cutoffs for HHI-r ranged from 9 to 11 (11 is the maximum score for HHI-r). Specificities were 0.93 or greater for all cutoffs in all groups, and no cutoffs had a specificity of 1.0. The highest specificity (0.99) was for a cutoff of 11 in all groups. The highest sensitivity was 0.57 for the DM group. PPVs of 0.90 or greater occurred only for a cutoff of 11.