MMPI-2 short form proposal: CAUTION

Archives of Clinical Neuropsychology 18 (2003) 521–527

Carlton S. Gass, Camille Gonzalez
Neuropsychology Division, Psychology Service (116-B), Veterans Affairs Medical Center, 1201 N.W. 16th Street, Miami, FL 33125, USA

Accepted 15 May 2002

Abstract

The Minnesota Multiphasic Personality Inventory-2 (MMPI-2) is widely used in neuropsychology, though its length (567 items) is sometimes prohibitive. This study investigated some psychometric characteristics of the 180-item version of the MMPI-2 (Dahlstrom & Archer, 2000) in order to delineate its strengths, limitations, and appropriate scope of clinical application. Limited reliability and poor predictive accuracy were recently reported for many of the MMPI-2 short-form scales in a study that used 205 brain-injured patients. In the present investigation, we used a psychiatric sample (N = 186) with normal neurological findings to examine short-form accuracy in predicting basic scale scores and profile code types, identifying high-point scales, and classifying scores as pathological (T ≥ 65) or normal-range. The results suggest that, even as applied to neurologically normal individuals, the proposed short form of the MMPI-2 is unreliable for predicting clinical code types, identifying the high-point scale, or predicting the scores on most of the basic scales. In contrast, this short form can be used to predict whether the full-scale scores fall within the pathological range (T ≥ 65). These findings suggest that clinicians might be able to salvage a small amount of information from the shortened (180-item) version of the MMPI-2 when MMPI-2 protocols are incomplete. However, clinicians should not use a standard interpretive approach with this test, and routine clinical application is unwarranted. Future evaluations of short-form validity should provide a more detailed examination of individual protocols, including an analysis of the frequency of accurate prediction of full-form scores.
© 2002 National Academy of Neuropsychology. Published by Elsevier Science Ltd. All rights reserved.

Keywords: MMPI-2; Brain-injured sample; Short- and full-form scores

Corresponding author. Tel.: +1-305-324-3215. E-mail address: gass.carlton@miami.va.gov (C.S. Gass).
0887-6177/02/$ - see front matter © 2002 National Academy of Neuropsychology. PII: S0887-6177(02)00160-9

A shortened version of the Minnesota Multiphasic Personality Inventory-2 (MMPI-2; Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 1989), consisting of the first 180 items, was recently devised by Dahlstrom and Archer (2000) and recommended for use under special circumstances. Such circumstances primarily involve an examinee's failure to complete the entire MMPI-2, or even the first 370 items that are sufficient for scoring the basic validity and clinical scales. As long as the first 180 items are completed, the clinician can use Table 1 in Dahlstrom and Archer's article to prorate the obtained raw scores and then determine their corresponding T scores based on the MMPI-2 normative data. The authors emphasized that although administration of the complete or abbreviated (370-item) versions of the MMPI-2 is preferable, a shortened version might provide clinically useful information in circumstances where the standard version cannot be administered.

Table 1
Product-moment correlations between 180-item prorated raw scores and full-scale raw scores on the basic MMPI-2 scales

Scale   Current sample   Brain-injured sample   Restandardization sample   Psychiatric sample
        (N = 186)        (N = 205)              (N = 2,600)                (N = 632)
L       95               93                     94                         94
F       91               90                     86                         95
K       93               88                     89                         91
Hs      97               97                     98                         99
D       95               96                     93                         96
Hy      96               97                     93                         96
Pd      90               87                     90                         93
Mf      80               77                     89                         82
Pa      87               88                     78                         90
Pt      89               88                     89                         94
Sc      91               90                     91                         95
Ma      81               83                     88                         88
Si      86               82                     85                         86

Note. Decimals omitted from correlations. MMPI-2: Minnesota Multiphasic Personality Inventory-2.

Validity data on the 180-item version (MMPI-2-180) were provided by Dahlstrom and Archer (2000) for clinicians to use as a benchmark for what descriptive information may be retained, and what may be lost, when the shortened version of the MMPI-2 is used in place of a full administration (p. 136). Using the protocols of the 2,600 men and women in the MMPI-2 restandardization sample, they examined the accuracy of short-form scores as predictors of scores on the complete MMPI-2 basic profile. High correlations between short- and full-form scores were obtained, ranging from .78 (scale 6) to .94 (L scale).
These positive findings were cross-validated on a sample of 632 psychiatric inpatients (rs between .82 and .99). In addition to demonstrating a strong linear relationship between scores on shortened and complete forms, they found very similar composite (mean) profiles. That is, the mean differences between prorated and full-scale scores were small, within three raw score points across the basic validity and clinical scales in both the normative and psychiatric samples. Collectively, the two-fold findings of strong linear relationships and accurate mean score predictions appeared to support the validity of the 180-item version of the MMPI-2.

However, a closer investigation of individual protocols casts doubt on this conclusion. Code type interpretation based on the shortened version is problematic, as evidenced by poor congruence in two-point codes (only one-third) and peak scores (one-half) (Dahlstrom & Archer, 2000). Similar results were reported by Gass and Luis (2001) using a sample of 205 brain-injured patients. Such disparity in code types and peak scores should discourage clinicians who use the MMPI-2-180 from engaging in configural interpretation of the basic clinical profile. This frequent lack of congruence within individual cases suggests a degree of inaccuracy in short-form prediction that is not evident when only mean scores are compared. Short-form results might appear to be accurate when scores are averaged over many individuals. However, the clinician dealing with individual cases needs to know the probability that, for any given individual, a short-form score will equal or approximate the score obtained from a full administration (Butcher, Kendall, & Hoffman, 1980). Unless this probability is reasonably high, the problem of short-form interpretation extends well beyond code types and peak scores, and into the broader realm of interpreting scores on the individual scales.

The frequency of accurate score prediction using the MMPI-2-180 was examined in a sample of 205 brain-injured patients (Gass & Luis, 2001). Using a 5 T-score standard of accuracy (±5T), the results indicated relatively low rates of precision (<60%) on scales F, 3, 4, 5, 6, 7, 8, 9, and 0. Even using a very generous margin of error (±10T), scales 6, 7, and 8 still showed error rates exceeding one-third of cases. In contrast, scores within 5T were obtained in over 80% of cases on the Lie scale and on scale 1, both of which have a large proportion of items within the first 180 items. Overall, the generally poor performance of the 180-item short form might be partly sample specific. This patient sample was composed of individuals with various types of brain dysfunction, including stroke, traumatic brain damage, and neurodegenerative disease.
In the present study we extended the investigation of MMPI-2-180 validity to a predominantly psychiatric sample of patients who had negative neurodiagnostic findings and an absence of any history of brain dysfunction.

1. Methods

Participants in this study were 186 patients at the Miami Veterans Affairs Medical Center (VAMC). These were 167 male and 19 female veterans who were referred for neuropsychological evaluation by various services, including Psychiatry, Primary Care, and Neurology. The mean age of the sample was 52 years (S.D. = 14.3, range = 19–81). Their average level of education was 12.1 years (S.D. = 2.3), and their mean Full Scale IQ on the WAIS-R or WAIS-III was 99 (S.D. = 13.3). All of these subjects underwent a structured interview and all denied any history of stroke, concussion, anoxic events, tumor, focal neurological symptoms, encephalitis, brain surgery or other neurological conditions. In addition, they had negative diagnostic findings based on CT scan or MRI, laboratory studies, and, in many cases, neurological examination. Diagnostic conditions included depressive disorders (26%), somatoform disorders (20%), PTSD (10%), adjustment disorders (10%), other anxiety disorders (8%), and other (26%).

All participants were administered the standardized audiotaped MMPI-2 (Butcher et al., 1989) produced by the University of Minnesota Press as part of a comprehensive evaluation. MMPI-2 profiles were excluded from the study if they satisfied any of the following invalidity criteria: Cannot Say > 10, F ≥ 120T, VRIN > 80 or TRIN > 80. MMPI-2 protocols were analyzed using a computerized scoring system and re-scored using an algorithm based on the 180-item short form (Dahlstrom & Archer, 2000).

2. Results

The accuracy of the MMPI-2-180 was investigated using several statistical procedures. First, Pearson product-moment correlations were computed between the prorated raw scores based on the MMPI-2-180 and the full-scale raw scores. The results are shown in Table 1 adjacent to the previous findings reported by Gass and Luis (2001) and by Dahlstrom and Archer (2000) based on their normative and psychiatric samples. As in previous studies, short- and full-version score correlations were high.

The positive correlational findings, while revealing a strong linear relationship, do not directly address the possibility of underestimating or overestimating full-scale scores. Following the method of Dahlstrom and Archer (2000), this issue was examined by measuring the mean difference between the predicted and obtained MMPI-2 scores. As shown in Table 2, the MMPI-2-180 overestimated the T-scores on scales 8 and 0 by an average of eight and four points, respectively, while underestimating scores on scales 3 and 6 by an average of five and four T-score points, respectively. Also reported in Table 2 are the standard deviations which, like the mean scores, directly reflect the precision of score estimation. For example, in this sample, it appears that the MMPI-2-180 overestimates scale 8 by 8 T-score points on average, but the standard deviation (about 10) suggests that about 68% of the protocols will have scores ranging from a 2-point underestimate to an 18-point overestimate. To the extent that these differences are normally distributed, we can surmise that about 32% of the sample had estimated scores that were beyond this range. An inspection of the standard deviations in Table 2 reveals greater precision of estimation using scales 1, 2, 3, L and K, and less precision with scales 6, 7, and 8.

Table 2
Mean T-score differences and standard deviations based on full-length versus 180-item MMPI-2 administrations in the neuropsychological samples

         Current sample      Brain-injured sample a (N = 205)
Scale    M        S.D.       M        S.D.
L        1.19     4.11       1.11     4.33
F        2.29     8.50       1.70     8.34
K        1.45     4.36       0.84     4.89
Hs       1.97     3.61       2.25     3.62
D        0.98     5.02       1.75     4.89
Hy       4.74     5.35       5.07     4.68
Pd       0.13     6.74       1.03     6.76
Mf       3.54     7.39       3.08     6.10
Pa       4.08     10.34      5.03     8.74
Pt       3.02     9.01       2.67     9.40
Sc       8.33     10.11      5.99     9.94
Ma       3.09     8.42       1.01     7.67
Si       4.31     7.45       5.18     7.26

Note. A positive number indicates an overestimate of the full MMPI-2 scale score.
a From Gass and Luis (2001).

Code type congruence was examined. A comparative analysis of profile configurations revealed complete congruence between the two versions in 22% (40/186) of the cases. With the inclusion of profiles that showed partial congruence (i.e., protocols in which a third scale had a score equivalent to the second scale of the original code type), 32% (60/186) of the cases showed agreement. This is similar to previously reported findings. Dahlstrom and Archer (2000) also reported that the high-point (or peak) scale was equivalent in half of the cases. Our data revealed peak scale congruence in 54% of cases (100/186). Gass and Luis (2001) reported equivalent peak scale results in 44% (91/205) of their brain-injury sample.

From a practical standpoint, it is worthwhile to know the likelihood that a given short-form score will predict the score on a completed MMPI-2 scale within 5 or 10 T-score points. An analysis of this sample's frequency of estimation accuracy provides the clinician with a basis for determining the likelihood of accurate prediction using the MMPI-2-180. We initially examined the number of short-form protocols that predicted the full-scale T-score within plus-or-minus five points. The same analysis was used to identify the probability of estimation within ±10 T-score points. The results, shown in Table 3 with previous findings, indicate substantial variability across the basic scales in estimation precision. For example, accuracy within a margin of ±5T occurred on scale 1 in 81% of the sample, but on scale 8 in only 31% of the subjects.
As Table 3 indicates, these results are similar to the findings obtained using a brain-injury sample.

Table 3
Frequency (percentage) of estimations within 5 and 10 T-score points of full MMPI-2 scale score

         Short-form accuracy within 5T           Short-form accuracy within 10T
Scale    Current sample   Brain-injured          Current sample   Brain-injured
                          sample (N = 205) a                      sample (N = 205) a
L        91               91                     99               99
F        46               44                     77               77
K        73               70                     98               97
Hs       81               81                     97               97
D        73               70                     95               94
Hy       54               56                     88               90
Pd       53               52                     91               87
Mf       53               51                     85               87
Pa       35               47                     58               64
Pt       46               38                     75               64
Sc       31               34                     53               62
Ma       42               53                     77               83
Si       50               44                     76               77

Note. Decimals omitted from correlations.
a From Gass and Luis (2001).
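The frequency analysis summarized in Table 3 can be sketched in a few lines of Python. This is only an illustration of the idea, not the scoring procedure used in the study: the function name `within_tolerance_rate` and the five pairs of T-scores are hypothetical, and the tolerance is assumed to be inclusive (a difference of exactly 5 counts as within ±5T).

```python
from typing import Sequence


def within_tolerance_rate(short_t: Sequence[float],
                          full_t: Sequence[float],
                          tolerance: float) -> float:
    """Percentage of protocols whose short-form T-score falls within
    +/- `tolerance` points of the full-form T-score on one scale."""
    if len(short_t) != len(full_t) or len(short_t) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    hits = sum(1 for s, f in zip(short_t, full_t) if abs(s - f) <= tolerance)
    return 100.0 * hits / len(short_t)


# Made-up T-scores for five protocols on a single scale (not study data).
short_form = [62, 70, 55, 81, 48]
full_form = [60, 64, 57, 68, 49]

print(within_tolerance_rate(short_form, full_form, 5))   # share within +/-5T
print(within_tolerance_rate(short_form, full_form, 10))  # share within +/-10T
```

Run once per scale, a function of this kind would yield one row of percentages comparable in form to those in Table 3.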

In brain-injured individuals, the 180-item short form was accurate, not in score prediction, but in predicting the presence or absence of clinically significant elevations (T ≥ 65) on the basic scales. For example, on the short form a score exceeding 65T on scale 8 was predictive of a score of 65T or greater on scale 8 in the full MMPI-2 version. A low short-form score similarly predicted a low score (T < 65) on the full-scale counterpart on the standard MMPI-2. We tested this approach using the current sample. The results of this analysis (Table 4) replicated the previous findings that were obtained using a brain-injured sample. On any given scale on the MMPI-2-180, a high score is strongly predictive of a clinically significant elevation on the full version of that scale.

Table 4
Accuracy in predicting clinically significant elevations (T ≥ 65) on the basic scales

Scale    Accuracy (%)    False-positive (%)    False-negative (%)
L        88              2                     10
F        88              10                    2
K        97              2                     1
Hs       91              1                     8
D        91              1                     8
Hy       88              1                     11
Pd       89              6                     5
Mf       94              3                     3
Pa       86              3                     11
Pt       87              8                     5
Sc       92              7                     1
Ma       78              18                    4
Si       82              16                    2

3. Discussion

Shortened versions of tests are desirable if they are sufficiently reliable and valid for providing useful clinical information. Unless a short form has its own established behavioral correlates, its value should be determined, in part, on the basis of its likelihood of accurate full-form score prediction for individual examinees. Reliance on correlational analyses, including regression equations (Lalone, Axelrod, & Schinka, 2000), between short- and full-form scores is insufficient because it overlooks the degree of absolute score agreement. Disparate scores can be highly correlated, as was the case in the present sample with scale 8.
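The point that disparate scores can be highly correlated is easy to verify with a toy example. The sketch below uses made-up T-scores in which every short-form estimate runs a constant 12 points too high; the helper `pearson_r` is written out by hand for illustration and is not taken from the study's scoring software.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical full-form T-scores and short-form estimates that are
# uniformly 12 points too high (illustrative numbers, not study data).
full_form = [50, 55, 60, 65, 70, 75]
short_form = [f + 12 for f in full_form]

r = pearson_r(short_form, full_form)
within_5 = sum(1 for s, f in zip(short_form, full_form) if abs(s - f) <= 5)

print(f"r = {r:.2f}")
print(f"estimates within 5T: {within_5} of {len(full_form)}")
```

Here the correlation is perfect even though no single estimate falls within 5 T-score points of its full-form value, which is why a correlation coefficient alone says little about absolute agreement.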
Further appeal to mean score agreement across individuals on the short and full form is also insufficient for establishing validity, because this criterion neglects predictive error in individual cases. Average short- and complete-form scores have little relevance for any particular individual's short- and complete-form scores. Mean scores have little value in this context because inaccurate protocols, when averaged over many individuals, can yield identical mean scores. Unless research has established extratest behavioral correlates for a short form, the frequency of accurate individual full-form score prediction is essential for determining short-form validity.
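A small worked example shows how identical mean scores can conceal large individual errors. The sketch below is illustrative only: the scores are invented, the function name `difference_summary` is hypothetical, and the sample standard deviation (n − 1 denominator) is an assumption, since the article does not state which formula underlies Table 2.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def difference_summary(short_t: Sequence[float],
                       full_t: Sequence[float]) -> Tuple[float, float]:
    """Mean and sample S.D. of (short-form minus full-form) T-scores.
    A positive mean indicates an overestimate, as in Table 2."""
    diffs = [s - f for s, f in zip(short_t, full_t)]
    return mean(diffs), stdev(diffs)


# Invented scores: every estimate is off by 12 points, but the errors
# alternate in sign and cancel out in the average (not study data).
full_form = [55, 60, 65, 70, 75, 80]
short_form = [67, 48, 77, 58, 87, 68]

m, sd = difference_summary(short_form, full_form)
print(mean(short_form), mean(full_form))  # the two means are identical
print(m, round(sd, 2))                    # mean error is zero, S.D. is large
print(m - sd, m + sd)                     # approximate 68% band if errors are normal
```

Applying the same mean ± 1 S.D. arithmetic to the scale 8 values in Table 2 (8.33 and 10.11) gives a band from about −1.8 to about 18.4, which matches the 2-point underestimate to 18-point overestimate described in the Results.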

Despite its advantage as an easier test to administer and as a time saver, the MMPI-2-180 appears to be a poor substitute for the standard or abbreviated (370-item) MMPI-2. This conclusion appears to apply to neurologically intact individuals as well as to brain-impaired examinees. The shortened version does not provide a sound basis for interpreting profile code types or other aspects of the profile configuration that are customarily used.

The predictive validity of the shortened scales varies considerably when cases are examined individually. The prorated scores provide a good estimation of the full-form scores on scales L and 1. In contrast, estimated scores on most scales are likely to miss by more than 5T in about half of the cases. Scales 6 and 8 are very unreliable, as the clinician has about a 50:50 chance of obtaining a score that is more than 10 T-score points in error. However, the modest goal of predicting within a T-score range of 10 points appears to be attainable on the neurotic triad (1, 2, and 3) and on scales 4, 5, and 9. Thus, protocols with extremely high scores on these scales might provide a basis for some clinical inferences. Unfortunately, prediction within a 10-point range involves a degree of imprecision that, in most cases, renders accurate configural-based profile interpretation impossible. Using this liberal criterion, any two scales on the short form that have identical scores could be as much as 20 points apart on the full-scale MMPI-2 version.

The MMPI-2-180 might be useful as a means of salvaging a very limited amount of information from an unfinished MMPI-2 administration. In the event that an MMPI-2 protocol cannot be completed through item 370, scores on the 180-item version can be used to estimate whether or not a full-scale score would fall above or below 65T.
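The above-or-below-65T comparison can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `classify_elevations` and the ten pairs of scores are hypothetical, and the false-positive and false-negative rates are expressed as percentages of all cases, an interpretation suggested by the fact that each row of Table 4 sums to 100.

```python
from typing import Dict, Sequence


def classify_elevations(short_t: Sequence[float],
                        full_t: Sequence[float],
                        cutoff: float = 65.0) -> Dict[str, float]:
    """Compare short- and full-form scores on whether each reaches the
    clinical cutoff (T >= 65); rates are percentages of all cases."""
    n = len(short_t)
    agree = false_pos = false_neg = 0
    for s, f in zip(short_t, full_t):
        short_high, full_high = s >= cutoff, f >= cutoff
        if short_high == full_high:
            agree += 1        # both elevated or both in the normal range
        elif short_high:
            false_pos += 1    # short form elevated, full form not
        else:
            false_neg += 1    # full form elevated, short form not
    return {
        "accuracy": 100.0 * agree / n,
        "false_positive": 100.0 * false_pos / n,
        "false_negative": 100.0 * false_neg / n,
    }


# Made-up T-scores for ten protocols on one scale (not study data).
short_form = [70, 66, 50, 64, 72, 58, 61, 80, 45, 67]
full_form = [68, 60, 52, 66, 75, 55, 63, 77, 47, 65]

print(classify_elevations(short_form, full_form))
```

A summary of this kind reports only which side of the cutoff a score falls on; it carries no information about the size of the score itself.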
This information might be useful in alerting a clinician to possible problem areas. However, in the absence of the actual score or reliable information regarding its position relative to scores on other scales, the inferences based on this procedure are tentative. The unavailability of subscales to help guide interpretation is a further limitation of the short form.

Although most of the existing data suggest that the MMPI-2-180 has unreliable linkage to the MMPI-2, psychometric studies could address the MMPI-2-180 in its own right. Little is known regarding the temporal stability of its abbreviated scales or what they actually measure (if anything). There is presently no empirical justification for routine use of the 180-item short form. If the preferred test must be shorter than the MMPI-2 (370-item abbreviated format), the clinician should choose from among other tests that have established psychometric credentials.

References

Butcher, J. N., Kendall, P. C., & Hoffman, N. (1980). MMPI short forms: Caution. Journal of Consulting and Clinical Psychology, 48, 275–278.

Butcher, J. N., Dahlstrom, W. G., Graham, J. R., Tellegen, A., & Kaemmer, B. (1989). Minnesota Multiphasic Personality Inventory-2 (MMPI-2): Manual for administration and scoring. Minneapolis, MN: University of Minnesota Press.

Dahlstrom, W. G., & Archer, R. P. (2000). A shortened version of the MMPI-2. Assessment, 7, 131–137.

Gass, C. S., & Luis, C. A. (2001). MMPI-2 short form: Psychometric characteristics in a neuropsychological setting. Assessment, 8, 213–219.

Lalone, L. V., Axelrod, B. N., & Schinka, J. A. (2000). Prediction of MMPI-2 clinical scales for incomplete protocols. Paper presented at the 108th Annual Convention of the American Psychological Association.