Brief Report: Interrater Reliability of Clinical Diagnosis and DSM-IV Criteria for Autistic Disorder: Results of the DSM-IV Autism Field Trial

Journal of Autism and Developmental Disorders, Vol. 30, No. 2, 2000

Ami Klin,1,2 Jason Lang,1 Domenic V. Cicchetti,1 and Fred R. Volkmar1

1 Yale Child Study Center, New Haven, Connecticut.
2 Address all correspondence to Ami Klin, Yale Child Study Center, 230 South Frontage Road, New Haven, Connecticut 06520; e-mail: Ami.Klin@Yale.Edu

INTRODUCTION

One of the most important goals of diagnostic classification systems such as the Diagnostic and Statistical Manual of Mental Disorders, 4th ed. (DSM-IV; American Psychiatric Association [APA], 1994) is to enhance agreement on a specific diagnosis among clinicians with diverse backgrounds and levels of experience. Although autism has historically been one of the most reliably diagnosed disorders in child psychiatry (Mattison, Cantwell, Russell, & Will, 1979), some aspects of this and related pervasive developmental disorders (PDD) present challenges for diagnosis, particularly among less experienced clinicians. For example, there is a broad range of syndrome expression in terms of level of intellectual and communicative functioning, and symptoms change somewhat as a function of both age and developmental level (Lord, Pickles, McLennan, Rutter, et al., 1997; Volkmar, Klin, & Cohen, 1997). Knowledge of, and experience in appreciating, the manifestations of autism at different levels of developmental ability is central to the diagnostic process (Rutter, 1978).

An additional, and more recent, complexity is the addition of other explicitly defined categories to the PDD class of conditions. Before DSM-IV there were only two categories under the PDD class of disorders (autism and the residual category Pervasive Developmental Disorder Not Otherwise Specified; PDDNOS). DSM-IV now recognizes three additional disorders: Rett Disorder, Childhood Disintegrative Disorder, and Asperger Disorder, each of which must be differentiated from autism (Volkmar et al., 1994).

Finally, two other complexities should be noted. It has been increasingly recognized, both in clinical practice (Klin et al., 1997) and in recent epidemiological studies (Fombonne, 1998), that many children are now identified who have an autistic-like condition but do not present the classic syndrome of autism. In addition, as awareness of autism and related conditions has increased, more children who are cognitively higher functioning have been identified (Klin et al., 1997).

Despite advances in the neuroscience of autism, there are no biological markers for identifying the disorder. Consequently, the diagnostic process is still based on developmental history and behavioral observations made by clinicians. Although there are a number of excellent diagnostic instruments for autism (e.g., Lord, Rutter, & DiLavore, 1996; Lord, Rutter, & Le Couteur, 1994), the most reliable ones require specialized intensive training, and none substitutes for clinical expertise and experience (Lord et al., 1997). Whether or not one uses such instruments, the diagnostic assignment still depends on the adoption of consensual definitions as operationalized by DSM-IV or its international equivalent, the International Classification of Diseases, Tenth Revision (ICD-10; World Health Organization [WHO], 1993). The two systems are now conceptually identical (Volkmar et al., 1994). The extent to which the adoption of these systems yields reliable diagnoses therefore needs to be examined empirically.
Within this context, while the gold standard for the diagnosis of autism is the best clinical judgment of experienced clinicians (Spitzer & Williams, 1988), diagnostic assignments are probably made in the larger clinical community as often as by the smaller number of autism experts. Hence the importance of extending the empirical verification of the reliability of DSM-IV to the community of less experienced clinicians. In the very small number of studies that systematically examined diagnostic reliability among clinicians with different levels of experience (e.g., Goodman & Simonoff, 1991) or professional training (Perry, Veleno, & Factor, 1998), acceptable levels of agreement were reported. None of these studies, however, specifically addressed the reliability of diagnostic assignment based on the DSM-IV definition of autism among both experienced and inexperienced clinicians.

The present study uses diagnostic data collected during the DSM-IV Autism Field Trial (Volkmar et al., 1994) to answer four questions related to the issues outlined above: (a) What is the interrater reliability of clinician-assigned diagnosis of autism (i.e., without the use of DSM-IV criteria)? (b) What is the interrater reliability for the various DSM-IV criteria for autistic disorder? (c) What is the interrater reliability of DSM-IV-assigned diagnosis of autism (i.e., when clinicians rate each diagnostic criterion and the diagnosis of autism is assigned depending on whether or not the algorithm for autism is met)? and (d) How do these two diagnostic strategies compare? These four questions were examined in the context of comparisons between clinicians with more and less experience.

0162-3257/00/0400-0163$18.00/0 © 2000 Plenum Publishing Corporation

METHOD

The DSM-IV Field Trial

The DSM-IV Autism Field Trial was a collaborative project involving 13 sites in North America, 4 sites in Europe, and 4 sites in the Middle East, Asia, and Oceania (Volkmar et al., 1994). These sites provided diagnostic ratings on consecutive cases of individuals with either autism or another developmental disorder that would reasonably include autism in the differential diagnosis.
The study involved 977 rated cases with clinical (i.e., clinician-assigned) diagnoses of autism (n = 454), other (nonautistic) pervasive developmental disorders (PDD) (n = 240), and non-PDD disorders (e.g., primary diagnoses of language disorders or mental retardation; n = 283). The goal of the Field Trial was to derive the definition of autism for DSM-IV empirically, based on a series of analyses including reliability and validity considerations. Previous reports describe several aspects of the Field Trial in greater detail (Buitelaar, Van der Gaag, Klin, & Volkmar, 1999; Volkmar et al., 1994; Volkmar & Rutter, 1995).

Participants

Of the entire sample of 977 participants, 131 cases received diagnostic ratings by at least two clinicians for the purpose of assessing interrater reliability. Of the 131 cases, 62% had a clinician-assigned diagnosis of autism, 14% had a diagnosis of a nonautistic PDD, and 24% had a diagnosis of a non-PDD disorder; 71% were male and 29% were female; 42% were below age 5, 41% were between ages 5 and 10, 15% were between ages 10 and 20, and 2% were above age 20; 66% were Caucasian, 15% were of African origin (e.g., African American), 13% were Hispanic, and 3% were Asian, whereas the remainder had other race/ethnicity. Eighty-three clinicians rated at least one reliability case: 36% of these raters were male and 64% were female; 21% were below age 30, 46% were between ages 30 and 40, 25% were between ages 40 and 50, and 8% were above age 50; 53% were psychiatrists or residents in child psychiatry, 34% were psychologists or psychology trainees, and 13% were speech and language pathologists, nurses, social workers, or special educators. Of these clinicians, 51% had extensive experience in the assessment and diagnosis of autism (defined as involvement in the assessment and diagnosis of over 25 patients); the remaining 49% reported lesser degrees of experience (25% with 10 to 25 cases, and 24% with fewer than 10 cases).
Of the 131 reliability cases, 37% received diagnostic ratings by at least two experienced clinicians, whereas 83% had at least one experienced clinician involved.

Procedure

The overall clinical diagnosis was assigned before, and independently of, the clinicians' ratings of the various individual DSM-IV criteria for autism. As in the DSM-III-R autism field trial (Spitzer & Siegel, 1990), the diagnoses of experienced clinicians served as a first approximation of a diagnostic gold standard. Subsequently, clinicians completed the ratings for each of the potential DSM-IV criteria for autism, which had been developed on the basis of the results of the various literature reviews and data re-analyses that preceded the project (e.g., Szatmari, 1992; Volkmar, Cicchetti, Bregman, & Cohen, 1992). In addition, a standard data coding system was used to provide information on characteristics of patients (e.g., age, IQ, communicative ability, nature and quality of information available, and, at the discretion of the clinician, information on standard tests or assessment instruments). Similarly, standard forms were provided for raters to indicate clinician-assigned diagnosis and level of confidence, ratings for each of the DSM-IV criteria for autism, and the rater's own personal data (e.g., age, gender, experience). Coordinators at each site were provided with a summary of procedures, but no systematic training in application of the potential DSM-IV criteria was provided. Measures were taken to protect patient (and rater) confidentiality, and the research procedures had been approved by the institutional human investigation committees at the different sites. A series of quality control procedures for data entry and management was adopted (e.g., double entry and range checks, data audits). When ratings included missing data, site coordinators were asked to secure the information if possible. Of the entire DSM-IV sample (i.e., 977 cases), only 6 cases had to be excluded because multiple and major data points were missing.

RESULTS

Interrater Reliability of Clinical Diagnosis

Interrater reliability coefficients were obtained for primary clinician-assigned diagnoses. The kappa coefficient (Cohen, 1960) was used as the preferred chance-corrected measure of agreement for the dichotomous data (Fleiss, 1981). Table I lists kappas obtained for agreement between pairs of raters according to clinical experience and professional training, and overall levels of agreement on clinical diagnosis. These values are provided for case comparisons between autism and a non-PDD disorder, between autism and other developmental disorders (both non-PDD and nonautistic PDD), and between autism and nonautistic PDDs. Levels of clinical significance are defined as per Cicchetti and Sparrow's (1981) criteria. Values are ranked by kappa to clarify the patterns of agreement observed. It should be emphasized that kappas in groups with small ns must be interpreted with caution and are likely to be less stable than values derived from larger samples.

As might be expected, the more experienced raters exhibited excellent agreement, with the most disagreement over the more fine-grained distinctions between autism and other possible disorders in the PDD class. The same pattern was obtained, with slightly lower levels of agreement, between raters from different professional backgrounds and, with lower levels still, between experienced-inexperienced pairs of raters. It is important to note, however, that agreement was generally quite high. Even when inexperienced raters were compared with each other, agreement was reasonably good, with the exception, as expected, of the comparison of autism and other PDD categories, where the level of agreement was only fair.

Table I. Interrater Reliability for Clinician-Assigned Diagnoses by Diagnostic Group and Rater Experience/Professional Background (a)

| Rater groups | Autism vs. non-PDD: kappa | No. of cases | Clinical significance | Autism vs. other: kappa | No. of cases | Clinical significance | Autism vs. nonautistic PDD: kappa | No. of cases | Clinical significance |
|---|---|---|---|---|---|---|---|---|---|
| Experienced vs. Experienced | 1.00 | 44 | E | 0.94 | 48 | E | 0.85 | 40 | E |
| Psychologist vs. Psychiatrist | 1.00 | 38 | E | 0.86 | 45 | E | 0.67 | 33 | G |
| Inexperienced vs. Inexperienced | 1.00 | 14 | E | 0.79 | 19 | E | 0.41 | 11 | F |
| All reliability raters | 0.95 | 103 | E | 0.81 | 131 | E | 0.65 | 95 | G |
| Experienced vs. Inexperienced | 0.89 | 42 | E | 0.70 | 61 | G | 0.59 | 43 | F |

(a) Levels of clinical significance of kappa values were defined as follows (criteria as per Cicchetti & Sparrow, 1981): E = Excellent Agreement (kappa between 0.75 and 1.00); G = Good Agreement (kappa between 0.60 and 0.74); F = Fair Agreement (kappa between 0.40 and 0.59); P = Poor Agreement (kappa less than 0.40).

Interrater Reliability of DSM-IV Criteria for Autistic Disorder

Interrater reliability coefficients were also obtained for the potential DSM-IV criteria for autism. These criteria, which now form the definition of autistic disorder in DSM-IV (APA, 1994), are listed in Table II together with their respective kappa coefficients. Given that kappa is a chance-corrected coefficient, low coefficients may be a function of a high chance probability of agreement rather than of poor rates of observed agreement. Therefore, percentages of observed agreement (PO) were obtained for those criteria for which kappas were not in the Excellent or Good categories of clinical significance. The clinical significance of POs was defined as 90-100% Excellent Agreement, 80-89% Good Agreement, 70-79% Fair Agreement, and <70% Poor Agreement. Only criteria with kappa < .60 and PO < 80% were judged to be of poor or suboptimal reliability. Table II lists the DSM-IV criteria for autistic disorder and their corresponding kappas and clinical significance, as well as the POs and their clinical significance. As can be seen in Table II, kappas and POs were generally in the Good to Excellent range, and none of the criteria had poor reliability as defined above.

Table II. Kappas, Percentage of Observed Agreement (PO), and Their Clinical Significance for the DSM-IV Criteria for Autistic Disorder

| Criterion (a) | Kappa | Clinical significance (kappa) | PO | Clinical significance (PO) |
|---|---|---|---|---|
| 1A | 0.73 | Good | 0.89 | Good |
| 1B | 0.76 | Excellent | 0.93 | Excellent |
| 1C | 0.77 | Excellent | 0.89 | Good |
| 1D | 0.74 | Good | 0.90 | Excellent |
| 2A | 0.75 | Excellent | 0.90 | Excellent |
| 2B | 0.58 | Fair | 0.83 | Good |
| 2C | 0.79 | Excellent | 0.89 | Good |
| 2D | 0.71 | Good | 0.90 | Excellent |
| 3A | 0.77 | Excellent | 0.88 | Good |
| 3B | 0.63 | Good | 0.84 | Good |
| 3C | 0.69 | Good | 0.85 | Good |
| 3D | 0.64 | Good | 0.82 | Good |
| Onset | 0.66 | Good | 0.93 | Excellent |

(a) Criteria are listed in the order in which they appear in DSM-IV.

Interrater Reliability for DSM-IV-Assigned Diagnosis and How It Compares with Interrater Reliability for Clinician-Assigned Diagnosis

Interrater reliability coefficients for DSM-IV-assigned diagnosis (i.e., diagnostic assignment made using DSM-IV criteria) and clinician-assigned diagnosis (i.e., diagnostic assignment made without the use of DSM-IV criteria) were compared for pairs of experienced-experienced raters and inexperienced-inexperienced raters. Kappas and POs are presented in Table III. The interrater reliability coefficients for the pairs of experienced-experienced raters fell in the Excellent category of clinical significance for both the clinician-assigned and the DSM-IV-assigned diagnostic strategies. The level of agreement decreased somewhat with the use of DSM-IV criteria (though it remained in the Excellent range). This may be accounted for by the fact that experienced clinicians take into account a broader range of clinical phenomena beyond those captured and defined in the DSM-IV criteria for autistic disorder.
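The two agreement indices used in Tables I to III, kappa and PO, and the cut points used to label them can be sketched in a few lines of Python. This is a minimal illustration, not the program used in the Field Trial: the function names and the 20-case data set are made up for the example, and treating the lower bound of each published range as an inclusive cut point is an assumption. The made-up data show the point made above about chance correction, where two raters who both endorse a criterion for most cases can have a high PO but only a modest kappa.

```python
import math


def kappa_and_po(ratings_a, ratings_b):
    """Return (kappa, PO) for two raters' labels on the same cases.

    PO is the proportion of cases on which the raters agree; kappa is
    Cohen's (1960) chance-corrected agreement, (PO - PE) / (1 - PE).
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equally long, non-empty rating lists")
    n = len(ratings_a)
    po = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    categories = set(ratings_a) | set(ratings_b)
    # PE: agreement expected by chance from each rater's marginal rates.
    pe = sum((ratings_a.count(c) / n) * (ratings_b.count(c) / n) for c in categories)
    if pe == 1.0:
        return math.nan, po  # kappa is undefined when chance agreement is total
    return (po - pe) / (1.0 - pe), po


def kappa_significance(kappa):
    """Label kappa using the Cicchetti and Sparrow (1981) ranges (assumed inclusive lower bounds)."""
    if kappa >= 0.75:
        return "Excellent"
    if kappa >= 0.60:
        return "Good"
    if kappa >= 0.40:
        return "Fair"
    return "Poor"


def po_significance(po):
    """Label PO (as a proportion) using the 90/80/70% ranges given in the text."""
    if po >= 0.90:
        return "Excellent"
    if po >= 0.80:
        return "Good"
    if po >= 0.70:
        return "Fair"
    return "Poor"


def is_suboptimal(kappa, po):
    """A criterion is flagged only when kappa < .60 AND PO < 80%."""
    return kappa < 0.60 and po < 0.80


# Made-up ratings (1 = criterion present, 0 = absent) for 20 cases:
# 17 joint "present", 1 joint "absent", and 2 disagreements.
rater_a = [1] * 17 + [1, 0, 0]
rater_b = [1] * 17 + [0, 1, 0]
example_kappa, example_po = kappa_and_po(rater_a, rater_b)
# Here PO is 0.90 but kappa is only about 0.44, because chance agreement is 0.82.
print(round(example_kappa, 2), round(example_po, 2))
print(kappa_significance(example_kappa), po_significance(example_po))
```

Applied to criterion 2B in Table II (kappa 0.58, PO 0.83), `is_suboptimal` returns False, which matches the report's conclusion that no criterion had poor reliability.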
In contrast, the interrater reliability coefficients for the pairs of inexperienced-inexperienced raters, which fell in the Poor category for clinician-assigned diagnoses, were elevated to the uppermost level of the Fair category, or even to the Good category, of clinical significance when these raters used the DSM-IV criteria for making the diagnostic assignment. Therefore, the use of DSM-IV criteria appeared to improve interrater reliability of diagnosis. To examine further the changes in the reliability coefficients obtained in the comparison between the two diagnostic strategies, the statistical significance of the difference between the kappa coefficients was calculated within the rater pairs (Cicchetti & Heavens, 1981; Fleiss & Cicchetti, 1978). While the difference in kappas for the experienced-experienced raters was not statistically significant (Z = 1.35, ns), the difference in kappas for the inexperienced-inexperienced raters approached statistical significance (Z = 1.70, p = .089; p = .045 for a one-tailed test). Although this comparison only bordered on statistical significance, it is of interest that there was a clinically significant improvement in interrater reliability among the inexperienced raters when they used the DSM-IV criteria, going, as noted, from Poor to Fair/Good rates of agreement. It is likely, therefore, that in contrast with the experienced raters, the use of DSM-IV criteria by inexperienced raters improved their clinical considerations and, in turn, their diagnostic reliability.

DISCUSSION

This study focuses on issues of interrater reliability in the diagnosis of autistic disorder. Based on reliability analyses of diagnostic data collected on cases rated by two clinicians in the context of the DSM-IV Autism Field Trial, a series of important questions could be clarified.
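As a check on the kappa comparison reported in the Results, the p values that go with the two Z statistics can be recovered from the standard normal distribution. The sketch below only converts a given Z into two-tailed and one-tailed p values; it does not reproduce the Cicchetti and Heavens (1981) program that produced the Z values, and the function name is hypothetical.

```python
import math


def p_values_from_z(z):
    """Return (two_tailed_p, one_tailed_p) for a standard normal Z statistic."""
    # Upper-tail area of the standard normal: 0.5 * erfc(|z| / sqrt(2)).
    one_tailed = 0.5 * math.erfc(abs(z) / math.sqrt(2.0))
    return 2.0 * one_tailed, one_tailed


# Z values reported for the inexperienced and experienced rater pairs.
two_tailed_inexp, one_tailed_inexp = p_values_from_z(1.70)
two_tailed_exp, _ = p_values_from_z(1.35)
print(round(two_tailed_inexp, 3), round(one_tailed_inexp, 3))  # 0.089 0.045
print(round(two_tailed_exp, 3))  # 0.177
```

For Z = 1.70 this gives a two-tailed p of about .089 and a one-tailed p of about .045, which is why the comparison for inexperienced raters is described as only approaching significance. For Z = 1.35 the two-tailed p is about .18.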
Table III. Kappas, Percentage of Observed Agreement (PO), and Their Clinical Significance for DSM-IV-Assigned Diagnosis and Clinician-Assigned Diagnosis

| Raters | Diagnostic strategy | Kappa | Clinical significance (kappa) | PO | Clinical significance (PO) |
|---|---|---|---|---|---|
| Experienced vs. experienced | Clinician assigned | 0.94 | Excellent | 0.98 | Excellent |
| Experienced vs. experienced | DSM-IV assigned | 0.84 | Excellent | 0.91 | Excellent |
| Inexperienced vs. inexperienced | Clinician assigned | 0.34 | Poor | 0.67 | Poor |
| Inexperienced vs. inexperienced | DSM-IV assigned | 0.59 | Fair | 0.80 | Good |

First, it was shown that the interrater reliability among clinicians making a diagnosis of autism and related PDDs without the use of DSM-IV criteria was overall quite high, although agreement decreased somewhat when the differential diagnosis involved a comparison between autism and other forms of PDD. Differences in professional background among the raters were of little significance. In contrast, differences in clinical experience had a more marked impact on reliability coefficients, with inexperienced raters showing lower rates of agreement, particularly in comparisons between autism and other forms of PDD.

The second question concerns the interrater reliability of the various DSM-IV criteria for autistic disorder. Without exception, the coefficients of agreement for the various criteria fell in the Good to Excellent range of clinical significance. None of the criteria had suboptimal or poor reliability.

The final set of questions addressed what is probably the most important aspect of this study, namely, the extent to which the use of DSM-IV criteria improves reliability of diagnosis when compared with a diagnostic process that makes no use of the criteria (i.e., when the clinician assigns a diagnosis based on overall clinical impressions only). The answer to this question depended on the raters in question. When pairs of experienced raters were involved, there was little difference in the reliability coefficients obtained for DSM-IV-based and clinician-assigned diagnoses. In fact, clinician-assigned coefficients were a little higher, possibly reflecting the fact that experienced clinicians consider a broader range of information than that captured and defined in the DSM-IV definition when making the diagnosis of autism. In contrast, there was a clinically significant improvement in diagnostic reliability when inexperienced raters used the DSM-IV criteria, suggesting that in their case the use of these criteria was beneficial and clearly superior to their overall clinical judgments.
Although the reason why this may have been so is beyond the scope of this study, it is likely that the use of DSM-IV criteria both broadened and structured these clinicians' observations and clinical considerations. This is, at any rate, an opinion commonly voiced by trainees, who can benefit from the structure and guidance provided by DSM-IV criteria. If so, one may say that DSM-IV makes an important contribution to clinical practice.

REFERENCES

American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author.

Buitelaar, J. K., Van der Gaag, R., Klin, A., & Volkmar, F. R. (1999). Exploring the boundaries of pervasive developmental disorder not otherwise specified: Analyses of data from the DSM-IV autistic disorder field trial. Journal of Autism and Developmental Disorders, 29, 33-43.

Cicchetti, D. V., & Heavens, R., Jr. (1981). A computer program for determining the significance of the difference between pairs of independently derived values of Kappa or weighted Kappa. Educational and Psychological Measurement, 41, 189-193.

Cicchetti, D. V., & Sparrow, S. S. (1981). Developing criteria for establishing inter-rater reliability of specific items in a given inventory. American Journal of Mental Deficiency, 86, 127-137.

Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37-46.

Fleiss, J. (1981). Statistical methods for rates and proportions (2nd ed.). New York: Wiley.

Fleiss, J. L., & Cicchetti, D. V. (1978). Inference about weighted Kappa in the non-null case. Applied Psychological Measurement, 2, 113-117.

Fombonne, E. (1998). Epidemiological surveys of autism. In F. R. Volkmar (Ed.), Autism and pervasive developmental disorders (pp. 32-63). Cambridge, UK: Cambridge University Press.

Goodman, R., & Simonoff, E. (1991). Reliability of clinical ratings by trainee child psychiatrists: A research note. Journal of Child Psychology and Psychiatry, 32, 551-555.

Klin, A., Carter, A., Volkmar, F. R., Cohen, D. J., Marans, W. D., & Sparrow, S. S. (1997). Assessment issues in children with autism. In D. J. Cohen & F. R. Volkmar (Eds.), Handbook of autism and pervasive developmental disorders (pp. 411-447). New York: Wiley.

Lord, C., Pickles, A., McLennan, J., Rutter, M., Bregman, M., Folstein, S., Fombonne, E., Leboyer, M., & Minshew, N. (1997). Diagnosing autism: Analyses of data from the Autism Diagnostic Interview. Journal of Autism and Developmental Disorders, 27, 501-517.

Lord, C., Rutter, M., & DiLavore, P. (1996). Autism diagnostic observation schedule-Generic (ADOS-G). Unpublished manuscript. University of Chicago, Chicago, IL.

Lord, C., Rutter, M., & Le Couteur, A. (1994). Autism Diagnostic Interview-Revised: A revised version of a diagnostic interview for caregivers of individuals with possible pervasive developmental disorders. Journal of Autism and Developmental Disorders, 24, 659-685.

Mattison, R., Cantwell, D. P., Russell, A. T., & Will, L. (1979). A comparison of DSM-II and DSM-III in the diagnosis of childhood psychiatric disorders: 2. Inter-rater agreement. Archives of General Psychiatry, 36, 1217-1222.

Perry, A., Veleno, P., & Factor, D. (1998). Inter-rater agreement between direct care staff and psychologists for the diagnosis of autism according to DSM-III, DSM-III-R, and DSM-IV. Journal of Developmental Disabilities, 6, 32-43.

Rutter, M. (1978). Diagnosis and definition of childhood autism. Journal of Autism and Childhood Schizophrenia, 8, 139-161.

Spitzer, R. L., & Siegel, B. (1990). The DSM-III-R field trial of pervasive developmental disorders. Journal of the American Academy of Child and Adolescent Psychiatry, 29, 855-862.

Spitzer, R. L., & Williams, J. B. (1988). Having a dream: A research strategy for DSM-IV. Archives of General Psychiatry, 45, 871-874.

Szatmari, P. (1992). A review of the DSM-III-R criteria for autistic disorder. Journal of Autism and Developmental Disorders, 22, 507-524.

Volkmar, F. R., Cicchetti, D. V., Bregman, J., & Cohen, D. J. (1992). Developmental aspects of DSM-III-R criteria for autism. Journal of Autism and Developmental Disorders, 22, 657-662.

Volkmar, F. R., Klin, A., & Cohen, D. J. (1997). Diagnosis and classification of autism and related conditions: Consensus and issues. In D. J. Cohen & F. R. Volkmar (Eds.), Handbook of autism and pervasive developmental disorders (2nd ed., pp. 5-40). New York: Wiley.

Volkmar, F. R., Klin, A., Siegel, B., Szatmari, P., Lord, C., Campbell, M., Freeman, B. J., Cicchetti, D. V., Rutter, M., Kline, W., Buitelaar, J., Hattab, Y., Fombonne, E., Fuentes, J., Werry, J., Stone, W., Kerbeshian, J., Hoshino, Y., Bregman, J., Loveland, K., Szymanski, L., & Towbin, K. (1994). DSM-IV autism/pervasive developmental disorder field trial. American Journal of Psychiatry, 151, 1361-1367.

Volkmar, F. R., & Rutter, M. (1995). Childhood disintegrative disorder: Results of the DSM-IV autism field trial. Journal of the American Academy of Child and Adolescent Psychiatry, 34, 1092-1095.

World Health Organization. (1993). International classification of diseases (10th rev.), chap. 5: Mental and behavioral disorders (including disorders of psychological development). Diagnostic criteria for research. Geneva: Author.