A longitudinal study of self-assessment accuracy

The teaching environment

James T Fitzgerald, Casey B White & Larry D Gruppen

Aim
Although studies have examined medical students' ability to self-assess their performance, there are few longitudinal studies that document the stability of self-assessment accuracy over time. This study compares actual and estimated examination performance for three classes during their first 3 years of medical school.

Methods
Students assessed their performance on classroom examinations and objective structured clinical examination (OSCE) stations. Each self-assessment was then contrasted with their actual performance using idiographic (within-subject) methods to define three measures of self-assessment accuracy: bias (arithmetic differences of actual and estimated scores), deviation (absolute differences of actual and estimated scores), and covariation (correlation of actual and estimated scores). These measures were computed for four intervals over the course of 3 years. Multivariate analyses of variance and correlational analyses were used to evaluate the stability of these measures.

Results
Self-assessment accuracy measures were relatively stable over the first 2 years of medical school with a decrease occurring in the third year. However, the correlational analyses indicated that the stability of self-assessment accuracy was comparable to the stability of actual performance over this same period.

Conclusion
The apparent decline in accuracy in the third year may reflect the transition from familiar classroom-based examinations to the substantially different clinical examination tasks of the third year OSCE. However, the stability of self-assessment accuracy compares favorably with the stability of actual performance over this period. These results suggest that self-assessment accuracy is a relatively stable individual characteristic that may be influenced by task familiarity.

Keywords: clinical competence, *standards; education, medical, *standards, *methods; educational measurement, *standards; longitudinal studies; reproducibility of results; self-concept.

Medical Education 2003;37:645–649

Department of Medical Education, University of Michigan Medical School, Ann Arbor, Michigan, USA

Correspondence: James T. Fitzgerald, University of Michigan Medical School, Department of Medical Education, Towsley Center, Room 1200, Box 0201, Ann Arbor, Michigan 48109-0201, USA. Tel: +1 734 763 1153, Fax: +1 734 936 1641, E-mail: tfitz@umich.edu

Introduction

Accurate, career-long self-assessment of knowledge and skills is essential for physicians to maintain and improve their medical proficiency through self-directed education. Physicians who cannot accurately self-assess their knowledge and skills may be at greater risk for providing suboptimal care to patients.

The body of research on medical student self-assessment is less than would be expected, given the significance of this phenomenon.[1] However, existing studies suggest that there is a developmental component in medical students' ability to evaluate themselves and peers that lags behind their ability to perform specific skills.[2] As suggested by findings that the accuracy of students' self-assessment skills increases slightly over the course of education,[3] self-assessment ability may be modifiable by education. However, even if self-assessment is a learnable or modifiable skill, it appears likely that much of this learning has taken place in childhood and that by the time students enter medical school it is largely fixed.[4] The limited evidence of improvement in self-assessment skills during medical education may reflect the relatively stable character of adult self-assessment or it may reflect the fact that students receive little practice in self-assessment.

Since 1995, we have conducted a series of self-assessment studies in which we established methods for measuring self-assessment using intraindividual analysis.
Intraindividual analysis enables us to characterize the accuracy of individual students, as opposed to an interindividual analysis, which produces group-level estimates of accuracy. We have used these measures to address the analytical problems recently described by Ward et al.[5] and to understand better the components of medical student self-assessment.[6–8] Our studies indicate that self-assessment accuracy is not related to demographic (gender and ethnicity) or academic variables (academic performance and academic preparation).[9] Some of our preliminary investigations have also suggested that medical students' self-assessment abilities are stable over short periods of time[6] and over task.[10]

Key learning points

Practising physicians need to assess their knowledge and skills accurately to maintain their medical proficiency through self-directed learning.

Medical student self-assessment accuracy appears to be influenced by task familiarity; the more familiar the task, the more accurate the self-assessment.

However, medical student self-assessment accuracy is reasonably stable over time and task when compared with the stability of actual performance, supporting the notion that self-assessment is a stable characteristic.

The results also demonstrate the value of an intraindividual methodology (as opposed to a group-level analysis) for studying self-assessment.

Our long-term goals have included acquiring a better understanding of self-assessment in order to help medical students grasp its importance to themselves and to their patients, to provide them with practice during medical school and to develop an intervention that might assist those with poor self-assessment abilities. To achieve these goals, it is critical to determine how stable self-assessment abilities are over time. Unless there is evidence that self-assessment accuracy is a relatively stable, consistent characteristic rather than a purely situational phenomenon, there is little point in considering educational interventions.

The focus of this study is to evaluate the temporal stability of medical students' self-assessment accuracy.
By comparing three medical school classes' examination performance and self-estimates of this performance, we examined stability of medical student self-assessment accuracy from the first year through the third year of medical school.

Methods

The University of Michigan Medical School graduating classes of 1999 (n = 163), 2000 (n = 169) and 2001 (n = 168) were asked to provide estimates of their performance after completing each examination, quiz and lab examination in their M1 winter term, M2 autumn term and M2 winter term. For the class of 1999, 22 self-estimates were obtained for each student in the M1 winter term, eight in the M2 autumn term and 17 in the M2 winter term. For the class of 2000, 18 self-estimates were obtained for each student in the M1 winter term, 19 in the M2 autumn term and 16 in the M2 winter term. For the class of 2001, 18 self-estimates were obtained for each student in the M1 winter term, 18 in the M2 autumn term and 16 in the M2 winter term. During these terms, students were evaluated primarily on cognitive tasks, i.e. multiple-choice quizzes, labs and examinations. These self-estimates were provided on the same percentage correct scale used for quantifying their actual performance (0–100%).

At the end of the third year, after completing their required clinical rotations, students took a multiple-station objective-structured clinical exam (OSCE). They were asked to estimate their performance on each of the stations on a percentage correct scale. There were 10 stations for the class of 1999 and 13 stations for the two subsequent classes. The OSCE stations were primarily performance-based tasks, i.e. demonstrations of clinical skill, and differed from the classroom-based knowledge assessment format (predominately multiple-choice questions) in the first 2 years of medical school.

Self-assessment accuracy was quantified in three idiographic (intraindividual) variables developed in our previous work.
The bias index is the average difference between each student's estimated performance ($x_e$) and actual score ($x_a$) over a series of $n$ observations:

$$\frac{\sum (x_e - x_a)}{n}$$

This index provides information about the extent to which, on average, students over- or underestimated their performance and by how much. The deviation index is calculated as the average absolute deviation of the estimated score ($x_e$) from the actual score ($x_a$) over $n$ observations:

$$\frac{\sum \lvert x_e - x_a \rvert}{n}$$

In contrast to the bias index, which allows over- and underestimates to cancel out, the deviation index summarizes how far a student's estimates deviate from actual performance. The covariation index assesses the correlation between a student's estimated and actual performances over the $n$ observations, i.e. the extent to which variations in a student's estimates parallel variations in actual performance. Note that the covariation is not influenced by differences between the values of the estimated and actual scores (i.e. bias or deviation scores). We used the Pearson correlation coefficient to quantify covariation.

Statistical methods

Gender and under-represented minority distributions for the three classes were examined using chi-square tests. Average Medical College Admission Test (MCAT) scores for the three classes were examined using analysis of variance.

In consideration of the two different operationalizations of stability, the stability of the three self-assessment measures was evaluated in two ways. In the first method, stability of student self-assessment accuracy over the 3-year time frame was examined by a multivariate analysis of variance with repeated measures. These analyses examined the magnitude of self-assessment accuracy over the four intervals. The second method examined the correlations of each self-assessment accuracy measure between consecutive periods. Only students who had non-missing values for all the periods were used in the analyses. Pearson product moment correlation was used to evaluate the relationship between period pairs. These analyses examined the extent to which students with relatively high or low levels of self-assessment accuracy in one period were still high or low in the following period.

We treated the data, both analytically and conceptually, in an idiographic (individualized) manner, rather than a more traditional nomothetic (group-based) manner. In other words, each student's data (actual and self-assessed performance) were used to define the self-assessment accuracy of that student. All analyses were done on a 'within-subject' rather than 'between-subject' basis. Group-based outcomes were obtained by averaging individual results.
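
As a minimal sketch of the three indices and of the within-subject approach described above, the Python below computes bias, deviation and covariation separately for each student and then averages the individual values to obtain a group-level summary. The student names and scores are hypothetical, and the function names are illustrative rather than taken from the study.

```python
from math import sqrt
from statistics import mean


def bias_index(estimated, actual):
    """Mean signed difference (estimate minus actual); negative means underestimation."""
    return mean(e - a for e, a in zip(estimated, actual))


def deviation_index(estimated, actual):
    """Mean absolute difference between estimated and actual scores."""
    return mean(abs(e - a) for e, a in zip(estimated, actual))


def covariation_index(estimated, actual):
    """Pearson correlation between one student's estimated and actual scores."""
    mean_e, mean_a = mean(estimated), mean(actual)
    sxy = sum((e - mean_e) * (a - mean_a) for e, a in zip(estimated, actual))
    sxx = sum((e - mean_e) ** 2 for e in estimated)
    syy = sum((a - mean_a) ** 2 for a in actual)
    if sxx == 0 or syy == 0:
        return None  # correlation is undefined when either series is constant
    return sxy / sqrt(sxx * syy)


# Hypothetical percentage-correct scores for two students on four examinations.
students = {
    "student_a": {"estimated": [80, 85, 90, 75], "actual": [85, 88, 92, 80]},
    "student_b": {"estimated": [90, 70, 85, 95], "actual": [85, 75, 80, 90]},
}

# One set of indices per student (idiographic analysis).
per_student = {
    name: {
        "bias": bias_index(s["estimated"], s["actual"]),
        "deviation": deviation_index(s["estimated"], s["actual"]),
        "covariation": covariation_index(s["estimated"], s["actual"]),
    }
    for name, s in students.items()
}

# Group-level outcomes are the averages of the individual results.
group_bias = mean(v["bias"] for v in per_student.values())
group_deviation = mean(v["deviation"] for v in per_student.values())
```

With these made-up scores, student_a has a bias of −3.75 (underestimation) and a deviation of 3.75, while student_b has a bias of 2.5 but a deviation of 5, because one underestimate partly cancels the overestimates in the bias index but not in the deviation index.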
As an illustration of this within-subject approach, rather than computing 22 correlation coefficients between actual and self-assessed performance on the 22 examinations or tests in the M1 winter term for the 155 students in the 1999 class, we computed 155 individual correlations, one for each student over the 22 examinations. The resulting individual correlations indicated the strength of the covariation between self-assessed performance and actual score for each student, rather than a group-based correlation that does not provide individuating information.

Results

Demographics

Demographic comparisons of the three classes are presented in Table 1. There were no statistically significant differences in the percentage of women, the percentage of minorities and the average MCAT scores among the three classes.

Table 1 Medical student demographics

| | Class 1999 (n = 163) | Class 2000 (n = 169) | Class 2001 (n = 168) |
| Women | 42% | 38% | 40% |
| Under-represented minority (95% CI) | 17% (14.1 to 19.9) | 17% (14.1 to 19.9) | 14% (11.3 to 16.7) |
| Medical college admission test score average (95% CI) | 10.7 (10.4 to 11.0) | 11.1 (10.8 to 11.3) | 11.1 (10.9 to 11.3) |

Repeated measures

The multivariate repeated measures analysis of variance indicated that all three self-assessment accuracy measures, self-estimates and performance scores changed over the course of the study (see Table 2). The bias scores were negative (indicating an underestimation of actual performance on average) for the first three periods, but became positive in the M3 year. This indicated that, on average, the students overestimated their performance on the OSCE. The greatest change for this measure occurred in the M3 year. Change patterns in the deviation and the covariation values were similar to the bias measure. In the first three periods, scores were relatively consistent, but in the M3 year, the deviation score increased from 7.8 to 12.9 while the mean covariation score decreased from 0.36 to 0.26. The same pattern of change also described the actual self-estimates students provided and the actual performance, both of which showed a decrease in the M3 year.

Table 2 Means for performance, performance estimates, and self-assessment accuracy measures for each assessment period

| Self-assessment measure | n | M1 winter term | M2 autumn term | M2 winter term | M3 OSCE |
| Bias (arithmetic differ.) | 343 | −2.8 (−3.6 to −2.1) | −2.7 (−3.3 to −2.1) | −2.2 (−2.9 to −1.4) | 1.6 (0.7 to 2.5) |
| Deviation (absolute differ.) | 343 | 7.8 (7.2 to 8.3) | 7.5 (7.1 to 8.0) | 7.8 (7.3 to 8.3) | 12.9 (12.5 to 13.4) |
| Actual-est. covariation | 297 | 0.41 (0.38 to 0.44) | 0.37 (0.34 to 0.41) | 0.36 (0.32 to 0.40) | 0.26 (0.22 to 0.29) |
| Self-estimates | 343 | 82.9 (82.2 to 83.5) | 86.2 (85.4 to 86.9) | 85.8 (85.1 to 86.6) | 79.6 (78.9 to 80.3) |
| Actual performance scores | 388 | 85.2 (84.6 to 85.7) | 88.3 (87.8 to 88.8) | 87.4 (86.9 to 87.9) | 77.8 (77.2 to 78.4) |

Correlation between consecutive periods

The correlations between contiguous periods on the three self-assessment accuracy measures indicated that the bias and deviation measures had a similar pattern (Table 3). For both of these measures, the relative stability of students' self-assessment accuracy was moderately high from one period to the next in the first 2 years of medical school, with correlation values ranging from 0.46 to 0.69. However, the correlations between the M2 winter and M3 OSCE periods were substantially lower.

Table 3 Correlations between succeeding assessment periods (correlation, 95% CI)

| Self-assessment measure | n | M1 winter & M2 autumn | M2 autumn & M2 winter | M2 winter & M3 OSCE |
| Bias | 343 | 0.63 (0.56 to 0.69) | 0.69 (0.63 to 0.74) | 0.42 (0.33 to 0.50) |
| Deviation | 343 | 0.46 (0.37 to 0.54) | 0.55 (0.47 to 0.62) | 0.12 (0.01 to 0.22) |
| Covariation | 297 | 0.00 (−0.11 to 0.11) | 0.07 (−0.04 to 0.18) | 0.03 (−0.08 to 0.14) |
| Self-estimates | 343 | 0.81 (0.77 to 0.84) | 0.79 (0.75 to 0.83) | 0.36 (0.26 to 0.45) |
| Actual performance scores | 388 | 0.60 (0.53 to 0.66) | 0.70 (0.65 to 0.75) | 0.28 (0.19 to 0.37) |
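
The correlations between consecutive periods can be sketched in Python as follows, assuming each student already has one index value (here, bias) per assessment period. Following the description in the statistical methods, only students with non-missing values in every period are kept. The student identifiers and values are hypothetical, and the squared correlation is included only as the conventional "proportion of variance accounted for"; this is an illustration, not the authors' analysis code.

```python
from math import sqrt

PERIODS = ["M1 winter", "M2 autumn", "M2 winter", "M3 OSCE"]


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


def consecutive_stability(index_by_student):
    """Correlate each period with the next, using complete cases only."""
    complete = [
        values
        for values in index_by_student.values()
        if all(v is not None for v in values)
    ]
    results = {}
    for i in range(len(PERIODS) - 1):
        earlier = [values[i] for values in complete]
        later = [values[i + 1] for values in complete]
        r = pearson(earlier, later)
        results[(PERIODS[i], PERIODS[i + 1])] = {
            "n": len(complete),
            "r": r,
            "r_squared": r ** 2,  # share of variance accounted for
        }
    return results


# Hypothetical bias index per student for the four periods (None = missing).
bias_by_student = {
    "s1": [-4.0, -3.0, -2.5, 1.0],
    "s2": [-2.0, -1.0, -1.5, 3.0],
    "s3": [0.0, 1.0, 0.5, -1.0],
    "s4": [-6.0, -5.0, -4.0, 2.0],
    "s5": [-1.0, None, 0.0, 4.0],  # dropped: one period is missing
}

stability = consecutive_stability(bias_by_student)
for pair, stats in stability.items():
    print(pair, stats)
```

Dropping incomplete cases before computing any pair keeps the same set of students in all three correlations, at the cost of a smaller sample than a pairwise approach would give.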
The relative stability of the covariation measure was essentially zero between any contiguous periods. Like the bias and deviation self-assessment measures, the correlations between contiguous periods of actual performance in the first 2 years of medical school are moderately high (0.60 and 0.70), but diminish to 0.28 during the transition to the clinical context. Similarly, the correlations of the self-estimated scores and of the students' actual performance decrease in the same pattern over this period.

Discussion

The means for the performance and self-assessment accuracy measures reflect a fairly high level of stability during the first three assessment periods. However, when self-assessment is required on a different type of task (the third year OSCE), both student performance and self-assessment scores change. For the first time, students overestimated their performance. The increase in the deviation score and the decrease in the covariation score suggest that self-assessment has become less accurate, which in turn suggests that the type of task or task experience might play a role in making self-assessment judgements.

Although the mean values of the self-assessment accuracy measures changed over time and task, another perspective on the stability of self-assessment accuracy from one time period to the next is reflected in the correlations between assessment periods. The correlations between the M1 and M2 periods vary around 0.65 for the bias measure and 0.50 for the deviation measure (Table 3), accounting for approximately 42% and 25%, respectively, of the variance in scores in the subsequent period (the squared correlations: 0.65² ≈ 0.42 and 0.50² = 0.25). When compared with the stability of actual performance between the same time periods (correlations of approximately 0.65, accounting for approximately 42% of the variance), it is apparent that the stability of self-assessment is similar to the stability of the actual target performance.

The lack of a correlation among the consecutive pairs of the covariation measure likely reflects the relatively low reliability of this measure as calculated for any given time period.
Because this correlation of an individual's actual and estimated scores is based on a relatively small number of observations (e.g. 8–22 for a given term and 10–13 in the M3 OSCE), each student's value on this measure is not likely to be very precise or reliable. Thus, the low correlation between consecutive terms may be attenuated by the low reliability of this measure. If so, this may be evidence that this particular measure of self-assessment accuracy is not a very useful indicator of self-assessment accuracy, in spite of the fact that it quantifies an aspect that is distinct from those summarized in the bias and deviation measures.

The other noteworthy pattern in these results is the decrement in correlation magnitude between the M2 winter and M3 OSCE periods. This decline may be the result of the contrast in tasks reflected in the two periods, such that the classroom-based knowledge assessments of the M2 year do not predict performance or self-assessment accuracy in the clinical performance tasks represented by the OSCE. Note, however, that the same decline in the magnitude of the correlation over this period occurs for actual performance. In fact, the stability of self-assessment accuracy (as indicated by the bias and deviation measures) is at least as great as that of actual student performance. Results of this study indicate that medical student self-assessment accuracy is reasonably stable when compared with the stability of actual performance.

There may be multiple explanations for the decline in self-assessment accuracy and actual performance between the classroom assessments of knowledge (in the first 2 years) and the clinical assessments of diagnostic and procedural skills (in the OSCE). One is task familiarity. Students who enter medical school have spent years taking paper and pencil examinations. When the task is one in which the students have had limited experience, self-assessment accuracy suffers, as does performance.

An alternative explanation may be that self-assessing one's knowledge (as in the M1 and M2 assessments) is a different process from self-assessing one's performance (as in the OSCE). It may be that self-assessment of knowledge requires dimensions and information that are different from those required in the self-assessment of performance.
This judgement process has many dimensions, including the degree to which an individual understands the task requirements, the accessibility of the targeted competencies to conscious judgement, the evaluation of one's personal skills and resources and past performance on similar tasks. The changing nature of the tasks and the corresponding self-assessment judgements is consistent with the lack of stability in actual performance in these tasks. Performance in the first 2 years predicts only 8% of the variance of performance in the clinical years.

The finding that self-assessment is a reasonably stable characteristic of medical students is a prerequisite for further study of this phenomenon. If self-assessment accuracy had proven to be entirely dependent on task and situation, the search for a conceptual model of self-assessment, and for pragmatic educational interventions, would become a much more complex endeavour. These results also demonstrate the utility of an idiographic, or intraindividual, methodology for studying self-assessment. The focus on individual students and their peculiar strengths and weaknesses constitutes the next stage of research in better understanding the nature and operation of self-assessment in medical education.

Acknowledgements

This research was supported, in part, through a grant from the National Board of Medical Examiners' Stemmler Fund.

References

1. Gordon MJ. A review of the validity and accuracy of self-assessments in health professions training. Acad Med 1991;66:762–9.
2. Calhoun JG, Woolliscroft JO, Hockman EM, Wolf FM, Davis WK. Evaluating medical student clinical skill performance: Relationships among self, peer, and expert ratings. Proceedings of the 23rd Annual Conference on Research in Medical Education. Res Med Educ 1984;23:205–10.
3. Arnold L, Willoughby TL, Calkins EV. Self-evaluation in undergraduate medical education: a longitudinal perspective. J Med Educ 1985;60:8.
4. Fitzgerald JT, Gruppen LD, White BA, Davis WK. Medical student self-assessment abilities: accuracy and calibration. Presented at the Annual Meeting of the American Educational Research Association, 1997: American Educational Research Association, Chicago, IL.
5. Ward M, Gruppen L, Regehr G. Measuring self-assessment: Current state of the art. Adv Health Sci Educ 2002.
6. Fitzgerald JT, Gruppen LD, White C. The stability of student self-assessment accuracy. Presented at the 37th Annual Conference on Research in Medical Education. Res Med Educ 1998;21.
7. Gruppen LD, Garcia J, Grum CM. Medical students' self-assessment accuracy in communication skills. Acad Med 1997;72(1 Suppl.):S57–9.
8. Gruppen LD, White C, Fitzgerald JT, Grum CM, Woolliscroft JO. Medical students' self-assessments and their allocations of learning time. Acad Med 2000;75:374–9.
9. Gruppen LD, Baliga S, Fitzgerald JT, White C, Grum CM, Woolliscroft JO, Davis WK. Do personal characteristics and background influence self-assessment accuracy? Presented at the Eighth Ottawa Conference on Medical Education 1998, Philadelphia, PA.
10. Fitzgerald JT, Gruppen LD, White C. The influence of task formats on the accuracy of medical student self-assessment. Acad Med 2000;75:737–41.

Received 24 July 2002; editorial comments to authors 3 October 2002; accepted for publication 10 December 2002