THE PSYCHOMETRICS OF PERFORMANCE ASSESSMENTS: A PHYSICIAN'S PERSPECTIVE

1 THE PSYCHOMETRICS OF PERFORMANCE ASSESSMENTS: A PHYSICIAN'S PERSPECTIVE. Jeanne M. Sandella, DO

2 Objectives: Discuss the reliability of performance assessments: why are they used, and what are the threats to having a good assessment? Identify the impact of human raters on the psychometrics of performance assessments. Define the examinee-centered approach for setting standards for performance assessments.

3 Why a physician's perspective? $y = \frac{\log_e\left(\frac{x}{m} - sa\right)}{r^2}$; $yr^2 = \log_e\left(\frac{x}{m} - sa\right)$; $e^{yr^2} = \frac{x}{m} - sa$; $me^{yr^2} = x - sma$; $me^{rry} = x - mas$

4 What makes a good assessment? (Kane's framework) 1. Scoring 2. Generalization 3. Extrapolation 4. Interpretation/decision

5 1. Scoring: Fair administration. Individuals were evaluated accurately. Scoring rules were consistently applied.

6 Examination administration: Fair administration

7 Accurate evaluations: Scoring. Checklists measure an explicit process (H and P); evidence-based development by experts. Rubrics measure implicit processes; holistic; require significant expertise/training.

8 Human raters: are they consistently applying rules? Training of raters (physician and SP). Score equating/calibration. Quality assurance of raters.
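
One way to check whether two raters apply the scoring rules consistently is a chance-corrected agreement statistic. The slides do not name a statistic, so the sketch below uses Cohen's kappa as an assumed illustration; the function name `cohens_kappa` and the physician and SP ratings are hypothetical, not data from the presentation.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters scoring the same encounters."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of encounters on which the two raters gave the same rating.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal rating frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings of the same eight encounters (illustrative only).
physician = ["pass", "pass", "fail", "pass", "fail", "pass", "pass", "fail"]
sp = ["pass", "pass", "fail", "fail", "fail", "pass", "pass", "pass"]

kappa = cohens_kappa(physician, sp)
print(f"observed agreement = 0.75, kappa = {kappa:.3f}")
```

Here the raters agree on 6 of 8 encounters, but because both pass most candidates, much of that agreement could arise by chance, which is why kappa is noticeably lower than the raw agreement.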

9 Human raters

10 2. Generalization: Good sampling of observations. Enough samples.

11 Generalization: factors contributing to errors. Heterogeneity of candidates

12 Generalization: factors contributing to errors. Number of cases
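
The effect of the number of cases on score reliability can be illustrated with the Spearman-Brown prophecy formula. This is an assumed illustration, not a method named in the slides; it treats cases as parallel measurements, and the single-case reliability of 0.20 is a hypothetical value.

```python
def spearman_brown(single_case_reliability, n_cases):
    """Projected reliability of a score averaged over n_cases parallel cases."""
    r = single_case_reliability
    return n_cases * r / (1 + (n_cases - 1) * r)


# Hypothetical single-case reliability (illustrative only).
r_single = 0.20
for n in (1, 6, 12, 16):
    print(n, round(spearman_brown(r_single, n), 2))
```

Under these assumptions, reliability rises quickly over the first few added cases and then levels off, which is one reason an examination needs enough cases but gains less from each additional one.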

13 Generalize results: adequate sampling. [Grid of Cases 1 to X (rows) by Judges 1 to Y (columns) marking three sampling designs, A, B and C. Case 1 row: A B C A A A A A. Rows for Cases 2 to X: B C.]

14 3. Extrapolation: Observations are relevant to the construct of interest. Scores are not unduly influenced by sources of variance not related to the construct being measured.

15

16 4. Interpretation/decision: Framework for score interpretation can be supported. Categorization is supported.

17 Standard setting: What do the scores mean? E.g., Pass/Fail

18 Standard Setting Anatomy: A. Standard setting method B. Defining a performance standard C. Deriving the cut point D. Finalization of cut score

19 Standard Setting Method: Examinee-Centered Method. Panelists make independent judgments of "qualified" or "not qualified" performance on the clinical skill of interest by reviewing actual or proxy performance on the examination.

20 Defining a performance standard: Not qualified / Qualified

21 Deriving a cut score [Figure: "Prop = Q" plotted against the score scale]
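
A minimal sketch of how a cut score might be derived from examinee-centered judgments follows. It assumes that "Prop = Q" refers to the proportion of panelist judgments rating a performance as qualified, and it places the cut at the lowest score where that proportion reaches 0.5. Both the rule and the data are assumptions for illustration, not the procedure described in the presentation, and the function name `cut_score_from_judgments` is hypothetical.

```python
def cut_score_from_judgments(judgments, threshold=0.5):
    """Lowest score at which the proportion judged 'qualified' reaches threshold.

    judgments: iterable of (score, qualified) pairs, one per panelist judgment.
    Returns None if no score level reaches the threshold.
    """
    tallies = {}  # score -> (number of judgments, number judged qualified)
    for score, qualified in judgments:
        total, yes = tallies.get(score, (0, 0))
        tallies[score] = (total + 1, yes + int(qualified))
    for score in sorted(tallies):
        total, yes = tallies[score]
        if yes / total >= threshold:
            return score
    return None


# Hypothetical judgments: three panelists each review one performance per score level.
judgments = [
    (50, False), (50, False), (50, False),
    (60, True), (60, False), (60, False),
    (70, True), (70, True), (70, False),
    (80, True), (80, True), (80, True),
]

cut = cut_score_from_judgments(judgments)
print(cut)
```

Real panel data are rarely this tidy; in practice the proportions are usually smoothed (for example, with a fitted curve) before the cut point is read off the score scale.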

22 Finalization of cut score: NBOME Executive Committee; stakeholder surveys; triangulation; expert panelists

23 Factors contributing to classification errors: the cut score. Higher cut score = more false negatives. Lower cut score = more false positives.

24 Objectives: Discuss the reliability of performance assessments: why are they used, and what are the threats to having a good assessment? Identify the impact of human raters on the psychometrics of performance assessments. Define the examinee-centered approach for setting standards for performance assessments.
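
The trade-off between false positives and false negatives can be seen by counting both at several cut scores. In the sketch below, the examinee scores and "truly competent" labels are hypothetical, and a false positive means a not-competent examinee who passes while a false negative means a competent examinee who fails.

```python
def classification_errors(examinees, cut_score):
    """Count (false_positives, false_negatives) for a given cut score.

    examinees: list of (observed_score, truly_competent) pairs.
    An examinee passes when observed_score >= cut_score.
    """
    false_pos = sum(1 for score, ok in examinees if score >= cut_score and not ok)
    false_neg = sum(1 for score, ok in examinees if score < cut_score and ok)
    return false_pos, false_neg


# Hypothetical examinees (illustrative only).
examinees = [
    (55, False), (62, False), (66, True), (68, False), (71, True),
    (74, True), (78, False), (83, True), (90, True),
]

for cut in (60, 70, 80):
    fp, fn = classification_errors(examinees, cut)
    print(f"cut={cut}: false positives={fp}, false negatives={fn}")
```

With these data, the low cut score passes three examinees who are not competent, the high cut score fails three who are, and the middle cut score produces one error of each kind.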

26 THANK YOU.


More information

SUMMARY REPORT 1. EXECUTIVE SUMMARY. Program provider. University of Sydney

SUMMARY REPORT 1. EXECUTIVE SUMMARY. Program provider. University of Sydney AUSTRALIAN DENTAL COUNCIL REPORT OF AN EVALUATION OF UNIVERSITY OF SYDNEY DOCTOR OF CLINCAL DENTISTRY PROGRAMS IN: ORAL MEDICINE ORTHODONTICS PAEDIATRIC DENTISTRY PERIODONTICS PROSTHODONTICS SPECIAL CARE

More information

STATISTICAL CONCLUSION VALIDITY

STATISTICAL CONCLUSION VALIDITY Validity 1 The attached checklist can help when one is evaluating the threats to validity of a study. VALIDITY CHECKLIST Recall that these types are only illustrative. There are many more. INTERNAL VALIDITY

More information

Performance Evaluation Tool

Performance Evaluation Tool FSBPT Performance Evaluation Tool Foreign Educated Therapists Completing a Supervised Clinical Practice The information contained in this document is proprietary and not to be shared elsewhere. Contents

More information

Measuring and Assessing Study Quality

Measuring and Assessing Study Quality Measuring and Assessing Study Quality Jeff Valentine, PhD Co-Chair, Campbell Collaboration Training Group & Associate Professor, College of Education and Human Development, University of Louisville Why

More information

Reliability Study of ACTFL OPIc in Spanish, English, and Arabic for the ACE Review

Reliability Study of ACTFL OPIc in Spanish, English, and Arabic for the ACE Review Reliability Study of ACTFL OPIc in Spanish, English, and Arabic for the ACE Review Prepared for: American Council on the Teaching of Foreign Languages (ACTFL) White Plains, NY Prepared by SWA Consulting

More information

On the same page? The effect of GP examiner feedback on differences in rating severity in clinical assessments: a pre/post intervention study

On the same page? The effect of GP examiner feedback on differences in rating severity in clinical assessments: a pre/post intervention study Sturman et al. BMC Medical Education (2017) 17:101 DOI 10.1186/s12909-017-0929-9 RESEARCH ARTICLE On the same page? The effect of GP examiner feedback on differences in rating severity in clinical assessments:

More information

AOBEM Part I Certification Examination Public Reporting

AOBEM Part I Certification Examination Public Reporting AOBEM Part I Certification Examination Public Reporting 2011-2016 August 24, 2017 Authored by the NBOME 1 Executive Summary Passing the American Osteopathic Board of Emergency Medicine (AOBEM) Part I Certification

More information

Understanding the influence of different cycles of an OSCE exam on students scores using Many Facet Rasch Modelling.

Understanding the influence of different cycles of an OSCE exam on students scores using Many Facet Rasch Modelling. Hawks, Doves and Rasch decisions Understanding the influence of different cycles of an OSCE exam on students scores using Many Facet Rasch Modelling. Authors: Peter Yeates Stefanie S. Sebok-Syer Corresponding

More information

The Public Health Approach to Palliative Care

The Public Health Approach to Palliative Care The Public Health Approach to Palliative Care Principles, Models, and International Perspectives A White Paper for The BC Centre for Palliative Care August 2015 Prepared by: Eman Hassan MD. MPH. PhDc QualiHealth

More information

Comparison of Alternative Scoring Methods for a. Computerized Performance Assessment of Clinical Judgment. Polina Harik, Brian Clauser, Peter Baldwin

Comparison of Alternative Scoring Methods for a. Computerized Performance Assessment of Clinical Judgment. Polina Harik, Brian Clauser, Peter Baldwin Comparison of Alternative Scoring Methods for a Computerized Performance Assessment of Clinical Judgment Polina Harik, Brian Clauser, Peter Baldwin Concurrent growth in computer-based testing and the use

More information

Patient-Centered Measurement: Innovation Challenge Series

Patient-Centered Measurement: Innovation Challenge Series Patient-Centered Measurement: Innovation Challenge Series Learning Collaborative 2018 Webinar Thursday, March 1, 2018 1 Hala Durrah, MTA Patient Family Engagement Consultant, Speaker & Advocate 3 2017-18

More information

Agreement Coefficients and Statistical Inference

Agreement Coefficients and Statistical Inference CHAPTER Agreement Coefficients and Statistical Inference OBJECTIVE This chapter describes several approaches for evaluating the precision associated with the inter-rater reliability coefficients of the

More information

Development, administration, and validity evidence of a subspecialty preparatory test toward licensure: a pilot study

Development, administration, and validity evidence of a subspecialty preparatory test toward licensure: a pilot study Johnson et al. BMC Medical Education (2018) 18:176 https://doi.org/10.1186/s12909-018-1294-z RESEARCH ARTICLE Open Access Development, administration, and validity evidence of a subspecialty preparatory

More information

the metric of medical education

the metric of medical education the metric of medical education Validity threats: overcoming interference with proposed interpretations of assessment data Steven M Downing 1 & Thomas M Haladyna 2 CONTEXT Factors that interfere with the

More information

A general treatment approach

A general treatment approach Chapter 2 A general treatment approach Using the rubric of evidence-based medicine Evidence-based medicine (EBM) is not just about the evidence, but how to use it in a meaningful way [1]; practicing EBM

More information

Minnesota Region V State Report. September 27 & 28, 2010 Indianapolis, Indiana

Minnesota Region V State Report. September 27 & 28, 2010 Indianapolis, Indiana Minnesota Region V State Report September 27 & 28, 2010 Indianapolis, Indiana Minnesota ASD Prevelance Similar Trends As With Other States National Survey of Children with Special Health Care Needs approximately

More information

AnExaminationoftheQualityand UtilityofInterviewerEstimatesof HouseholdCharacteristicsinthe NationalSurveyofFamilyGrowth. BradyWest

AnExaminationoftheQualityand UtilityofInterviewerEstimatesof HouseholdCharacteristicsinthe NationalSurveyofFamilyGrowth. BradyWest AnExaminationoftheQualityand UtilityofInterviewerEstimatesof HouseholdCharacteristicsinthe NationalSurveyofFamilyGrowth BradyWest An Examination of the Quality and Utility of Interviewer Estimates of Household

More information

Addendum Valorization paragraph

Addendum Valorization paragraph Addendum Valorization paragraph 171 1. (Relevance) What is the social (and/or economic) relevance of your research results (i.e. in addition to the scientific relevance)? Variability in ratings, human

More information

By Hui Bian Office for Faculty Excellence

By Hui Bian Office for Faculty Excellence By Hui Bian Office for Faculty Excellence 1 Email: bianh@ecu.edu Phone: 328-5428 Location: 1001 Joyner Library, room 1006 Office hours: 8:00am-5:00pm, Monday-Friday 2 Educational tests and regular surveys

More information

RECOMMENDATIONS FOR STANDARDIZATION OF THE MCC 360 SCALE

RECOMMENDATIONS FOR STANDARDIZATION OF THE MCC 360 SCALE RECOMMENDATIONS FOR STANDARDIZATION OF THE MCC 360 SCALE Marguerite Roy, Medical Council of Canada Cindy Streefkerk, Medical Council of Canada 2017 Anticipated changes to the MCC 360 questionnaires At

More information

Combining Dual Scaling with Semi-Structured Interviews to Interpret Rating Differences

Combining Dual Scaling with Semi-Structured Interviews to Interpret Rating Differences A peer-reviewed electronic journal. Copyright is retained by the first or sole author, who grants right of first publication to the Practical Assessment, Research & Evaluation. Permission is granted to

More information

Evaluating Client Satisfaction in Psychiatric Group Homes: A Participatory Stakeholder Approach

Evaluating Client Satisfaction in Psychiatric Group Homes: A Participatory Stakeholder Approach Evaluating Client Satisfaction in Psychiatric Group Homes: A Participatory Stakeholder Approach Myra Piat,, Ph.D. Douglas Hospital & McGill University 6875 Lasalle Blvd, Verdun, Quebec H4H 1R3 myra.piat

More information

Online Annexes (2-4)

Online Annexes (2-4) Online Annexes (2-4) to WHO Policy update: The use of molecular line probe assays for the detection of resistance to isoniazid and rifampicin THE END TB STRATEGY Online Annexes (2-4) to WHO Policy update:

More information

Maintaining performance standards: aligning raw score scales on different tests via a latent trait created by rank-ordering examinees' work

Maintaining performance standards: aligning raw score scales on different tests via a latent trait created by rank-ordering examinees' work Maintaining performance standards: aligning raw score scales on different tests via a latent trait created by rank-ordering examinees' work Tom Bramley & Beth Black Paper presented at the Third International

More information

ASSIGNMENT TYPE QUESTIONS

ASSIGNMENT TYPE QUESTIONS THE SOUTH AFRICAN COUNCIL F THE QUANTITY SURVEYING PROFESSION DEMONSTRATE AN UNDERSTANDING OF PROFESSIONAL ETHICS ASSIGNMENT TYPE QUESTIONS The questions listed below would typically be provided to candidates

More information