Raters, rating and reliability


1 Raters, rating and reliability

2 [Diagram: Test Taker; Context; Cognitive; Response; Scoring Validity; Score/Grade; Consequential Validity; Criterion-Related Validity]

3 How far can we depend on the scores which result from the test? Our focus here is on scoring validity.

4 Scoring validity
- a super-ordinate term for all aspects of reliability (see Chapter 6 in Weigle 2002 and Chapter 5 in Shaw and Weir 2007)
- accounts for the extent to which test scores:
  - are based on appropriate criteria
  - exhibit consensual agreement in marking
  - are as free as possible from measurement error
  - are stable over time
  - are consistent in terms of their content sampling
  - inspire confidence as reliable decision-making indicators

5 Scoring validity
- linked directly to both cognitive and context validity
- test construct as a triangular relationship
- an interactionalist position, which sees the (writing) construct as residing in the interactions between the underlying cognitive ability, the context of use and the process of scoring

6 [Diagram: rating criteria / rating scale; rating procedures; rater selection; rater training; standardisation; rating conditions; moderation; statistical analysis; raters; grading and awarding]

7 [Diagram: Writing task design; Assessment criteria; Validity; Authenticity; Reliability; Impact; Practicality]

8 Validity
- traditionally the most important examination quality
- concerns the appropriateness and meaningfulness of:
  - an exam in a specific educational context
  - specific inferences made from exam results
- task authenticity is an important aspect of validity
Reliability
- contributes to overall validity
- concerns the extent to which test results are stable, consistent, and free from bias and random error

9 Reliability is concerned with minimizing the effects of measurement error, while validity is concerned with maximizing the effects of the language abilities we want to measure (Saville 2003:69). There is a potential tension between validity and reliability in performance assessment.

10 Validity | Reliability
Direct testing | Indirect testing
Variety of task types | Fewer task types
More functions tested | Fewer functions tested
Fewer inferences required | More inferences required
Administration complex | Administration simpler
More positive impact on teaching and learning | Less positive impact on teaching and learning

11 High reliability can be achieved by narrowing the range of task types or the range of skills tested. However, this restricts the interpretations that can be placed on performance in the test, and hence its validity. The key, therefore, is to balance the potential tension between reliability and validity.

12 Task standardisation: specification; trialling; format; timing; length

13
- specification of the content of the assessment
- using pooled judgements to select content
- requiring multiple judgements
- adopting standard procedures
- basing judgements on specific defined criteria
- undertaking appropriate training
- checking validity and reliability by analysing assessment data

14 More test tasks More raters
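The slide's point that more tasks and more raters tend to raise reliability can be sketched with the Spearman-Brown prophecy formula. The slides do not name this formula; it is brought in here as an illustration, and it assumes the added raters (or tasks) are comparable to the existing ones. The function name and the starting reliability of 0.60 are hypothetical.

```python
def spearman_brown(single_reliability: float, k: float) -> float:
    """Project reliability when the number of raters (or tasks) is multiplied by k.

    Assumes the added raters or tasks behave like the existing ones,
    which is an idealisation.
    """
    if not 0.0 <= single_reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    if k <= 0:
        raise ValueError("k must be positive")
    return k * single_reliability / (1.0 + (k - 1.0) * single_reliability)


# Hypothetical starting point: one rater with reliability 0.60.
for raters in (1, 2, 3, 4):
    print(raters, round(spearman_brown(0.60, raters), 3))
```

The projected gain shrinks with each extra rater, which is one reason the number of raters has to be weighed against practicality.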

15 What do you think makes a good rater in terms of:
- knowledge?
- skills?
- qualifications?
- background?
- experience?
What are the minimum professional requirements?

16 What sort of training will the writing examiner need for their role?
- Familiarisation with test format and procedure
- Familiarisation with assessment criteria
- Initial induction and training, followed by ongoing standardisation/coordination
How can this be achieved?
- Face-to-face
- Semi-direct
- Online

17 Let's look at an example of what one test provider does

18 Recruitment; Induction; Training; Evaluation; Coordination; Monitoring

19 Managing the examiner community: Cambridge ESOL; Team Leaders; Writing Examiners

20 The importance of rater training and standardisation
Why? To reduce rater biases:
- leniency
- harshness
- halo effect (different types of halo effect)
- limited use of the scale
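Two of the biases listed on this slide, leniency/harshness and limited use of the scale, can be screened for with very simple descriptive checks. The following is a minimal sketch only, assuming every rater has marked the same set of scripts on a 1 to 5 scale; the rater labels, scores, and function names are hypothetical and are not taken from any test provider's procedure.

```python
from statistics import mean


def rater_severity(scores_by_rater):
    """Return each rater's mean score minus the grand mean.

    Positive values suggest leniency, negative values suggest harshness.
    Only meaningful when all raters marked the same scripts.
    """
    all_scores = [s for scores in scores_by_rater.values() for s in scores]
    grand_mean = mean(all_scores)
    return {rater: mean(scores) - grand_mean
            for rater, scores in scores_by_rater.items()}


def unused_scale_points(scores, scale_points):
    """List the scale points a rater never awarded (limited use of the scale)."""
    return sorted(set(scale_points) - set(scores))


# Hypothetical ratings of the same five scripts by three raters.
ratings = {
    "A": [3, 4, 3, 5, 4],
    "B": [2, 3, 2, 4, 3],
    "C": [3, 3, 3, 3, 3],
}

for rater, deviation in rater_severity(ratings).items():
    print(rater, round(deviation, 2), unused_scale_points(ratings[rater], range(1, 6)))
```

A check like this only flags candidates for follow-up in training or standardisation; it does not detect halo effects, which need ratings on several criteria per script.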

21
- A system for routinely monitoring and evaluating the performance of examiners
- Giving feedback to examiners, leading to possible follow-up action

22 To investigate and confirm the quality of raters' scoring behaviour using:
Classical analyses:
- correlation coefficients
- % levels of agreement
Rasch-based analyses (FACETS program):
- inter/intra-rater reliability estimates, rating scale analysis, task analysis
- possible scaling of examiners before awarding scores/grades to correct for leniency/harshness
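The classical analyses named on this slide can be sketched in a few lines. The example below computes the percentage of exact and adjacent agreement and a Pearson correlation for two raters who marked the same scripts. The scores and function names are hypothetical, and the Rasch-based FACETS analyses are not reproduced here.

```python
from math import sqrt


def percent_agreement(a, b, tolerance=0):
    """Percentage of scripts where two raters differ by at most `tolerance` points."""
    if len(a) != len(b) or not a:
        raise ValueError("both raters need the same non-empty set of scripts")
    hits = sum(1 for x, y in zip(a, b) if abs(x - y) <= tolerance)
    return 100.0 * hits / len(a)


def pearson_r(a, b):
    """Pearson correlation between two raters' scores on the same scripts."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need at least two paired scores")
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


# Hypothetical scores awarded by two raters to the same six scripts.
rater_1 = [3, 4, 2, 5, 3, 4]
rater_2 = [3, 3, 2, 5, 4, 4]

print("exact agreement %:", round(percent_agreement(rater_1, rater_2), 1))
print("adjacent agreement %:", round(percent_agreement(rater_1, rater_2, tolerance=1), 1))
print("Pearson r:", round(pearson_r(rater_1, rater_2), 3))
```

The two kinds of statistic answer different questions: a high correlation shows that raters rank scripts similarly, but it can coexist with a consistent difference in severity, which is why agreement levels are reported alongside it.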

23 [Diagram: Test Taker; Context; Cognitive; Response; Scoring Validity; Score/Grade; Consequential Validity; Criterion-Related Validity]

Rating the construct reliably

Rating the construct reliably EALTA Summer School, Innsbruck, 2016 Rating the construct reliably Jayanti Banerjee and Claudia Harsch Session Outline What is the rating process? Why do we need rater training? Rater training research

More information

CONSIDERATIONS IN PERFORMANCE-BASED LANGUAGE ASSESSMENT: RATING SCALES AND RATER TRAINING

CONSIDERATIONS IN PERFORMANCE-BASED LANGUAGE ASSESSMENT: RATING SCALES AND RATER TRAINING PASAA Volume 46 July-December 2013 CONSIDERATIONS IN PERFORMANCE-BASED LANGUAGE ASSESSMENT: RATING SCALES AND RATER TRAINING Bordin Chinda Chiang Mai University Abstract Performance-based assessment has

More information

Examining Factors Affecting Language Performance: A Comparison of Three Measurement Approaches

Examining Factors Affecting Language Performance: A Comparison of Three Measurement Approaches Pertanika J. Soc. Sci. & Hum. 21 (3): 1149-1162 (2013) SOCIAL SCIENCES & HUMANITIES Journal homepage: http://www.pertanika.upm.edu.my/ Examining Factors Affecting Language Performance: A Comparison of

More information

Introduction. 1.1 Facets of Measurement

Introduction. 1.1 Facets of Measurement 1 Introduction This chapter introduces the basic idea of many-facet Rasch measurement. Three examples of assessment procedures taken from the field of language testing illustrate its context of application.

More information

Catching the Hawks and Doves: A Method for Identifying Extreme Examiners on Objective Structured Clinical Examinations

Catching the Hawks and Doves: A Method for Identifying Extreme Examiners on Objective Structured Clinical Examinations Catching the Hawks and Doves: A Method for Identifying Extreme Examiners on Objective Structured Clinical Examinations July 20, 2011 1 Abstract Performance-based assessments are powerful methods for assessing

More information

ADMS Sampling Technique and Survey Studies

ADMS Sampling Technique and Survey Studies Principles of Measurement Measurement As a way of understanding, evaluating, and differentiating characteristics Provides a mechanism to achieve precision in this understanding, the extent or quality As

More information

Rater Reliability on Criterionreferenced Speaking Tests in IELTS and Joint Venture Universities

Rater Reliability on Criterionreferenced Speaking Tests in IELTS and Joint Venture Universities Lee, J. (2014). Rater reliability on criterion-referenced speaking tests in IELTS and Joint Venture Universities. English Teaching in China, 4, 16-20. Rater Reliability on Criterionreferenced Speaking

More information

Since light travels faster than. sound, people appear bright until. you hear them speak

Since light travels faster than. sound, people appear bright until. you hear them speak Since light travels faster than sound, people appear bright until you hear them speak Oral Exams Pro s and Con s Zeev Goldik Oral examinations Can generate marks unrelated to competence? Oral exams- pro

More information

Reliability and Validity of a Task-based Writing Performance Assessment for Japanese Learners of English

Reliability and Validity of a Task-based Writing Performance Assessment for Japanese Learners of English Reliability and Validity of a Task-based Writing Performance Assessment for Japanese Learners of English Yoshihito SUGITA Yamanashi Prefectural University Abstract This article examines the main data of

More information

BASIC PRINCIPLES OF ASSESSMENT

BASIC PRINCIPLES OF ASSESSMENT TOPIC 4 BASIC PRINCIPLES OF ASSESSMENT 4.0 SYNOPSIS Topic 4 defines the basic principles of assessment (reliability, validity, practicality, washback, and authenticity) and the essential sub-categories

More information

Evaluation of Pseudo-Scoring as an Extension of Rater Training

Evaluation of Pseudo-Scoring as an Extension of Rater Training Evaluation of Pseudo-Scoring as an Extension of Rater Training Research Report Edward W. Wolfe Melodie Jurgens Bob Sanders Daisy Vickers Jessica Yue April 2014 PSEUDO-SCORING 1 About Pearson Everything

More information

Examples of Feedback Comments: How to use them to improve your report writing. Example 1: Compare and contrast

Examples of Feedback Comments: How to use them to improve your report writing. Example 1: Compare and contrast Examples of Feedback Comments: How to use them to improve your report writing This document contains 4 examples of writing and feedback comments from Level 2A lab reports, and 4 steps to help you apply

More information

Process of a neuropsychological assessment

Process of a neuropsychological assessment Test selection Process of a neuropsychological assessment Gather information Review of information provided by referrer and if possible review of medical records Interview with client and his/her relative

More information

Combining Dual Scaling with Semi-Structured Interviews to Interpret Rating Differences

Combining Dual Scaling with Semi-Structured Interviews to Interpret Rating Differences A peer-reviewed electronic journal. Copyright is retained by the first or sole author, who grants right of first publication to the Practical Assessment, Research & Evaluation. Permission is granted to

More information

Test Validity. What is validity? Types of validity IOP 301-T. Content validity. Content-description Criterion-description Construct-identification

Test Validity. What is validity? Types of validity IOP 301-T. Content validity. Content-description Criterion-description Construct-identification What is? IOP 301-T Test Validity It is the accuracy of the measure in reflecting the concept it is supposed to measure. In simple English, the of a test concerns what the test measures and how well it

More information

the metric of medical education

the metric of medical education the metric of medical education Validity threats: overcoming interference with proposed interpretations of assessment data Steven M Downing 1 & Thomas M Haladyna 2 CONTEXT Factors that interfere with the

More information

Pediatrics Milestones and Meaningful Assessment Translating the Pediatrics Milestones into Assessment Items for use in the Clinical Setting

Pediatrics Milestones and Meaningful Assessment Translating the Pediatrics Milestones into Assessment Items for use in the Clinical Setting Pediatrics Milestones and Meaningful Assessment Translating the Pediatrics Milestones into Assessment Items for use in the Clinical Setting Ann Burke Susan Guralnick Patty Hicks Jeanine Ronan Dan Schumacher

More information

The moderation of coursework and controlled assessment: A summary

The moderation of coursework and controlled assessment: A summary This is a single article from Research Matters: A Cambridge Assessment publication. http://www.cambridgeassessment.org.uk/research-matters/ UCLES 15 The moderation of coursework and controlled assessment:

More information

10 Intraclass Correlations under the Mixed Factorial Design

10 Intraclass Correlations under the Mixed Factorial Design CHAPTER 1 Intraclass Correlations under the Mixed Factorial Design OBJECTIVE This chapter aims at presenting methods for analyzing intraclass correlation coefficients for reliability studies based on a

More information

Several notable developments on the competency front Highlight: Documents by the APA commissioned Competency Benchmarks Working Group

Several notable developments on the competency front Highlight: Documents by the APA commissioned Competency Benchmarks Working Group A multi-site study on the assessment of clinical psychology competencies by field supervisors: Should vignettes replace rating scales? This initiative is supported by an Australian Learning & Teaching

More information

NZQA Assessment Support Material

NZQA Assessment Support Material NZQA Assessment Support Material Unit standard 8994 Title Examine a social institution Level 2 Credits 4 Version 3 Note The following guidelines are supplied to enable teachers to carry out valid and consistent

More information

Designing Valid Assessments

Designing Valid Assessments Designing Valid Assessments Kanna Hudson, Title III Researcher Office of Planning & Effectiveness Clark College November 30, 2012 Today s participant outcomes 1. Explain that there are multiple facets

More information

Job Description. Inspire East Lancashire Integrated Substance use Service. Service User Involvement & Peer Mentor Co-ordinator

Job Description. Inspire East Lancashire Integrated Substance use Service. Service User Involvement & Peer Mentor Co-ordinator Job Description Service Job Title Base Hours Inspire East Lancashire Integrated Substance use Service Service User Involvement & Peer Mentor Co-ordinator Accrington 37.5 hours per week Salary Range 21,933.15-25,741.93

More information

Study 2a: A level biology, psychology and sociology

Study 2a: A level biology, psychology and sociology Inter-subject comparability studies Study 2a: A level biology, psychology and sociology May 2008 QCA/08/3653 Contents 1 Personnel... 3 2 Materials... 4 3 Methodology... 5 3.1 Form A... 5 3.2 CRAS analysis...

More information

Reliability. Internal Reliability

Reliability. Internal Reliability 32 Reliability T he reliability of assessments like the DECA-I/T is defined as, the consistency of scores obtained by the same person when reexamined with the same test on different occasions, or with

More information

Chapter 4. The Validity of Assessment- Based Interpretations

Chapter 4. The Validity of Assessment- Based Interpretations Chapter 4. The Validity of Assessment- Based Interpretations contents What is validity? Its definition Its importance What are the sorts of evidence of validity? content-related evidence of validity Criterion-related

More information

Cochrane Pregnancy and Childbirth Group Methodological Guidelines

Cochrane Pregnancy and Childbirth Group Methodological Guidelines Cochrane Pregnancy and Childbirth Group Methodological Guidelines [Prepared by Simon Gates: July 2009, updated July 2012] These guidelines are intended to aid quality and consistency across the reviews

More information

Examining the Validity of an Essay Writing Test Using Rasch Analysis

Examining the Validity of an Essay Writing Test Using Rasch Analysis Secondary English Education, 5(2) Examining the Validity of an Essay Writing Test Using Rasch Analysis Taejoon Park (KICE) Park, Taejoon. (2012). Examining the validity of an essay writing test using Rasch

More information

Authors face many challenges when summarising results in reviews.

Authors face many challenges when summarising results in reviews. Describing results Authors face many challenges when summarising results in reviews. This document aims to help authors to develop clear, consistent messages about the effects of interventions in reviews,

More information

IELTS Partnership Research Papers

IELTS Partnership Research Papers ISSN 2515-1703 2016 IELTS Partnership Research Papers Exploring performance across two delivery modes for the same L2 speaking test: Face-to-face and video-conferencing delivery A preliminary comparison

More information

Rater Effects as a Function of Rater Training Context

Rater Effects as a Function of Rater Training Context Rater Effects as a Function of Rater Training Context Edward W. Wolfe Aaron McVay Pearson October 2010 Abstract This study examined the influence of rater training and scoring context on the manifestation

More information

Ursuline College Accelerated Program

Ursuline College Accelerated Program Ursuline College Accelerated Program CRITICAL INFORMATION! DO NOT SKIP THIS LINK BELOW... BEFORE PROCEEDING TO READ THE UCAP MODULE, YOU ARE EXPECTED TO READ AND ADHERE TO ALL UCAP POLICY INFORMATION CONTAINED

More information

Importance of Good Measurement

Importance of Good Measurement Importance of Good Measurement Technical Adequacy of Assessments: Validity and Reliability Dr. K. A. Korb University of Jos The conclusions in a study are only as good as the data that is collected. The

More information

COMPUTING READER AGREEMENT FOR THE GRE

COMPUTING READER AGREEMENT FOR THE GRE RM-00-8 R E S E A R C H M E M O R A N D U M COMPUTING READER AGREEMENT FOR THE GRE WRITING ASSESSMENT Donald E. Powers Princeton, New Jersey 08541 October 2000 Computing Reader Agreement for the GRE Writing

More information

GUIDELINES FOR SCHOOL PEER GUIDE CO-ORDINATORS

GUIDELINES FOR SCHOOL PEER GUIDE CO-ORDINATORS GUIDELINES FOR SCHOOL PEER GUIDE CO-ORDINATORS These guidelines should be read in conjunction with the Code of Practice for the Peer Guide Scheme and the Guidelines for Peer Guides and Potential Peer Guides.

More information

On the usefulness of the CEFR in the investigation of test versions content equivalence HULEŠOVÁ, MARTINA

On the usefulness of the CEFR in the investigation of test versions content equivalence HULEŠOVÁ, MARTINA On the usefulness of the CEFR in the investigation of test versions content equivalence HULEŠOVÁ, MARTINA MASARY K UNIVERSITY, CZECH REPUBLIC Overview Background and research aims Focus on RQ2 Introduction

More information

Investigating the Reliability of Classroom Observation Protocols: The Case of PLATO. M. Ken Cor Stanford University School of Education.

Investigating the Reliability of Classroom Observation Protocols: The Case of PLATO. M. Ken Cor Stanford University School of Education. The Reliability of PLATO Running Head: THE RELIABILTY OF PLATO Investigating the Reliability of Classroom Observation Protocols: The Case of PLATO M. Ken Cor Stanford University School of Education April,

More information

The Psychometric Principles Maximizing the quality of assessment

The Psychometric Principles Maximizing the quality of assessment Summer School 2009 Psychometric Principles Professor John Rust University of Cambridge The Psychometric Principles Maximizing the quality of assessment Reliability Validity Standardisation Equivalence

More information

Providing Evidence for the Generalizability of a Speaking Placement Test Scores

Providing Evidence for the Generalizability of a Speaking Placement Test Scores Providing Evidence for the Generalizability of a Speaking Placement Test Scores Payman Vafaee 1, Behrooz Yaghmaeyan 2 Received: 15 April 2015 Accepted: 10 August 2015 Abstract Three major potential sources

More information

Brighton & Hove Food Partnership: Harvest

Brighton & Hove Food Partnership: Harvest Growing Health Food growing for health and wellbeing Brighton & Hove Food Partnership: Harvest Brighton & Hove Growing Health Case Study Health area: Healthy eating, physical activity and mental wellbeing

More information

University of Bradford School of Health Studies Division of Physiotherapy and Occupational Therapy Programme specification

University of Bradford School of Health Studies Division of Physiotherapy and Occupational Therapy Programme specification University of Bradford School of Health Studies Division of Physiotherapy and Occupational Therapy Programme specification Awarding and Teaching institution: University of Bradford Final award: Postgraduate

More information

Effects of Different Training and Scoring Approaches on Human Constructed Response Scoring. Walter D. Way. Daisy Vickers. Paul Nichols.

Effects of Different Training and Scoring Approaches on Human Constructed Response Scoring. Walter D. Way. Daisy Vickers. Paul Nichols. Effects of Different Training and Scoring Approaches on Human Constructed Response Scoring Walter D. Way Daisy Vickers Paul Nichols Pearson 1 Paper presented at the annual meeting of the National Council

More information

Shiken: JALT Testing & Evaluation SIG Newsletter. 12 (2). April 2008 (p )

Shiken: JALT Testing & Evaluation SIG Newsletter. 12 (2). April 2008 (p ) Rasch Measurementt iin Language Educattiion Partt 2:: Measurementt Scalles and Invariiance by James Sick, Ed.D. (J. F. Oberlin University, Tokyo) Part 1 of this series presented an overview of Rasch measurement

More information

The Evaluation of Children with Deaf-Blindness: A Parent Mini-Guide

The Evaluation of Children with Deaf-Blindness: A Parent Mini-Guide Statewide Parent Advocacy Network 35 Halsey Street Newark, NJ 07102 (973) 642-8100 www.spannj.org The Evaluation of Children with Deaf-Blindness: A Parent Mini-Guide Developed by the Statewide Parent Advocacy

More information

Evaluating the quality of analytic ratings with Mokken scaling

Evaluating the quality of analytic ratings with Mokken scaling Psychological Test and Assessment Modeling, Volume 57, 2015 (3), 423-444 Evaluating the quality of analytic ratings with Mokken scaling Stefanie A. Wind 1 Abstract Greatly influenced by the work of Rasch

More information

PÄIVI KARHU THE THEORY OF MEASUREMENT

PÄIVI KARHU THE THEORY OF MEASUREMENT PÄIVI KARHU THE THEORY OF MEASUREMENT AGENDA 1. Quality of Measurement a) Validity Definition and Types of validity Assessment of validity Threats of Validity b) Reliability True Score Theory Definition

More information

Metanoia Institute 13 North Common Road Ealing London W5 2QB. Telephone: Fax:

Metanoia Institute 13 North Common Road Ealing London W5 2QB. Telephone: Fax: PSYCHOTHERAPY CONVERSION COURSE FOR QUALIFIED AND EXPERIENCED PERSON CENTRED COUNSELLORS MSc In Person-Centred Psychotherapy and its Applications STARTS SEPTEMBER 2018 Faculty Head: Heather Fowlie Programme

More information

How Do We Assess Students in the Interpreting Examinations?

How Do We Assess Students in the Interpreting Examinations? How Do We Assess Students in the Interpreting Examinations? Fred S. Wu 1 Newcastle University, United Kingdom The field of assessment in interpreter training is under-researched, though trainers and researchers

More information

Basic concepts and principles of classical test theory

Basic concepts and principles of classical test theory Basic concepts and principles of classical test theory Jan-Eric Gustafsson What is measurement? Assignment of numbers to aspects of individuals according to some rule. The aspect which is measured must

More information

The Psychological drivers that propel and sustain women and men into leadership positions.

The Psychological drivers that propel and sustain women and men into leadership positions. PILOT RESEARCH SUMMARY The Psychological drivers that propel and sustain women and men into leadership positions. June 2017 Marie Burns BSc Hons, MSc, Ch Psych. OVERVIEW Despite the benefits of a strong,

More information

- Purposive sampling: Selecting a group of individuals who are likely to have relevant

- Purposive sampling: Selecting a group of individuals who are likely to have relevant Cipriani!1 Patten Part H Outline 1. Sampling in Qualitative Research: I - Purposive sampling: Selecting a group of individuals who are likely to have relevant information to the research topic. Participants

More information

CEMO RESEARCH PROGRAM

CEMO RESEARCH PROGRAM 1 CEMO RESEARCH PROGRAM Methodological Challenges in Educational Measurement CEMO s primary goal is to conduct basic and applied research seeking to generate new knowledge in the field of educational measurement.

More information

Sample Exam Questions Psychology 3201 Exam 1

Sample Exam Questions Psychology 3201 Exam 1 Scientific Method Scientific Researcher Scientific Practitioner Authority External Explanations (Metaphysical Systems) Unreliable Senses Determinism Lawfulness Discoverability Empiricism Control Objectivity

More information

UNIVERSITY OF CALGARY. Reliability & Validity of the. Objective Structured Clinical Examination (OSCE): A Meta-Analysis. Ibrahim Al Ghaithi A THESIS

UNIVERSITY OF CALGARY. Reliability & Validity of the. Objective Structured Clinical Examination (OSCE): A Meta-Analysis. Ibrahim Al Ghaithi A THESIS UNIVERSITY OF CALGARY Reliability & Validity of the Objective Structured Clinical Examination (OSCE): A Meta-Analysis by Ibrahim Al Ghaithi A THESIS SUBMITTED TO THE FACULTY OF GRADUATE STUDIES IN PARTIAL

More information

Developing Skills at Making Observations

Developing Skills at Making Observations Developing Skills at Making Observations Lessons from Faculty Development and Rater Cognition Research Eric S. Holmboe Jennifer R. Kogan Roadmap 1. Define workplace based assessment and the theories supporting

More information

Miller s Assessment Pyramid

Miller s Assessment Pyramid Roadmap Developing Skills at Making Observations Lessons from Faculty Development and Rater Cognition Research 1. Define workplace based assessment and the theories supporting direct observation 2. Identify

More information

Assisting Novice Raters in Addressing the In- Between Scores When Rating Writing

Assisting Novice Raters in Addressing the In- Between Scores When Rating Writing Brigham Young University BYU ScholarsArchive All Theses and Dissertations 2013-06-16 Assisting Novice Raters in Addressing the In- Between Scores When Rating Writing Brittney Greer Brigham Young University

More information

THE PROFESSIONAL BOARD FOR PSYCHOLOGY HEALTH PROFESSIONS COUNCIL OF SOUTH AFRICA TEST DEVELOPMENT / ADAPTATION PROPOSAL FORM

THE PROFESSIONAL BOARD FOR PSYCHOLOGY HEALTH PROFESSIONS COUNCIL OF SOUTH AFRICA TEST DEVELOPMENT / ADAPTATION PROPOSAL FORM FORM A THE PROFESSIONAL BOARD FOR PSYCHOLOGY HEALTH PROFESSIONS COUNCIL OF SOUTH AFRICA TEST DEVELOPMENT / ADAPTATION PROPOSAL FORM This document consists of two sections. Please complete section 1 if

More information

AS Psychology Curriculum Plan & Scheme of work

AS Psychology Curriculum Plan & Scheme of work AS Psychology Curriculum Plan & Scheme of work 2015-16 Week Content Further detail and reference to specification H/w. Reading & Notes, Resources, Extension activities Hodder textbook pages 1-12 Hodder

More information

CLUB RETENTION CHAIRPERSON RESOURCE

CLUB RETENTION CHAIRPERSON RESOURCE RETENTION Club Retention Chairperson How are Your Ratings? President's Retention Campaign Member Orientation Lions Mentoring Program Resource Excellence Award Initiative Club Health Assessment Club Rebuilding

More information

Chapter 11: Advanced Remedial Measures. Weighted Least Squares (WLS)

Chapter 11: Advanced Remedial Measures. Weighted Least Squares (WLS) Chapter : Advanced Remedial Measures Weighted Least Squares (WLS) When the error variance appears nonconstant, a transformation (of Y and/or X) is a quick remedy. But it may not solve the problem, or it

More information

Key words: classical test theory; many-facet Rasch measurement; reliability; bias analysis

Key words: classical test theory; many-facet Rasch measurement; reliability; bias analysis 2010 年 4 月中国应用语言学 ( 双月刊 ) Apr. 2010 第 33 卷第 2 期 Chinese Journal of Applied Linguistics (Bimonthly) Vol. 33 No. 2 An Application of Classical Test Theory and Manyfacet Rasch Measurement in Analyzing the

More information

DPROF2 - Professional Development II

DPROF2 - Professional Development II Coordinating unit: Teaching unit: Academic year: Degree: ECTS credits: 2014 801 - EUNCET - Euncet University Business School 801 - EUNCET - Euncet University Business School BACHELOR'S DEGREE IN BUSINESS

More information

Investigating possible ethnicity and sex bias in clinical examiners: an analysis of data from the MRCP(UK) PACES and npaces examinations

Investigating possible ethnicity and sex bias in clinical examiners: an analysis of data from the MRCP(UK) PACES and npaces examinations McManus et al. BMC Medical Education 2013, 13:103 RESEARCH ARTICLE Open Access Investigating possible ethnicity and sex bias in clinical examiners: an analysis of data from the MRCP(UK) and n examinations

More information

Introduction to Reliability

Introduction to Reliability Reliability Thought Questions: How does/will reliability affect what you do/will do in your future job? Which method of reliability analysis do you find most confusing? Introduction to Reliability What

More information

LFI Leadership Competencies

LFI Leadership Competencies LFI Leadership Competencies LFI L E A D E R S H I P C O M P E T E N C I E S S E S S I O N S O V E R V I E W A N D S E L F A W A R E N E S S O C T O B E R 2 0 1 4 LFI Leadership Competencies Servant Leadership

More information

Reference Supplement

Reference Supplement Reference Supplement to the Manual for Relating Language Examinations to the Common European Framework of Reference for Languages: Learning, Teaching, Assessment Section H: Many-Facet Rasch Measurement

More information

Developing an Analytic Scale for Scoring EFL Descriptive Writing*

Developing an Analytic Scale for Scoring EFL Descriptive Writing* Journal of English Language Teaching and Learning Tabriz University No. 17, 2016 Developing an Analytic Scale for Scoring EFL Descriptive Writing* Mohammad Khatib Associate Professor of TEFL, Allameh Tabataba'i

More information

218 Part III Review Exercises

218 Part III Review Exercises 218 Part III Review Eercises Part III Review Eercises III.1 (a The proofreader will catch 75% of the errors. P proofreader catches error P(nonword error P proofreader catches error nonword error + P(word

More information

25. EXPLAINING VALIDITYAND RELIABILITY

25. EXPLAINING VALIDITYAND RELIABILITY 25. EXPLAINING VALIDITYAND RELIABILITY "Validity" and "reliability" are ubiquitous terms in social science measurement. They are prominent in the APA "Standards" (1985) and earn chapters in test theory

More information

WHAT IS THE DISSERTATION?

WHAT IS THE DISSERTATION? BRIEF RESEARCH PROPOSAL & DISSERTATION PROCEDURES 2018-2019: (HDIP STUDENTS) Dr Marta Sant Dissertations Coordinator WHAT IS THE DISSERTATION? All HDIP students are required to submit a dissertation with

More information

Method. NeuRA Biofeedback May 2016

Method. NeuRA Biofeedback May 2016 Introduction is a technique in which information about the person s body is fed back to the person so that they may be trained to alter the body s conditions. Physical therapists use biofeedback to help

More information

Job Description. HMP Liverpool Drug and Alcohol Recovery Service. Service User Involvement, Peer Mentor & Volunteer Co-ordinatior.

Job Description. HMP Liverpool Drug and Alcohol Recovery Service. Service User Involvement, Peer Mentor & Volunteer Co-ordinatior. Job Description Service Job Title Hours HMP Liverpool Drug and Alcohol Recovery Service Service User Involvement, Peer Mentor & Volunteer Co-ordinatior. 37.5 (working flexibly over weekends and bank holidays)

More information

Statistical considerations in indirect comparisons and network meta-analysis

Statistical considerations in indirect comparisons and network meta-analysis Statistical considerations in indirect comparisons and network meta-analysis Said Business School, Oxford, UK March 18-19, 2013 Cochrane Comparing Multiple Interventions Methods Group Oxford Training event,

More information

Higher National Unit specification: general information. Graded Unit 1

Higher National Unit specification: general information. Graded Unit 1 Higher National Unit specification: general information This Graded Unit has been validated as part of the HNC/HND Care and Administrative Practice. Centres are required to develop the assessment instrument

More information

Multiple Act criterion:

Multiple Act criterion: Common Features of Trait Theories Generality and Stability of Traits: Trait theorists all use consistencies in an individual s behavior and explain why persons respond in different ways to the same stimulus

More information
