Psychological testing
1 Psychological testing
Lecture 12
Mikołaj Winiewski, PhD
2 Test Construction Strategies
- Content validation
- Empirical criterion keying
- Factor analysis
- Mixed approach (all of the above)
3 Content Validation
- Define all aspects of the construct and create test items
- Items are derived from theory or based on the purpose of the test
- The content of the item is of primary importance
- Consult experts about the constructs using qualitative methods
- Employ experts as judges to assess each potential item using quantitative measures
- Perform psychometric analyses of the items
4 Empirical Keying
- Create test items to measure one or more traits
- Items are derived from theory or based on the purpose of the test
- The content of the item is not of primary importance
- Administer the test items to a criterion group and a control group
- Select the items that best distinguish between these two groups
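The selection step above can be sketched as follows. This is a minimal illustration, not the procedure from any specific test: the item names, the invented 0/1 responses, and the 0.25 endorsement-gap threshold are all assumptions for the example.

```python
# Hypothetical sketch of empirical keying: keep the items whose endorsement
# rates differ most between a criterion group and a control group.

def empirical_keying(criterion, control, min_gap=0.25):
    """Return item ids whose endorsement-rate gap is at least min_gap.

    criterion, control: dicts mapping item id -> list of 0/1 responses.
    """
    selected = []
    for item in criterion:
        p_crit = sum(criterion[item]) / len(criterion[item])
        p_ctrl = sum(control[item]) / len(control[item])
        if abs(p_crit - p_ctrl) >= min_gap:
            selected.append(item)
    return selected

# Invented data: item1 separates the groups (0.75 vs 0.25), item2 does not.
criterion = {"item1": [1, 1, 1, 0], "item2": [1, 0, 1, 0]}
control   = {"item1": [0, 0, 1, 0], "item2": [1, 0, 0, 1]}
print(empirical_keying(criterion, control))  # ['item1']
```

In a real keying study the gap criterion would be replaced by a significance test or a cross-validated discrimination index, but the logic is the same: item content is ignored and only group separation matters.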
5 Factor Analysis
- Create test items to measure one or more traits derived from theory
- The content of the item is of primary importance
- Administer the test items to an appropriate sample drawn from the population of interest: large (depending on the technique and the base number of items) and, if possible, representative
- Employ factor analysis (a family of correlational techniques used to determine the underlying structure of the data)
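The first step of any such analysis is the inter-item correlation matrix, which a factoring method then decomposes. A minimal sketch of that first step, with invented 5-point ratings (a real analysis would go on to extract factors, e.g. by principal axis or maximum likelihood):

```python
# Compute the inter-item correlation matrix from a small invented data set.
# Items 1 and 2 correlate highly and item 3 runs opposite to them, hinting
# at a single underlying dimension with item 3 reverse-keyed.
from math import sqrt

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Rows = respondents, columns = items (ratings are invented).
responses = [
    [5, 4, 1],
    [4, 5, 2],
    [2, 1, 5],
    [1, 2, 4],
    [3, 3, 3],
]
items = list(zip(*responses))  # one tuple of scores per item
corr = [[round(pearson(a, b), 2) for b in items] for a in items]
for row in corr:
    print(row)
```

High off-diagonal correlations are what factor analysis summarizes: items that rise and fall together are candidates for loading on a common factor.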
6 Mixed Approach
Employing a mixed strategy, for example:
- Define all aspects of the construct and create test items
- Employ experts as judges to assess each potential item, or use experts to create the item pool
- In parallel: employ factor analysis, and administer the test items to a criterion group and a control group
7 Adaptation
- Producing instruments (by adjusting existing ones) that measure the target constructs adequately in the target cultures
- Using a set of procedures and techniques to create an equivalent tool
- Purpose / application is the central issue in adaptations
8 Main Applications of Translations/Adaptations
Comparative studies (diagnosis & research)
- Focus: comparison of constructs or mean scores across cultures
- Strategy: maximizing comparability
Studies in the target culture (diagnosis & research)
- Focus: validity in the new context
- Strategy: maximizing local suitability
9 Considerations: Cultural Equivalence
- Psychological theory / dimensions
- Psychological concepts / terms
- Behavioral indicators
- Procedures
10 Considerations: Test Equivalence
- Face equivalence (superficial)
- Psychometric: statistics, validity, and reliability
- Psychological / functional
- Translation
- Construction
11 Adoption / Translation
Not only language!
- Literal/close translation: "What is the name of the Queen of England?"
  Problem: the item is more difficult for American children than for English children
- Adaptation: "What is the name of the President of the USA?"
  Problem: the Queen and the President are not equally well known in their respective countries
12 Equivalence
- Words: linguistic
- Meanings: psychological
13 Linguistic Equivalence (broader than similarity of words)
Linguistic equivalence refers to similarity of the linguistic features of a text. Relevant linguistic features include:
- Lexical similarity
- Grammatical accuracy
In general: emphasis on formal-textual characteristics (cf. automatic translations)
14 Psychological Equivalence
Psychological equivalence refers to similarity of (psychological) meaning and scores. Similarity in a broad sense:
- Textual: e.g., connotation of words, implied context of the text, comprehensibility
- Metrical: score comparability
15 Relationship between the Two Perspectives
Three possible relations between linguistic and psychological features, depending on their overlap:
a. complete overlap: translatable
b. partial overlap: poorly translatable
c. no overlap: essentially non-translatable
[Original slide: diagram of overlapping psychological and linguistic features]
16 Cultural Adaptation: Options / Strategies
- Adoption / transcription (close literal translation)
  Advantage: maintains metric equivalence
  Disadvantage: adequacy (too) readily assumed; it should be demonstrated
- Adaptation: translation, travesty, or paraphrase
  Advantage: more flexible, more tailored to the context
  Disadvantage: fewer statistical techniques available to compare scores across cultures
- Assembly (re-assembly): composing a new instrument
  Advantage: very flexible
  Disadvantage: almost no comparability maintained
17 Adoption / Transcription
Literal translation of all items
- Focus: extreme translation fidelity
- Assumption: universality of constructs and behaviors
- Pros: metric equivalence; possibility of straightforward comparisons
- Cons: language and psychometric problems
18 Adaptation: Translation
Faithful translation of the original pool of items, with possible changes
- Focus: translation fidelity
- Assumption: universality of constructs and behaviors, but not of language
- Pros: better psychometric properties; better construct and ecological validity
- Cons: fewer comparison options; still some language and psychometric problems
19 Adaptation: Travesty
Free translation of the original pool of items, keeping the meaning while changing the language and adjusting to linguistic and psychological needs
- Focus: psychological meaning
- Assumption: universality of constructs, but not of language, and possible cultural differences in behaviors
- Pros: better cultural adjustment; better psychometric properties; less metric equivalence, but still fairly good
- Cons: few comparison options; major differences between versions of the test
20 Adaptation: Paraphrase
Creating a new tool using the original items as inspiration rather than as a base
- Focus: psychological meaning
- Assumption: universality of constructs, but not of behaviors and language
- Pros: good cultural adjustment; good psychometric properties; cultural equivalence
- Cons: no metric equivalence; major differences between versions of the test
21 Assembly (Re-assembly)
Composing a new instrument using the original theoretical model and development strategy
- Focus: adaptation of both the tool and the theory
- Assumption: no cultural universality of behaviors and language, and possible differences in constructs
- Pros: best cultural adjustment
- Cons: no metric equivalence; two different tools
22 Item Analysis
23 Purpose of Item Analysis
- Evaluates the quality of each item
- Rationale: the quality of the items determines the quality of the test (i.e., its reliability & validity)
- May suggest ways of improving the measurement of a test
- Can help with understanding why certain tests predict some criteria but not others
24 Item Analysis
When analyzing test items, we have several questions about the performance of each item, including:
- Are the items congruent with the test objectives?
- Are the items valid? Do they measure what they're supposed to measure?
- Are the items reliable? Do they measure consistently?
- How long does it take an examinee to complete each item?
- Which items are most difficult to answer correctly? Which items are easy?
- Are there any poorly performing items that need to be discarded?
25 Types of Item Analyses for CTT
Three major types:
1. Assess the quality of the distractors
2. Assess the difficulty of the items
3. Assess how well an item differentiates between high and low performers
26 Distractor Analysis
[Original slide: diagram of a multiple-choice question with four options, A-D; one option is the correct answer and the remaining options are the distractors]
27 Distractor Analysis
The first question of item analysis: how many people choose each response? If there is only one best response, then all other response options are distractors.
Example (N = 35): Which method has the best internal consistency?
a) projective test: 1
b) peer ratings: 1
c) forced choice: 21 (keyed answer)
d) differences n.s.: 12
28 Distractor Analysis
A perfect test item would have two characteristics:
1. Everyone who knows the item gets it right
2. People who do not know the item have responses equally distributed across the wrong answers
It is not desirable to have one of the distractors chosen more often than the correct answer. Such a result indicates a potential problem with the question: the distractor may be too similar to the correct answer, and/or there may be something misleading in either the stem or the alternatives.
29 Distractor Analysis (cont'd)
Calculate the number of people expected to choose each of the distractors. If choices among the wrong answers are random, the expected number is the same for each distractor (Figure 10-1):
expected per distractor = (N answering incorrectly) / (number of distractors) = 14 / 3 ≈ 4.7
30 Distractor Analysis (cont'd)
When the number of persons choosing a distractor significantly exceeds the number expected, there are two possibilities:
1. The choice may reflect partial knowledge
2. The item is a poorly worded trick question
An unpopular distractor may lower item and test difficulty because it is easily eliminated; an extremely popular distractor is likely to lower the reliability and validity of the test.
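The distractor tally and expected-count calculation above can be sketched in a few lines, using the slide's own example (N = 35, keyed answer c, 14 incorrect answers spread over 3 distractors). The "twice the expected count" flagging rule is an assumption for illustration, not a standard cutoff:

```python
# Count responses per option, compute the expected count per distractor under
# random guessing, and flag distractors chosen far more often than expected.
from collections import Counter

def distractor_analysis(responses, key):
    counts = Counter(responses)
    n_wrong = sum(c for opt, c in counts.items() if opt != key)
    n_distractors = len(counts) - 1  # every non-keyed option is a distractor
    expected = n_wrong / n_distractors
    flagged = [opt for opt, c in counts.items()
               if opt != key and c > 2 * expected]  # assumed cutoff
    return counts, round(expected, 1), flagged

# The slide's example: a=1, b=1, c=21 (keyed), d=12.
responses = ["a"] * 1 + ["b"] * 1 + ["c"] * 21 + ["d"] * 12
counts, expected, flagged = distractor_analysis(responses, key="c")
print(expected)  # 4.7, matching the slide's 14 / 3
print(flagged)   # ['d']: chosen far more often than chance predicts
```

Here option d draws 12 of the 14 wrong answers against an expectation of about 4.7, exactly the pattern the slide says may signal partial knowledge or a trick question.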
31 Item Difficulty
The item difficulty index p is the proportion of test takers who respond correctly.
What if p = .00? What if p = 1.00?
32 Item Difficulty
- An item with a p value of .00 or 1.00 does not contribute to measuring individual differences, and thus is certain to be useless
- When comparing two test scores, we are interested in who had the higher score, or in the differences in scores
- Items with p values near .50 have the most variation, so seek items in this range and remove those with extreme values
- p can also be examined to determine the proportion answering in a particular way, for items that don't have a correct answer
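The claim that items near p = .50 carry the most variation can be checked numerically: for a binary (right/wrong) item the score variance is p(1 - p), which peaks at p = .50. The response vectors below are invented for illustration:

```python
# p value and binary item variance p * (1 - p) for three invented items.

def difficulty(responses):
    """p value: proportion of correct (1) responses."""
    return sum(responses) / len(responses)

def item_variance(p):
    return p * (1 - p)

easy   = [1, 1, 1, 1, 1, 1, 1, 0]  # p = .875
medium = [1, 1, 1, 1, 0, 0, 0, 0]  # p = .500
hard   = [1, 0, 0, 0, 0, 0, 0, 0]  # p = .125

for item in (easy, medium, hard):
    p = difficulty(item)
    print(p, round(item_variance(p), 4))
```

The middle item has variance .25, the maximum possible, while the easy and hard items fall to about .11; items at p = 0 or 1 would have zero variance and carry no information about individual differences.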
33 Item Difficulty (cont.)
What is the best p value? The optimal p value is .50, which gives maximum discrimination between good and poor performers.
Should we only choose items with p = .50? When shouldn't we?
34 Item Difficulty (cont.)
Should we only choose items with p = .50? Not necessarily...
- When screening for the very top group of applicants (e.g., admission to university or medical school), cutoffs may be much higher
- Other institutions want only a minimum level (e.g., a minimum reading level), so cutoffs may be much lower
35 Item Difficulty (cont'd)
General rules of item difficulty:
- p low (< .20): difficult test item
- p moderate (.20 to .80): moderately difficult item
- p high (> .80): easy item
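These rules of thumb reduce to a simple classification. A minimal sketch, where the .20 and .80 cutoffs come from the slide and everything else is illustrative:

```python
# Classify an item's difficulty index p using the slide's rules of thumb.

def classify_difficulty(p):
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    if p < 0.20:
        return "difficult"
    if p > 0.80:
        return "easy"
    return "moderate"

print(classify_difficulty(0.15))  # difficult
print(classify_difficulty(0.50))  # moderate
print(classify_difficulty(0.95))  # easy
```

In practice such labels are only a screening aid; as slides 33 and 34 note, the target difficulty depends on what the test is used for.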
More informationInterpretation in neuropsychological assessment
Interpretation in neuropsychological assessment What does interpretation of a neuropsychological test involve? What you need to consider in interpretation? hat do we mean by nterpreta1on? Distinction between
More informationSmall Group Presentations
Admin Assignment 1 due next Tuesday at 3pm in the Psychology course centre. Matrix Quiz during the first hour of next lecture. Assignment 2 due 13 May at 10am. I will upload and distribute these at the
More informationNon-profit education, research and support network offers money in exchange for missing science
Alive & Well $50,000 Fact Finder Award Find One Study, Save Countless Lives Non-profit education, research and support network offers money in exchange for missing science http://www.aliveandwell.org Tel
More informationMultiple Act criterion:
Common Features of Trait Theories Generality and Stability of Traits: Trait theorists all use consistencies in an individual s behavior and explain why persons respond in different ways to the same stimulus
More informationSpecial guidelines for preparation and quality approval of reviews in the form of reference documents in the field of occupational diseases
Special guidelines for preparation and quality approval of reviews in the form of reference documents in the field of occupational diseases November 2010 (1 st July 2016: The National Board of Industrial
More informationCompetency Rubric Bank for the Sciences (CRBS)
Competency Rubric Bank for the Sciences (CRBS) Content Knowledge 1 Content Knowledge: Accuracy of scientific understanding Higher Order Cognitive Skills (HOCS) 3 Analysis: Clarity of Research Question
More informationA Comparison of Several Goodness-of-Fit Statistics
A Comparison of Several Goodness-of-Fit Statistics Robert L. McKinley The University of Toledo Craig N. Mills Educational Testing Service A study was conducted to evaluate four goodnessof-fit procedures
More informationReliability AND Validity. Fact checking your instrument
Reliability AND Validity Fact checking your instrument General Principles Clearly Identify the Construct of Interest Use Multiple Items Use One or More Reverse Scored Items Use a Consistent Response Format
More informationSLEEP DISTURBANCE ABOUT SLEEP DISTURBANCE INTRODUCTION TO ASSESSMENT OPTIONS. 6/27/2018 PROMIS Sleep Disturbance Page 1
SLEEP DISTURBANCE A brief guide to the PROMIS Sleep Disturbance instruments: ADULT PROMIS Item Bank v1.0 Sleep Disturbance PROMIS Short Form v1.0 Sleep Disturbance 4a PROMIS Short Form v1.0 Sleep Disturbance
More informationMaking a psychometric. Dr Benjamin Cowan- Lecture 9
Making a psychometric Dr Benjamin Cowan- Lecture 9 What this lecture will cover What is a questionnaire? Development of questionnaires Item development Scale options Scale reliability & validity Factor
More informationCochrane Pregnancy and Childbirth Group Methodological Guidelines
Cochrane Pregnancy and Childbirth Group Methodological Guidelines [Prepared by Simon Gates: July 2009, updated July 2012] These guidelines are intended to aid quality and consistency across the reviews
More informationSession 401. Creating Assessments that Effectively Measure Knowledge, Skill, and Ability
Practical and Effective Assessments in e-learning Session 401 Creating Assessments that Effectively Measure Knowledge, Skill, and Ability Howard Eisenberg, Questionmark Effectively Measuring Knowledge,
More informationPsych 1Chapter 2 Overview
Psych 1Chapter 2 Overview After studying this chapter, you should be able to answer the following questions: 1) What are five characteristics of an ideal scientist? 2) What are the defining elements of
More informationSurvey Question. What are appropriate methods to reaffirm the fairness, validity reliability and general performance of examinations?
Clause 9.3.5 Appropriate methodology and procedures (e.g. collecting and maintaining statistical data) shall be documented and implemented in order to affirm, at justified defined intervals, the fairness,
More informationUSE OF DIFFERENTIAL ITEM FUNCTIONING (DIF) ANALYSIS FOR BIAS ANALYSIS IN TEST CONSTRUCTION
USE OF DIFFERENTIAL ITEM FUNCTIONING (DIF) ANALYSIS FOR BIAS ANALYSIS IN TEST CONSTRUCTION Iweka Fidelis (Ph.D) Department of Educational Psychology, Guidance and Counselling, University of Port Harcourt,
More information