Psychometrics in context: Test Construction with IRT. Professor John Rust University of Cambridge
1 Psychometrics in context: Test Construction with IRT Professor John Rust University of Cambridge
2 Plan
- Guttman scaling
- Guttman errors and Loevinger's H statistic
- Non-parametric IRT
- Traces in Stata
- Parametric IRT
- IRT in Mplus
3 Guttman Scaling
Guttman, L. (1944) A basis for scaling qualitative data. American Sociological Review, 9.
- Binary items are ranked in some order (e.g. difficulty).
- Agreement with an item implies agreement with items of a lower order.
- E.g. there is no need to find out whether a weightlifter who lifts 200 kg can also lift 150 kg, or whether someone who multiplies 453 by 234 can multiply 6 by 3.
- For a rational respondent, a single index on the scale is enough to describe the whole response pattern (R = right, W = wrong; A = agree, D = disagree):
  RRRRRRRRRRWWWWWWWW
  AAAAAADDDDDDDDDDDDDDD
4 Guttman Errors
- In practice, a response pattern is more likely to be: RRRRRRWRRWWRWWWWWWWWW
- The out-of-sequence responses (shown in bold on the original slide) are described as Guttman errors.
- Hence we need the notion of probability.
- A respondent answering an item positively will have a significantly greater probability of also having answered less difficult items positively.
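Counting Guttman errors in a single response pattern can be sketched as follows. This is a minimal illustration, not the slide author's procedure: it assumes the pattern is ordered from easiest to hardest item and uses the pairwise convention, in which every (easier item wrong, harder item right) pair counts as one error. The function name `guttman_errors` is hypothetical.

```python
def guttman_errors(pattern: str) -> int:
    """Count Guttman errors in a response string ordered easiest to hardest.

    'R' is a right/positive response and 'W' a wrong/negative one.
    Assumption: an error is any pair of items in which the easier item
    is wrong while the harder item is right.
    """
    errors = 0
    wrong_so_far = 0  # number of easier items already answered wrongly
    for response in pattern:
        if response == "W":
            wrong_so_far += 1
        elif response == "R":
            # This right answer forms an error pair with every earlier 'W'.
            errors += wrong_so_far
        else:
            raise ValueError(f"unexpected response code: {response!r}")
    return errors


print(guttman_errors("RRRRRRRRRRWWWWWWWW"))     # 0 (perfect Guttman pattern)
print(guttman_errors("RRRRRRWRRWWRWWWWWWWWW"))  # 5
```

Under this convention the perfect pattern from the Guttman Scaling slide has no errors, while the more realistic pattern above has five.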
5 Loevinger
Loevinger, J. (1948) The technic of homogeneous tests compared with some aspects of scale analysis and factor analysis. Psychological Bulletin, 45(6).
- Loevinger's H statistic measures the extent to which items appear in the same relative order.
- It is based on a comparison of the actual number of Guttman errors (A) with the number expected if responses were random (R).
- E.g.
  RRRRRRRRRWRRWWRWWWWWWWWWWWW (A)
  RWRWRRWWRWRRWWWRWRWWRRRWRWRW (R)
6 Loevinger H
- LoevH is a function of (Random - Actual) / Random.
- If there are no errors, then H = 1.
- If the number of errors is as expected by chance alone, then H = 0.
- Can be calculated for each pair of items.
- Can be averaged across all respondents to give an index for a particular item.
- Can be averaged across all items to give an index for the test.
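The ratio (Random - Actual) / Random can be sketched for a whole scale as below. This is an illustration under stated assumptions, not the Stata LoevH routine: errors are counted per item pair (easier item negative, harder item positive), the "random" count is the number of such errors expected if the two items were statistically independent given their observed proportions, and both counts are summed over all pairs. The function name `loevinger_h` is hypothetical.

```python
from itertools import combinations


def loevinger_h(data):
    """Scale-level Loevinger's H for a list of 0/1 response vectors.

    data: one list per respondent, one 0/1 entry per item.
    Assumption: expected errors for a pair are n * (1 - p_easy) * p_hard,
    i.e. the count expected under independence of the two items.
    """
    n = len(data)
    k = len(data[0])
    # Proportion of positive responses per item (higher = easier).
    p = [sum(row[i] for row in data) / n for i in range(k)]

    actual = 0.0
    random_expected = 0.0
    for i, j in combinations(range(k), 2):
        easy, hard = (i, j) if p[i] >= p[j] else (j, i)
        # Observed Guttman errors: easier item negative, harder item positive.
        actual += sum(1 for row in data if row[easy] == 0 and row[hard] == 1)
        random_expected += n * (1 - p[easy]) * p[hard]

    if random_expected == 0:
        raise ValueError("H is undefined: no errors are possible for these items")
    return (random_expected - actual) / random_expected


perfect = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(loevinger_h(perfect))  # 1.0, because there are no Guttman errors
```

With a perfect Guttman pattern the statistic is 1; when two items are unrelated the observed and expected error counts coincide and it falls to 0.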
7 Criteria for Loevinger's H in a good scale (Mokken)
- The usually accepted (but somewhat arbitrary) criterion is that H should be greater than 0.3 for each item and greater than 0.3 for the scale as a whole.
- If H > 0.5: a strong Mokken scale
- If H > 0.4 and < 0.5: a moderate scale
- If H > 0.3 and < 0.4: a weak scale
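These rules of thumb translate into a small lookup. The sketch below is only illustrative: the slide leaves the boundary values 0.3, 0.4 and 0.5 unassigned, so treating each boundary as belonging to the higher category is an assumption, as are the function name `mokken_strength` and the label "below criterion".

```python
def mokken_strength(h: float) -> str:
    """Label a scale-level H value using the conventional Mokken cut-offs.

    Assumption: a value exactly on a cut-off (0.3, 0.4, 0.5) is placed
    in the higher category.
    """
    if h >= 0.5:
        return "strong"
    if h >= 0.4:
        return "moderate"
    if h >= 0.3:
        return "weak"
    return "below criterion"


for value in (0.55, 0.45, 0.35, 0.20):
    print(value, mokken_strength(value))
```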
8 Mokken
Mokken, R. J. (1971) A theory and procedure of scale analysis. De Gruyter: Netherlands.
- Criteria for a good scale are based on traces.
- A trace is a plot of the probability of agreement with an item against the total score (number correct).
- The probability of a positive response to an item should increase monotonically as the latent trait increases.
- Double monotonicity should hold, i.e. the trace lines of items in a Mokken scale should not intersect.
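A trace of this kind can be tabulated directly from 0/1 response data. The sketch below is a plain-Python illustration, not the Stata traces routine: it groups respondents by total score (including the item itself, which is an assumption) and reports the proportion answering the item positively in each group. The names `trace` and `is_monotone` are hypothetical.

```python
from collections import defaultdict


def trace(data, item):
    """Return {total_score: proportion positive on `item`}, sorted by score.

    data: one list of 0/1 responses per respondent.
    Assumption: the total score includes the item being traced.
    """
    counts = defaultdict(lambda: [0, 0])  # score -> [positives, respondents]
    for row in data:
        score = sum(row)
        counts[score][0] += row[item]
        counts[score][1] += 1
    return {score: pos / n for score, (pos, n) in sorted(counts.items())}


def is_monotone(trace_points):
    """True if the proportion never decreases as the total score rises."""
    values = list(trace_points.values())
    return all(a <= b for a, b in zip(values, values[1:]))


responses = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
easy_item = trace(responses, 0)
print(easy_item, is_monotone(easy_item))
```

Plotting the returned proportions against the scores gives the trace line; comparing the traces of two items point by point is one informal way to look for intersections.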
9 Items in Short GRIMS
- My partner is sensitive to and aware of my needs. (P)
- My partner doesn't listen to me any more. (N)
- I'm sometimes lonely when I'm with my partner. (N)
- Our relationship is full of joy and excitement. (P)
- I wish there was more warmth and affection between us. (N)
- I suspect we are on the brink of separation. (N)
- We can make up quickly after an argument. (P)
10 Run Stata
- Item analysis using the Mokken procedure in Stata
- Stata is available on the Public Workstation Facilities (PWFs), e.g. in the Cathy Marsh Room.
- It contains the routines traces, LoevH and msp.
- First import the data into Stata from SPSS.
11 Example of a trace
[Figure: trace of the item BItem6 as a function of the score. x-axis: total score; y-axis: rate of positive response.]
12 Linear prediction
- What do we need to know about an item in order to predict the probability of a person of known ability getting the item right?
- Linear prediction: y = α + βx
13 Linear prediction in scattergram
14 True/False (binary) data
- A classical scattergram doesn't show much.
15 Item Response Theory (IRT)
- Arose from the need to link the behaviour of binary items to the scale non-linearly.
- Devised independently by Georg Rasch and by Lord, Novick and Birnbaum.
- Plots the probability of getting an item right (for each item) against a latent trait of ability (or personality), now called θ (theta).
16 Item Characteristic Curve (ICC)
17 IRT Models
- Predicting probability from ability
- One-parameter (Rasch) model
- Two-parameter model
- Three-parameter model
18 Example 1
19 Example 2
20 Example 3
21 Example 4
22 Difficulty Parameter
23 Discrimination Parameter
24 Guessing Parameter
25 Item Characteristic Curve
$$P(\theta) = c + \frac{1 - c}{1 + e^{-a(\theta - b)}}$$
where
- θ = ability parameter (a person's ability)
- P(θ) = probability of a correct response given θ
- a = discrimination parameter
- b = difficulty parameter
- c = guessing parameter
- e = growth constant (the base of natural logarithms, approximately 2.718)
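The item characteristic curve can be evaluated numerically with a few lines of code. The sketch below implements the three-parameter formula from this slide; setting c = 0 gives the two-parameter model, and additionally fixing a to a common value gives the one-parameter (Rasch-type) model. The function name `icc_3pl` and the parameter values in the usage lines are illustrative assumptions, not estimates from any data set.

```python
import math


def icc_3pl(theta: float, a: float, b: float, c: float = 0.0) -> float:
    """Probability of a correct response under the three-parameter model.

    theta: ability, a: discrimination, b: difficulty, c: guessing.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item: a = 1.2, b = 0.5, c = 0.2.
for theta in (-2.0, 0.5, 3.0):
    print(theta, round(icc_3pl(theta, a=1.2, b=0.5, c=0.2), 3))
```

At θ = b the curve sits halfway between the guessing level c and 1, which is why b is read as the difficulty of the item.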
26 Run Mplus
- Item analysis using Confirmatory Factor Analysis in Mplus
- Mplus is available on the Public Workstation Facilities (PWFs), e.g. in the Cathy Marsh Room.
- It is suitable for modelling binary and ordinal as well as continuous data.
- Download the demo version (6 items only) from Statmodel.com.
27 IRT in Mplus
- Show modelling with exploratory factor analysis (EFA)
- Repeat with Confirmatory Factor Analysis (CFA)
- Note that CFA with binary data is IRT
- Show ICC output with plots
- Demonstrate the information function (reliability differs at different points of the scale)
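The idea that reliability differs along the scale can be illustrated with the item information function. The sketch below is not Mplus output: it assumes a two-parameter logistic model, for which the information of an item at ability θ is a²·P(θ)·(1 − P(θ)), sums the item informations to get the information of the scale, and converts that to a standard error of measurement as 1/√information. All function names and item parameters are hypothetical.

```python
import math


def p_2pl(theta, a, b):
    """Two-parameter logistic probability of a positive response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Information of one 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


def scale_information(theta, items):
    """Sum of the item informations; items is a list of (a, b) pairs."""
    return sum(item_information(theta, a, b) for a, b in items)


def standard_error(theta, items):
    """Standard error of measurement at theta: 1 / sqrt(information)."""
    return 1.0 / math.sqrt(scale_information(theta, items))


# Hypothetical (discrimination, difficulty) pairs for three items.
items = [(1.0, -1.0), (1.5, 0.0), (2.0, 1.0)]
for theta in (-3.0, 0.0, 3.0):
    print(theta, round(scale_information(theta, items), 3),
          round(standard_error(theta, items), 3))
```

Information is highest, and the standard error lowest, near the difficulties of the most discriminating items, so precision is not constant across the range of θ.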
28 Rehash of classical item analysis
- Take a set of items
- Find their difficulty values
- Find their discrimination indices
- Eliminate items that don't meet certain criteria:
  - Extreme values
  - Poor correlation with the scale itself
- Maximize internal consistency
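The classical steps listed above can be sketched in plain Python. The choices here are assumptions made for illustration: difficulty is taken as the proportion of positive responses, discrimination as the correlation between an item and the sum of the remaining items (a corrected item-total correlation), and internal consistency as Cronbach's alpha. The function names are hypothetical, and an item on which every respondent gives the same answer would need to be handled separately because its correlation is undefined.

```python
from statistics import pvariance


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def item_analysis(data):
    """Per-item difficulty and corrected item-total correlation."""
    k = len(data[0])
    totals = [sum(row) for row in data]
    results = []
    for i in range(k):
        item = [row[i] for row in data]
        rest = [t - x for t, x in zip(totals, item)]  # total without item i
        results.append({
            "difficulty": sum(item) / len(item),
            "discrimination": pearson(item, rest),
        })
    return results


def cronbach_alpha(data):
    """Cronbach's alpha from item variances and total-score variance."""
    k = len(data[0])
    item_vars = sum(pvariance([row[i] for row in data]) for i in range(k))
    total_var = pvariance([sum(row) for row in data])
    return k / (k - 1) * (1 - item_vars / total_var)


responses = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
for stats in item_analysis(responses):
    print(stats)
print(cronbach_alpha(responses))
```

Items with extreme difficulty values or low discrimination would be the candidates for removal, after which alpha can be recomputed on the reduced set.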
29 Comparison of IRT with CTT in test construction
- IRT searches for hierarchies rather than correlations.
- In IRT, person scores and item difficulties are plotted on the same scale (theta).
- IRT makes use of item thresholds to incorporate item difficulty into the score (important in test equating).
- IRT does not assume linearity, hence IRT works with binary or ordinal data.
- Some IRT models (e.g. Rasch, Mokken) require double monotonicity. In CTT, item discrimination indices only have to be above certain criteria.
30 Advantages of IRT
- The information function allows tests to be optimised at thresholds
- Reliability can be more accurately related to test score
- Individual reliabilities for each person
- Basis for test equating and adaptive testing
31 Next week: Measuring intelligence
- What is intelligence?
- IQ testing
- Controversies in intelligence testing
- Eugenics
- Multiple intelligences
- The Flynn Effect
- Intelligence testing today
International Journal of Innovative Management, Information & Production ISME Internationalc2010 ISSN 2185-5439 Volume 1, Number 1, December 2010 PP. 81-89 A COMPARISON OF BAYESIAN MCMC AND MARGINAL MAXIMUM
More informationCentre for Education Research and Policy
THE EFFECT OF SAMPLE SIZE ON ITEM PARAMETER ESTIMATION FOR THE PARTIAL CREDIT MODEL ABSTRACT Item Response Theory (IRT) models have been widely used to analyse test data and develop IRT-based tests. An
More informationINTERPRETING IRT PARAMETERS: PUTTING PSYCHOLOGICAL MEAT ON THE PSYCHOMETRIC BONE
The University of British Columbia Edgeworth Laboratory for Quantitative Educational & Behavioural Science INTERPRETING IRT PARAMETERS: PUTTING PSYCHOLOGICAL MEAT ON THE PSYCHOMETRIC BONE Anita M. Hubley,
More informationResearch Brief Reliability of the Static Risk Offender Need Guide for Recidivism (STRONG-R)
Research Brief Reliability of the Static Risk Offender Need Guide for Recidivism (STRONG-R) Xiaohan Mei, M.A. Zachary Hamilton, Ph.D. Washington State University 1 Reliability/Internal Consistency of STRONG-R
More informationTest Scaling and Value-Added Measurement
Test Scaling and Value-Added Measurement Dale Ballou Working Paper 2008-23 December 2008 LED BY IN COOPERATION WITH: The NaTioNal CeNTer on PerformaNCe incentives (NCPI) is charged by the federal government
More informationChapter 1 Introduction. Measurement Theory. broadest sense and not, as it is sometimes used, as a proxy for deterministic models.
Ostini & Nering - Chapter 1 - Page 1 POLYTOMOUS ITEM RESPONSE THEORY MODELS Chapter 1 Introduction Measurement Theory Mathematical models have been found to be very useful tools in the process of human
More informationAnalyzing Psychopathology Items: A Case for Nonparametric Item Response Theory Modeling
Psychological Methods 2004, Vol. 9, No. 3, 354 368 Copyright 2004 by the American Psychological Association 1082-989X/04/$12.00 DOI: 10.1037/1082-989X.9.3.354 Analyzing Psychopathology Items: A Case for
More informationDuring the past century, mathematics
An Evaluation of Mathematics Competitions Using Item Response Theory Jim Gleason During the past century, mathematics competitions have become part of the landscape in mathematics education. The first
More informationItem Response Theory: Methods for the Analysis of Discrete Survey Response Data
Item Response Theory: Methods for the Analysis of Discrete Survey Response Data ICPSR Summer Workshop at the University of Michigan June 29, 2015 July 3, 2015 Presented by: Dr. Jonathan Templin Department
More information