Selection of Linking Items
- Myles Dixon
- 5 years ago
Transcription
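1 Selection of Linking Items
- Select the subset of items that maximally reflects the scale information function.
- Denote the scale information as [symbol lost in transcription].
- Solver: linear programming (lp_solve 5.5, called from R).
- Objective: min(y), subject to constraints that hold for all θs, where $\theta \in \{-4, -3.95, \ldots, 3.95, 4\}$, the item selection indicators take values in $\{0, 1\}$, and $y \ge 0$.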
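One common way to write a minimax selection problem of this kind is sketched below. The slide's own constraints did not survive transcription, so this is an assumed formulation rather than the one actually used: $x_j$ (selection indicator for item $j$), $I_j(\theta_k)$ (item information at grid point $\theta_k$), $T(\theta_k)$ (scale information), and $n$ (size of the linking set) are names introduced here for illustration.

```latex
\begin{aligned}
\min_{x,\,y}\quad & y \\
\text{subject to}\quad
  & T(\theta_k) - \sum_{j=1}^{J} I_j(\theta_k)\,x_j \le y
    \quad \text{for all } \theta_k \in \{-4, -3.95, \ldots, 3.95, 4\}, \\
  & \sum_{j=1}^{J} x_j = n, \qquad x_j \in \{0, 1\}, \qquad y \ge 0 .
\end{aligned}
```

Under this reading, $y$ bounds the largest shortfall of the linking set's information from the scale information anywhere on the θ grid, so minimizing $y$ favors subsets that track the scale information function over the whole range.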
2 An example: Subscale 2 [Figure: sum of information functions for 6-, 7-, and 8-item linking sets]
3 An example: Subscale 3
4 Why is Fisher information useful?
- In multidimensional CAT, the volume of the confidence ellipsoid around θ̂ is proportional to the square root of the determinant of the inverse Fisher information matrix (Anderson, 1984).
- Shrinking the ellipsoid therefore means maximizing the determinant of the Fisher information matrix (Segall, 1996; Wang & Chang, 2011).
- This is the D-optimal method.
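A minimal sketch of D-optimal item selection may make the criterion concrete. It assumes a multidimensional 2PL (M2PL) item model for simplicity, whereas the talk works with the MGRM; the function names and the parameter arrays `A` (discriminations) and `D` (intercepts) are hypothetical.

```python
import numpy as np

def m2pl_item_information(theta, a, d):
    """Fisher information matrix of one M2PL item at theta: P(1 - P) * a a^T."""
    p = 1.0 / (1.0 + np.exp(-(a @ theta + d)))
    return p * (1.0 - p) * np.outer(a, a)

def select_d_optimal(theta_hat, A, D, test_info, administered):
    """Return the index of the unused item that maximizes det(test_info + item_info)."""
    best_item, best_det = None, -np.inf
    for j in range(A.shape[0]):
        if j in administered:
            continue  # each item is used at most once
        candidate = test_info + m2pl_item_information(theta_hat, A[j], D[j])
        det = np.linalg.det(candidate)
        if det > best_det:
            best_item, best_det = j, det
    return best_item

# Hypothetical two-dimensional bank: dimension 1 is already measured well,
# so the item loading on dimension 2 gives the largest determinant.
A = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 0.0]])
D = np.zeros(3)
current_info = np.diag([10.0, 1.0])
print(select_d_optimal(np.zeros(2), A, D, current_info, administered=set()))
```

Starting `test_info` from a prior precision matrix rather than zeros keeps the determinant nonzero before every dimension has been measured, which is one common design choice with a MAP estimator.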
5 Fisher information vs. confidence ellipse [Figure: confidence ellipse for θ with covariance Σ] (Wang et al., 2013)
7 Minimax mechanism
- Assuming there are three dimensions, the determinant can be expanded term by term [expansion lost in transcription].
- This criterion tends to pick the items that reduce the variance of the trait estimate that lags furthest behind.
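One way to see why the determinant criterion behaves this way is the matrix determinant lemma. The sketch below is not the slide's own expansion; it assumes the item information has the rank-one form $c_j\,a_j a_j^{\top}$ (as it does when the response probabilities depend on θ only through $a_j^{\top}\theta$), and the symbols $I_n$, $c_j$, and $a_j$ are introduced here for illustration.

```latex
\det\!\bigl(I_n + c_j\,a_j a_j^{\top}\bigr)
  = \det(I_n)\,\bigl(1 + c_j\,a_j^{\top} I_n^{-1} a_j\bigr),
\qquad
a_j^{\top} I_n^{-1} a_j = a_{jp}^{2}\,\bigl[I_n^{-1}\bigr]_{pp}
\quad \text{if item } j \text{ loads only on trait } p .
```

Because $[I_n^{-1}]_{pp}$ approximates the sampling variance of $\hat{\theta}_p$, items that load on the trait with the largest current variance receive the largest determinant gain, other things being equal.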
8 Item bank information
9 Domain/content balancing
- Constraint-weighted D-optimal (Wang et al., 2017).
- Suppose that for each domain k = 1, ..., d, the maximum and minimum numbers of items are set in advance.
- The weights use the number of items administered from domain k so far, the current test length n, and the maximum test length.
- An indicator records whether item j belongs to domain k (Cheng et al., 2009).
- [Two weight formulas lost in transcription]
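The weighting idea can be illustrated with a priority-index style sketch. The exact weights used by Wang et al. (2017) are not recoverable from the transcription, so the two-phase formula below (fill the lower bounds first, then respect the upper bounds) is an assumption, and every name in the code is hypothetical.

```python
import numpy as np

def domain_weights(counts, lower, upper):
    """Hypothetical two-phase weights; lower and upper bounds must be positive."""
    counts, lower, upper = (np.asarray(v, dtype=float) for v in (counts, lower, upper))
    if np.any(counts < lower):
        # Phase 1: remaining fraction of each lower bound (0 once the bound is met).
        return np.clip((lower - counts) / lower, 0.0, None)
    # Phase 2: remaining fraction of each upper bound (0 once the domain is full).
    return np.clip((upper - counts) / upper, 0.0, None)

def constraint_weighted_criterion(det_values, item_domains, counts, lower, upper):
    """Multiply each item's D-optimal value by the weight of the domain it belongs to."""
    w = domain_weights(counts, lower, upper)
    return np.asarray(det_values, dtype=float) * w[np.asarray(item_domains)]

# Three candidate items, one per domain; domain 0 has already met its lower bound.
scores = constraint_weighted_criterion(
    det_values=[10.0, 3.0, 4.0], item_domains=[0, 1, 2],
    counts=[2, 0, 1], lower=[2, 2, 2], upper=[5, 5, 5],
)
print(int(np.argmax(scores)))
```

In this toy case the item with the largest raw determinant is down-weighted to zero because its domain needs no more items yet, which is the balancing behavior the slide describes.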
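10 A simulation study
- Sample size N = 2,000.
- θ is drawn from a multivariate normal distribution with a mean vector of 0s and covariance matrix Σ [values lost in transcription].
- Maximum a posteriori (MAP) estimation is used; the prior is multivariate normal with a mean vector of 0s and [covariance lost in transcription].
- Evaluation criterion: root mean squared error (RMSE), shown here for the first dimension:
  $\mathrm{RMSE}(\hat{\theta}_1) = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\bigl(\hat{\theta}_{i1} - \theta_{i1}\bigr)^2}$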
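The RMSE criterion is simple to compute per domain. The sketch below is illustrative only: the identity covariance matrix and the noise level stand in for the simulation settings that were lost in transcription, and `rmse_per_domain` is a hypothetical helper name.

```python
import numpy as np

def rmse_per_domain(theta_hat, theta_true):
    """Root mean squared error of the estimates, computed separately for each dimension."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    theta_true = np.asarray(theta_true, dtype=float)
    return np.sqrt(np.mean((theta_hat - theta_true) ** 2, axis=0))

# Stand-in simulation: N = 2,000 simulees, three domains, identity covariance (assumed).
rng = np.random.default_rng(0)
theta_true = rng.multivariate_normal(np.zeros(3), np.eye(3), size=2000)
theta_hat = theta_true + rng.normal(scale=0.3, size=theta_true.shape)  # fake estimates
print(rmse_per_domain(theta_hat, theta_true))
```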
11 Results: domain-level recovery, D-optimal vs. random selection [Figure]
12 Results: domain-level recovery, D-optimal vs. constraint-weighted D-optimal [Figure]
13 Results: domain-level recovery, D-optimal vs. constraint-weighted D-optimal [Figure]
14 Reducing test length
15 [Figure: confidence interval for θ against test length, θ = (0, 0, 0)]
16 [Figure: confidence interval for θ against test length, θ = (2, 2, 2)]
17 Variable-length CAT: stopping rule
- Flowchart: Start, then draw from the 300+ items.
- Stop when the measurement precision criterion is satisfied (Dodd, Koch, & De Ayala, 1993; Boyd, Dodd, & Choi, 2010). Candidate criteria:
  (a) Volume of the confidence ellipsoid (D rule)
  (b) Sum of the standard errors of θ across domains
  (c) Maximum axis of the confidence ellipsoid
  (d) Kullback-Leibler divergence between two consecutive posteriors (Wang et al., 2013)
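The four precision summaries can be computed from a covariance matrix of the θ estimate. This is a sketch under stated assumptions, not the authors' implementation: the posterior is treated as Gaussian so that the Kullback-Leibler divergence has a closed form, and both function names are hypothetical.

```python
import numpy as np

def precision_summaries(cov):
    """Summaries (a)-(c) of a covariance matrix for the theta estimate."""
    cov = np.asarray(cov, dtype=float)
    eigvals = np.linalg.eigvalsh(cov)  # cov is symmetric
    return {
        "d_rule": float(np.sqrt(np.linalg.det(cov))),    # proportional to ellipsoid volume
        "sum_se": float(np.sum(np.sqrt(np.diag(cov)))),  # sum of per-domain standard errors
        "max_axis": float(np.sqrt(eigvals.max())),       # proportional to the longest semi-axis
    }

def gaussian_kl(mu0, cov0, mu1, cov1):
    """KL divergence from N(mu0, cov0) to N(mu1, cov1), used here for summary (d)."""
    mu0, mu1 = np.asarray(mu0, dtype=float), np.asarray(mu1, dtype=float)
    cov0, cov1 = np.asarray(cov0, dtype=float), np.asarray(cov1, dtype=float)
    inv1 = np.linalg.inv(cov1)
    diff = mu1 - mu0
    logdet0 = np.linalg.slogdet(cov0)[1]
    logdet1 = np.linalg.slogdet(cov1)[1]
    return 0.5 * (np.trace(inv1 @ cov0) + diff @ inv1 @ diff - len(mu0) + logdet1 - logdet0)

print(precision_summaries(np.diag([0.04, 0.09])))
```

Each summary shrinks as items accumulate, so a stopping rule compares it with a preset threshold; the thresholds themselves are not given in the transcription.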
20 Cumulated information growth [Figure: determinant of the Fisher information matrix against test length]
23 Stopping rule
- Stop when θ̂ no longer changes much between consecutive items: the theta convergence rule (T rule), with a threshold of 0.01 (Babcock & Weiss, 2012; Wang et al., 2017+).
24 Why is the T rule secondary?
- For the 2PL, the change in θ̂ from one item to the next lies in a bounded interval [bounds lost in transcription] (Chang & Ying, 2008).
- That change does not decrease monotonically as test length increases, so the T rule can terminate the test prematurely (Wang et al., 2017+).
- The T rule can also undermine test efficiency. The usual precision target is SE(θ̂) < .2 (Dodd et al., 1993), which corresponds to test information above 25. If a value of 1 is assumed hypothetically for [symbol lost in transcription], satisfying the < .01 criterion corresponds to a value of 50 [quantity lost in transcription] (Wang et al., 2017+).
27 MGRM: simple structure [Case-by-case expressions for the response categories; equations lost in transcription] (Wang et al., 2017+)
29 MGRM: complex structure
- If item j measures the pth trait: [expression lost in transcription], involving the pth element of [vector lost in transcription] and the amount of information carried by item j.
- If item j measures multiple traits: [expression lost in transcription].
(Wang et al., 2017+)
33 Primary vs. secondary stopping rules
- Flowchart: Start, draw from the 300+ items, and administer at least the minimum test length.
- Is the D rule satisfied? If yes, stop.
- If no, is the T rule satisfied? If yes, stop. If no, continue, up to the maximum test length.
- Figures shown on the flowchart: 94.9% and 28.5 beside the D-rule branch; 5.1% and 61.5 beside the T-rule branch.
(Babcock & Weiss, 2012; Wang et al., 2017+)
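The flowchart logic could be sketched as a single decision function. This is a minimal illustration under assumptions: `d_threshold` is a hypothetical cutoff for the D rule (the talk's value is not in the transcription), the T rule is applied to the largest absolute change across dimensions, and the 0.01 default follows the T-rule slide.

```python
import numpy as np

def should_stop(n_items, cov, theta_hat, theta_prev, *,
                min_len, max_len, d_threshold, t_threshold=0.01):
    """Return (stop, reason) following the primary/secondary rule flowchart."""
    if n_items < min_len:
        return False, "continue"              # minimum test length not reached yet
    if np.sqrt(np.linalg.det(cov)) < d_threshold:
        return True, "D rule"                 # primary rule: ellipsoid is small enough
    change = np.max(np.abs(np.asarray(theta_hat) - np.asarray(theta_prev)))
    if change < t_threshold:
        return True, "T rule"                 # secondary rule: estimate has stopped moving
    if n_items >= max_len:
        return True, "max length"
    return False, "continue"

# Example call with made-up values for a three-dimensional test.
print(should_stop(20, np.eye(3) * 0.25, np.array([0.1, 0.2, 0.3]),
                  np.array([0.6, 0.2, 0.3]),
                  min_len=10, max_len=60, d_threshold=0.01))
```

Checking the D rule before the T rule keeps precision as the primary criterion, so the T rule only ends tests whose estimates have stalled without reaching the precision target.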
38 Stopping rule results [Figure: SE against θ for Applied Cognition, Daily Activity, and Mobility]
39 3D plot
40 Stopping rule (cont.)
- [Table; column headers: Test length, Overall precision, Primary stop; Mean, SD, Bias, RMSE, Determinant, Actual, Eventual, %. Values lost in transcription.]
- [Second table; column headers: Test length, Bias, RMSE, each at Stop and at End; Mean, SD; rows labelled N=. Values lost in transcription.]
42 Outline
- Brief introduction to computerized adaptive testing (CAT)
- Multidimensional CAT
- Computerized Adaptive Testing to Direct Delivery of Hospital-Based Rehabilitation (NIH R01HD079439)
  - Item bank calibration
  - Item selection
  - Stopping rules
- Ongoing projects
43 Project I: Classification
Table 2. AM-PAC color-coded stages with the corresponding FIM score and FIM stage
- Independent (Green): High = FIM 7, Independent; Low = FIM 6, Modified independent
- Supervision/Contact Guard (Yellow): High = FIM 5, Supervision; Low = FIM 4, Contact guard
- Assistance (Orange): High = FIM 2-3, Min/Mod Assist; Low = FIM 1, Max Assist
- Dependent (Red): FIM 0, Dependent
44 Project I: Classification
- Multidimensional CAT plus post hoc classification, or a multidimensional classification CAT?
45 Project II: Incorporating response time (Fan, Wang, et al., 2012; Wang et al., 2013a, 2013b; Wang & Xu, 2015)
- Exploratory data analysis (analysis per batch first).
- [Figure: histogram of batch 1 response times over all person-item combinations (SD = 21.28, skew = 41.84). The red line marks the 97.5th percentile (25.85).]
- [Figure: the same distribution after cutting the upper 2.5% of the data (SD = 4.27, skew = 1.23).]
- [Figure: the distribution after log transformation.]
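The trimming and transformation steps described on these slides can be sketched as follows. The simulated response times are stand-ins, not the study data, and the helper names are hypothetical.

```python
import numpy as np

def skewness(x):
    """Moment-based sample skewness: third central moment over variance^1.5."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    return float(np.mean(centered ** 3) / np.mean(centered ** 2) ** 1.5)

def trim_and_log(rt, upper_pct=97.5):
    """Drop response times above the given percentile, then log-transform the rest."""
    rt = np.asarray(rt, dtype=float)
    cutoff = np.percentile(rt, upper_pct)
    kept = rt[rt <= cutoff]
    return cutoff, kept, np.log(kept)

# Stand-in data: right-skewed response times in seconds.
rng = np.random.default_rng(1)
rt = rng.lognormal(mean=1.5, sigma=1.0, size=5000)
cutoff, kept, log_rt = trim_and_log(rt)
print(skewness(rt), skewness(kept), skewness(log_rt))
```

Trimming removes the extreme right tail, and the log transformation then brings the remaining distribution closer to symmetric, which suits lognormal response time models.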
48 Project II: Incorporating response time (Fan, Wang, et al., 2012; Wang et al., 2013a, 2013b; Wang & Xu, 2015)
- A hierarchical response time model (van der Linden, 2007).
- [Diagram: population-level parameters (μ, σ) at the top level; person parameters θ and τ and item parameters, including φ and λ, at the lower level.]
49 Four different models (EM algorithm)
(1) Following Molenaar et al. (2015), van der Linden's (2007) joint model can be reparameterized as an MGRM, with a correlation between [symbols lost in transcription].
(2) Including interviewers as covariates, with interviewer effects that differ across items.
(3) Including interviewers as covariates, with interviewer effects that differ across items by the same proportion.
(4) Including interviewers as fixed covariates.
51 Model 1, Model 2, Model 3, Model 4 [equations lost in transcription]
52 Model comparison and results
- [Table: equation, number of free parameters, AIC, and BIC for each of the four batches; values lost in transcription.]
- Model 3 results (batch 1): estimates of [parameter lost in transcription] are 0.591 and [value lost in transcription].
- Compared with the MGRM alone, adding response time results in higher item discrimination parameter estimates and smaller standard errors.
53 Concurrent calibration across 4 batches
- Adding response time information did not significantly affect the item parameter estimates or their standard errors.
- Adding response time information helped reduce the standard errors of patients' multidimensional latent trait estimates, but adding interviewer as a covariate did not yield further improvement.
54 Next steps II: Incorporating response time (Fan, Wang, et al., 2012; Wang et al., 2013a, 2013b; Wang & Xu, 2015)
- A hierarchical response time model (van der Linden, 2007).
- Maximize item information per time unit: maximize [expression lost in transcription].
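A small sketch of an information-per-time-unit criterion follows. It is an assumption-laden illustration, not the criterion on the slide: it uses a unidimensional 2PL for the information and a lognormal response time model in which log time has mean `beta - tau` and standard deviation `1 / alpha`, and the parameter names are chosen here rather than taken from the talk.

```python
import numpy as np

def info_per_time(theta, tau, a, b, beta, alpha):
    """2PL item information divided by expected response time under a lognormal model."""
    a, b, beta, alpha = (np.asarray(v, dtype=float) for v in (a, b, beta, alpha))
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    info = a ** 2 * p * (1.0 - p)
    # Mean of a lognormal with log-mean (beta - tau) and log-sd 1/alpha.
    expected_time = np.exp(beta - tau + 1.0 / (2.0 * alpha ** 2))
    return info / expected_time

# Two equally informative items; the second takes twice as long on average.
values = info_per_time(theta=0.0, tau=0.0, a=[1.0, 1.0], b=[0.0, 0.0],
                       beta=[0.0, np.log(2.0)], alpha=[1.0, 1.0])
print(int(np.argmax(values)))
```

Dividing by expected time means a slower item must carry proportionally more information to be selected, which is the trade-off the slide's criterion targets.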
55 Next steps III: DIF CAT (Wang, Weiss, & Wang, 2017)
- Three factors to consider:
  - Gender (male/female)
  - Education (college+/high school and below)
  - Age (<65/65-90)
56 Example DIF items
- Gender: "How much difficulty do you currently have making decisions, such as what clothes you want to wear?" (Applied Cognition), consistent with the expert hypothesis.
- Age: "How much difficulty do you currently have removing a plastic lid from a hot beverage cup?" (Daily Activity)
- Age: "How much difficulty do you currently have climbing stairs step over step without a handrail?" (Mobility)
57 How to deal with DIF in a CAT design?
- Items with extreme DIF: delete?
- Items with small DIF: keep?
- Doubly adaptive CAT: use subgroup information to improve measurement precision (Wang et al., 2017).
  - Allow DIF items to have different parameters per subgroup.
  - Constraint-weighted D-optimal.
58 Project IV: Adaptive measure of change (Wang & Weiss, 2017; Wang, 2014)
- Specifying the MCAT to efficiently detect meaningful clinical change.
59 Study I
60 Project IV: Adaptive measure of change (Wang & Weiss, 2017; Wang, 2014) [Figure: θ at Time 1 and Time 2]
61 Project IV: Adaptive measure of change (Wang & Weiss, 2017; Wang, 2014)
- Item selection: select the item that best differentiates the null hypothesis (no individual change) from the alternative hypothesis, by maximizing the item-level Kullback-Leibler index KL_j between the Time 2 estimate θ̂_i2 and the pooled estimate θ̂_i^pooled [subscript garbled in transcription].
- Stopping rule: sequential hypothesis testing?
- [Figure: θ at Time 1 and Time 2]
62 Implementation
- Algorithms
- Web-based delivery
- Data collection with MCAT
- Monitor item usage and routinely recalibrate item parameters if needed (Chen & Wang, 2016)
63 My collaborators and team
- Dr. David Weiss, University of Minnesota
- Dr. Andrea Cheville, Mayo Clinic
- Research assistants: Zhuoran Shang, Shiyang Su
More informationLikelihood Ratio Based Computerized Classification Testing. Nathan A. Thompson. Assessment Systems Corporation & University of Cincinnati.
Likelihood Ratio Based Computerized Classification Testing Nathan A. Thompson Assessment Systems Corporation & University of Cincinnati Shungwon Ro Kenexa Abstract An efficient method for making decisions
More informationScaling TOWES and Linking to IALS
Scaling TOWES and Linking to IALS Kentaro Yamamoto and Irwin Kirsch March, 2002 In 2000, the Organization for Economic Cooperation and Development (OECD) along with Statistics Canada released Literacy
More informationClinical trials with incomplete daily diary data
Clinical trials with incomplete daily diary data N. Thomas 1, O. Harel 2, and R. Little 3 1 Pfizer Inc 2 University of Connecticut 3 University of Michigan BASS, 2015 Thomas, Harel, Little (Pfizer) Clinical
More informationIDENTIFYING DATA CONDITIONS TO ENHANCE SUBSCALE SCORE ACCURACY BASED ON VARIOUS PSYCHOMETRIC MODELS
IDENTIFYING DATA CONDITIONS TO ENHANCE SUBSCALE SCORE ACCURACY BASED ON VARIOUS PSYCHOMETRIC MODELS A Dissertation Presented to The Academic Faculty by HeaWon Jun In Partial Fulfillment of the Requirements
More informationThe Use of Unidimensional Parameter Estimates of Multidimensional Items in Adaptive Testing
The Use of Unidimensional Parameter Estimates of Multidimensional Items in Adaptive Testing Terry A. Ackerman University of Illinois This study investigated the effect of using multidimensional items in
More informationEmpowered by Psychometrics The Fundamentals of Psychometrics. Jim Wollack University of Wisconsin Madison
Empowered by Psychometrics The Fundamentals of Psychometrics Jim Wollack University of Wisconsin Madison Psycho-what? Psychometrics is the field of study concerned with the measurement of mental and psychological
More information10CS664: PATTERN RECOGNITION QUESTION BANK
10CS664: PATTERN RECOGNITION QUESTION BANK Assignments would be handed out in class as well as posted on the class blog for the course. Please solve the problems in the exercises of the prescribed text
More informationIdentification of Tissue Independent Cancer Driver Genes
Identification of Tissue Independent Cancer Driver Genes Alexandros Manolakos, Idoia Ochoa, Kartik Venkat Supervisor: Olivier Gevaert Abstract Identification of genomic patterns in tumors is an important
More informationPHYSICAL FUNCTION A brief guide to the PROMIS Physical Function instruments:
PROMIS Bank v1.0 - Physical Function* PROMIS Short Form v1.0 Physical Function 4a* PROMIS Short Form v1.0-physical Function 6a* PROMIS Short Form v1.0-physical Function 8a* PROMIS Short Form v1.0 Physical
More informationPACKER: An Exemplar Model of Category Generation
PCKER: n Exemplar Model of Category Generation Nolan Conaway (nconaway@wisc.edu) Joseph L. usterweil (austerweil@wisc.edu) Department of Psychology, 1202 W. Johnson Street Madison, WI 53706 US bstract
More informationinvestigate. educate. inform.
investigate. educate. inform. Research Design What drives your research design? The battle between Qualitative and Quantitative is over Think before you leap What SHOULD drive your research design. Advanced
More informationScoring Multiple Choice Items: A Comparison of IRT and Classical Polytomous and Dichotomous Methods
James Madison University JMU Scholarly Commons Department of Graduate Psychology - Faculty Scholarship Department of Graduate Psychology 3-008 Scoring Multiple Choice Items: A Comparison of IRT and Classical
More informationPAIN INTERFERENCE. ADULT ADULT CANCER PEDIATRIC PARENT PROXY PROMIS-Ca Bank v1.1 Pain Interference PROMIS-Ca Bank v1.0 Pain Interference*
PROMIS Item Bank v1.1 Pain Interference PROMIS Item Bank v1.0 Pain Interference* PROMIS Short Form v1.0 Pain Interference 4a PROMIS Short Form v1.0 Pain Interference 6a PROMIS Short Form v1.0 Pain Interference
More informationCentre for Education Research and Policy
THE EFFECT OF SAMPLE SIZE ON ITEM PARAMETER ESTIMATION FOR THE PARTIAL CREDIT MODEL ABSTRACT Item Response Theory (IRT) models have been widely used to analyse test data and develop IRT-based tests. An
More informationComputerized Mastery Testing
Computerized Mastery Testing With Nonequivalent Testlets Kathleen Sheehan and Charles Lewis Educational Testing Service A procedure for determining the effect of testlet nonequivalence on the operating
More informationAn Introduction to Bayesian Statistics
An Introduction to Bayesian Statistics Robert Weiss Department of Biostatistics UCLA Fielding School of Public Health robweiss@ucla.edu Sept 2015 Robert Weiss (UCLA) An Introduction to Bayesian Statistics
More informationDifferential Item Functioning
Differential Item Functioning Lecture #11 ICPSR Item Response Theory Workshop Lecture #11: 1of 62 Lecture Overview Detection of Differential Item Functioning (DIF) Distinguish Bias from DIF Test vs. Item
More informationItem Response Theory. Author's personal copy. Glossary
Item Response Theory W J van der Linden, CTB/McGraw-Hill, Monterey, CA, USA ã 2010 Elsevier Ltd. All rights reserved. Glossary Ability parameter Parameter in a response model that represents the person
More information