Differential Item Functioning from a Compensatory-Noncompensatory Perspective
1 Differential Item Functioning from a Compensatory-Noncompensatory Perspective Terry Ackerman, Bruce McCollaum, Gilbert Ngerano University of North Carolina at Greensboro
2 Motivation for my Presentation Differential item functioning (DIF) has become a standard analysis in achievement testing. Its purpose is to ensure test impartiality and to identify items that are unfair, favoring one group of examinees over another. Given the high stakes surrounding many educational tests today, DIF analyses have increased in importance. DIF has the greatest potential to occur when a test is multidimensional, that is, when it contains items that measure, to varying degrees, superfluous, invalid skills other than those the test purports to measure.
3 Motivation for my Presentation If items measure invalid skills and examinees differ on those skills, DIF is likely to result. Some examinees will get those items right not because they are competent in the skill or composite of skills being measured by the test, but because they are more able on a nonessential skill being measured by a DIF item.
4 Motivation for my Presentation A slightly strange example: me taking an algebra test written in Turkish. While I should be pretty good at algebra, I know close to nothing about the Turkish language. Thus, no matter how good I am at algebra, it cannot compensate for my not understanding Turkish, and my probability of a correct response on these items would be very low. I would be stuck in a noncompensatory situation.
5 Motivation for my Presentation Note that DIF cannot occur if a test is strictly unidimensional, measuring only one skill, factor, or trait. DIF is thought to occur when a test measures invalid skills and groups of examinees differ in their underlying distributions of ability on those skills. Detecting DIF is relatively easy; the true challenge is determining what caused it!
6 Goal of my Presentation The purpose of my talk today is to explain a new cause of DIF. It is a situation in which a test measures both valid and invalid skills and the groups of examinees have identical underlying ability distributions but use the information presented in an item differently. That is, for some examinees an item is compensatory, while for other examinees (perhaps because of greater exposure to the requisite information, or because of instructional or pedagogical differences) the item is noncompensatory. To explain this new source of DIF, I'm going to back up a bit and give a brief background in multidimensional IRT modeling and DIF analyses.
7 Talk Outline
1. Multidimensional Item Response Theory Models: a) Compensatory b) Noncompensatory
2. Item Representation in Multidimensional Item Response Theory
3. A Multidimensional Perspective on Differential Item Functioning (DIF)
4. DIF: Compensatory Processing versus Noncompensatory Processing
5. Example Applications
6. Conclusion and Future Directions
7. Short Quiz
8 1. Multidimensional IRT Models: Compensatory vs. Noncompensatory
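9 The Two-dimensional Compensatory Model The probability of examinee j correctly responding to item i can be expressed as:
$$P_{ij} = \frac{1.0}{1.0 + e^{-(a_{1i}\theta_{1j} + a_{2i}\theta_{2j} + d_i)}}$$
The model has 2 discrimination parameters ($a_{1i}$, $a_{2i}$), 2 latent abilities ($\theta_{1j}$, $\theta_{2j}$), and 1 difficulty parameter ($d_i$).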
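10 The Two-dimensional Noncompensatory Model The probability of examinee j correctly responding to item i can be expressed as:
$$P_{ij} = \left[\frac{1.0}{1.0 + e^{-1.7\,a_{1i}(\theta_{1j} - b_{1i})}}\right]\left[\frac{1.0}{1.0 + e^{-1.7\,a_{2i}(\theta_{2j} - b_{2i})}}\right]$$
The model has 2 discrimination parameters ($a_{1i}$, $a_{2i}$), 2 latent abilities ($\theta_{1j}$, $\theta_{2j}$), and 2 difficulty parameters ($b_{1i}$, $b_{2i}$).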
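As a rough illustration of the two response functions, the following Python sketch evaluates both models for a single examinee. The function names are hypothetical, and the sketch assumes that the 1.7 scaling constant appears only in the noncompensatory form; treat it as one possible reading of the slides rather than the authors' own code.

```python
import math

def p_compensatory(theta1, theta2, a1, a2, d):
    """Compensatory model: weighted abilities are summed inside one logistic."""
    z = a1 * theta1 + a2 * theta2 + d
    return 1.0 / (1.0 + math.exp(-z))

def p_noncompensatory(theta1, theta2, a1, a2, b1, b2, scale=1.7):
    """Noncompensatory model: product of one logistic term per dimension.

    The scale constant 1.7 is an assumption taken from the slide's equation.
    """
    p1 = 1.0 / (1.0 + math.exp(-scale * a1 * (theta1 - b1)))
    p2 = 1.0 / (1.0 + math.exp(-scale * a2 * (theta2 - b2)))
    return p1 * p2

# An average examinee (theta1 = theta2 = 0) on an item with a1 = a2 = 1
# and every difficulty parameter set to 0.
p_comp = p_compensatory(0.0, 0.0, 1.0, 1.0, 0.0)
p_nonc = p_noncompensatory(0.0, 0.0, 1.0, 1.0, 0.0, 0.0)
print(p_comp, p_nonc)
```

Because the noncompensatory probability is a product, it can never exceed the smaller of its two component probabilities, which is why the same average examinee does worse under that model.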
11 2. Multidimensional IRT: item representation
12 Mathematica Representations
13 Contour Plot of Item Response Surface (a1 = 1.50, a2 = 0.0, d = 0.3; examinees A, B, and C marked) This item discriminates only between levels of θ1. The steeper the surface, the greater the discrimination and the closer the contours.
14 Contour Plot of Item Response Surface (a1 = 0.0, a2 = 0.8, d = 0.3; examinees A, B, and C marked) This item discriminates only between levels of θ2. The flatter the surface, the lower the discrimination and the farther apart the contours.
15 Contour Plot of Item Response Surface (a1 = 1.0, a2 = 1.0, d = 0.3; examinee A: low θ1, high θ2; examinee B: high θ1, low θ2) This item discriminates between levels of an equal composite of θ1 and θ2. Notice that examinees with opposite ability profiles have the same probability of a correct answer (i.e., compensation).
16 Noncompensatory Model Contour Plot of Item Response Surface (a1 = 1.0, a2 = 1.0, b1 = 0.0, b2 = 0.0; examinee A: low θ1, high θ2; examinee B: high θ1, low θ2; examinee C: low θ1, low θ2) No compensation occurs for being high on only one ability.
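The contrast between the last two contour plots can be checked numerically. The sketch below uses the item parameters from those two slides; the ability profiles (±1.5) are hypothetical values chosen only to represent "opposite" examinees, and the 1.7 constant in the noncompensatory form is assumed.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def p_comp(t1, t2, a1=1.0, a2=1.0, d=0.3):
    # Compensatory item with a1 = a2 = 1.0, d = 0.3
    return logistic(a1 * t1 + a2 * t2 + d)

def p_nonc(t1, t2, a1=1.0, a2=1.0, b1=0.0, b2=0.0):
    # Noncompensatory item with a1 = a2 = 1.0, b1 = b2 = 0.0
    return logistic(1.7 * a1 * (t1 - b1)) * logistic(1.7 * a2 * (t2 - b2))

# Hypothetical ability profiles (theta1, theta2)
profiles = {
    "A (low theta1, high theta2)": (-1.5, 1.5),
    "B (high theta1, low theta2)": (1.5, -1.5),
    "balanced (0, 0)": (0.0, 0.0),
}
results = {name: (p_comp(*t), p_nonc(*t)) for name, t in profiles.items()}
for name, (pc, pn) in results.items():
    print(f"{name}: compensatory={pc:.3f}  noncompensatory={pn:.3f}")
```

Under the compensatory item all three profiles lie on the same equiprobability contour, whereas under the noncompensatory item the two lopsided profiles fall well below the balanced one.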
17 Perhaps the best representation of two-dimensional items is the vector method. Each item is represented in the latent ability plane as a vector. All vectors lie on lines that pass through the origin. Vectors can lie only in the first and third quadrants because the a-parameters are constrained to be positive when we estimate them. Vectors representing easy items lie in the third quadrant; those representing difficult items lie in the first quadrant.
18 The length of the vector indicates how well an item can discriminate between levels of skill. This value is called MDISC:
$$\mathrm{MDISC} = \sqrt{a_1^2 + a_2^2}$$
The tail of the vector lies on the p = .5 equiprobability contour. The signed distance from the origin to this contour is denoted D:
$$D = \frac{-d}{\mathrm{MDISC}}$$
The angular direction, α, indicates the composite of ability that the item measures best:
$$\alpha = \cos^{-1}\left(\frac{a_1}{\mathrm{MDISC}}\right)$$
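A short Python sketch of these three quantities may help. It uses the parameters of the item shown in the later contour plot (a1 = 1.8, a2 = 1.0, d = 0.8); the function name and the returned dictionary layout are hypothetical.

```python
import math

def item_vector(a1, a2, d):
    """Return MDISC, signed distance D, angle alpha (degrees), and the
    tail and head coordinates of the item vector in the ability plane."""
    mdisc = math.hypot(a1, a2)                 # vector length
    dist = -d / mdisc                          # signed distance to the p = .5 contour
    alpha = math.degrees(math.acos(a1 / mdisc))
    ux, uy = a1 / mdisc, a2 / mdisc            # unit vector in the item's direction
    tail = (dist * ux, dist * uy)              # lies on the p = .5 contour
    head = ((dist + mdisc) * ux, (dist + mdisc) * uy)
    return {"MDISC": mdisc, "D": dist, "alpha": alpha, "tail": tail, "head": head}

vec = item_vector(1.8, 1.0, 0.8)
print(vec)
```

Because d is positive for this item, D is negative and the tail of the vector sits in the third quadrant, which matches the statement that easy items lie there.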
19 Vectors are actually projections of the direction of maximum discrimination or slope (i.e., the θ1, θ2 composite) onto the latent ability plane. (Figure labels: direction of maximum slope, response surface, projected item vector.)
20 Contour Plot of Item Response Surface (a1 = 1.8, a2 = 1.0, d = 0.8), showing the item response vector and the p = .5 equiprobability contour.
21 By color coding the vectors to match different content areas, we can determine: Are items from a certain content area more discriminating or more difficult? Do items from different content areas measure different ability composites? How similar are the vector profiles for different yet parallel forms?
22 Example of item vectors for the 101-item LSAT (figure labels: difficult items, easy items)
23 3. Differential Item Functioning from a multidimensional perspective
24 DIF Analyses DIF is examined in terms of differential performance between two identified groups, usually denoted the Reference Group and the Focal Group. DIF analyses usually focus on one item at a time using conditional analyses, in which intermediate statistics are calculated for each raw score category and then summed. Although there are many types of DIF analyses, today I will focus on two approaches for dichotomously scored items: SIBTEST and Mantel-Haenszel. At the heart of each conditional analysis is a 2 x 2 contingency table.
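To make the conditioning step concrete, here is a minimal Python sketch that builds one 2 x 2 table per raw score category for a studied item. The data layout (a list of 0/1 response vectors plus a list of "R"/"F" group labels), the function name, and the toy data are all hypothetical; the sketch also assumes the raw score includes the studied item, which is one common choice rather than the only one.

```python
from collections import defaultdict

def contingency_tables(responses, groups, item):
    """Return {raw_score: (A, B, C, D)} for the studied item.

    A = reference correct, B = reference incorrect,
    C = focal correct,     D = focal incorrect.
    """
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for resp, group in zip(responses, groups):
        score = sum(resp)                    # conditioning variable: raw score
        correct = resp[item] == 1
        if group == "R":
            cell = 0 if correct else 1
        else:
            cell = 2 if correct else 3
        tables[score][cell] += 1
    return {s: tuple(cells) for s, cells in sorted(tables.items())}

# Toy data: six examinees, three items, studying item 0
responses = [[1, 1, 0], [1, 0, 0], [0, 1, 1], [1, 1, 0], [0, 1, 0], [1, 1, 1]]
groups = ["R", "R", "R", "F", "F", "F"]
tables = contingency_tables(responses, groups, item=0)
print(tables)
```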
26 Mantel-Haenszel DIF Statistic
$$MH = \frac{\left(\left|\sum_j A_j - \sum_j E(A_j)\right| - \frac{1}{2}\right)^2}{\sum_j \mathrm{Var}(A_j)}$$
where the expected value of the cell A frequency is
$$E(A_j) = \frac{N_{Rj}\,N_{1.j}}{N_{..j}}$$
and the variance of the cell A frequency is
$$\mathrm{Var}(A_j) = \frac{N_{Rj}\,N_{Fj}\,N_{1.j}\,N_{0.j}}{N_{..j}^2\,(N_{..j} - 1)}$$
2x2 Contingency Table for the jth Score Category
Group | Item score 1 | Item score 0 | Total
Reference (R) | A_j | B_j | N_Rj
Focal (F) | C_j | D_j | N_Fj
Total | N_1.j | N_0.j | N_..j
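The statistic can be computed directly from a list of per-score-category tables. The following Python sketch follows the cell labels of the contingency table above; the function name and the two example tables are hypothetical, and skipping categories with fewer than two examinees is an assumption made so that the variance term stays defined.

```python
def mantel_haenszel_chi2(tables):
    """Mantel-Haenszel chi-square with continuity correction.

    tables: iterable of (A, B, C, D) counts, one per score category.
    """
    sum_a = sum_e = sum_var = 0.0
    for a, b, c, d in tables:
        n_ref, n_foc = a + b, c + d          # row totals N_Rj, N_Fj
        n_1, n_0 = a + c, b + d              # column totals N_1.j, N_0.j
        n = n_ref + n_foc                    # N_..j
        if n < 2:
            continue                         # variance undefined for this category
        sum_a += a
        sum_e += n_ref * n_1 / n
        sum_var += n_ref * n_foc * n_1 * n_0 / (n ** 2 * (n - 1))
    if sum_var == 0:
        raise ValueError("no score category contributes any variance")
    return (abs(sum_a - sum_e) - 0.5) ** 2 / sum_var

# Two hypothetical score categories
example_tables = [(30, 20, 20, 30), (40, 10, 30, 20)]
chi2 = mantel_haenszel_chi2(example_tables)
print(round(chi2, 4))
```

The result is referred to a chi-square distribution with one degree of freedom.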
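27 SIBTEST DIF Statistic An estimate of the numerator of the SIBTEST test statistic is
$$\hat{\beta}_U = \sum_{h=0}^{n} \hat{p}_h \left(\bar{Y}^*_{Rh} - \bar{Y}^*_{Fh}\right)$$
where
$$\hat{p}_h = \frac{G_{Rh} + G_{Fh}}{\sum_{j=0}^{n} \left(G_{Rj} + G_{Fj}\right)}$$
and $G_{Rh}$ and $G_{Fh}$ are the number of examinees in the reference and focal groups at valid score X = h. The SIBTEST test statistic is calculated as
$$B = \frac{\hat{\beta}_U}{\hat{\sigma}(\hat{\beta}_U)}$$
where
$$\hat{\sigma}(\hat{\beta}_U) = \left(\sum_{h=0}^{n} \hat{p}_h^2 \left[\frac{1}{G_{Rh}}\hat{\sigma}^2(Y \mid h, R) + \frac{1}{G_{Fh}}\hat{\sigma}^2(Y \mid h, F)\right]\right)^{1/2}$$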
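A minimal Python sketch of the weighted-difference calculation follows. It takes the adjusted group means, group counts, and item-score variances at each valid-score level as inputs; it does not perform SIBTEST's regression correction that produces the adjusted means, and the function name and example numbers are hypothetical.

```python
import math

def sibtest_statistic(ybar_ref, ybar_foc, g_ref, g_foc, var_ref, var_foc):
    """Return (beta_U, B) from per-score-level summaries.

    ybar_*: adjusted mean item scores at each valid-score level h
    g_*:    number of examinees at each level
    var_*:  item-score variances at each level
    """
    if any(g <= 0 for g in list(g_ref) + list(g_foc)):
        raise ValueError("every score level needs examinees from both groups")
    total = sum(g_ref) + sum(g_foc)
    beta = 0.0
    var_beta = 0.0
    for yr, yf, gr, gf, vr, vf in zip(ybar_ref, ybar_foc, g_ref, g_foc,
                                      var_ref, var_foc):
        p_h = (gr + gf) / total              # weight for score level h
        beta += p_h * (yr - yf)
        var_beta += p_h ** 2 * (vr / gr + vf / gf)
    return beta, beta / math.sqrt(var_beta)

# Hypothetical summaries for two valid-score levels
beta_u, b_stat = sibtest_statistic(
    ybar_ref=[0.6, 0.8], ybar_foc=[0.5, 0.7],
    g_ref=[50, 50], g_foc=[50, 50],
    var_ref=[0.24, 0.16], var_foc=[0.25, 0.21],
)
print(round(beta_u, 3), round(b_stat, 3))
```

A positive value of the numerator indicates that, after matching, the reference group outperformed the focal group on the studied item.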
28 3. Differential Item Functioning from a multidimensional perspective
29 Key Ingredient in DIF analyses: the Conditioning Variable DIF occurs because the conditioning variable does not capture all of the skills (complete latent space) that the groups of examinees utilized in responding to the test items. Several studies have looked at conditioning scores and how to account for all the skills examinees have used in responding to items on a test.
30 Shin (1992) Zwick & Ercikan (1989) Condition on Skill 2 Condition on Skill 1
31 Ackerman & Evans 1994 DIF Study Generated Ability Distributions Generated Items
32 Conditioning on θ2 (figure labels: valid skill, valid composite direction, DIF items, invalid skill)
33 Conditioning on θ1 (figure labels: DIF items, invalid skill, valid composite direction, valid skill)
34 Conditioning on raw score (figure labels: DIF items, invalid skill, in both panels)
35 Conditioning on θ1 and θ2: all items (composites) are valid (figure labels: valid skill on both axes)
38 4. DIF: compensatory processing versus noncompensatory processing
39 Identical Generating Distributions (N = 1000). Table of the mean, standard deviation, and correlation between θ1 and θ2 for the reference (REF) and focal (FOC) groups.
40 Vectors of Generated Items n = 30
41 Item 13: Compensatory (a1 = 0.4, a2 = 0.4, d = 0.0); Noncompensatory (a1 = 1.2, a2 = 1.2, b1 = 0.0, b2 = 0.0)
42 Item 14: Compensatory (a1 = 0.8, a2 = 0.8, d = 0.0); Noncompensatory (a1 = 0.8, a2 = 0.8, b1 = 0.0, b2 = 0.0)
43 Item 15: Compensatory (a1 = 1.2, a2 = 1.2, d = 0.0); Noncompensatory (a1 = 1.2, a2 = 1.2, b1 = 0.0, b2 = 0.0)
44 Item 16: Compensatory (a1 = 1.6, a2 = 1.6, d = 0.0); Noncompensatory (a1 = 1.6, a2 = 1.6, b1 = 0.0, b2 = 0.0)
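The idea behind this design can be sketched with a small simulation: both groups draw abilities from the same distribution, but the reference group responds under the compensatory model and the focal group under the noncompensatory model. The sketch below uses the Item 15 parameters; the independent standard normal abilities, the seed, the function names, and the 1.7 constant are assumptions, since the generating means, standard deviations, and correlation are not reproduced here.

```python
import math
import random

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def simulate_p_value(model, a, n_examinees, rng):
    """Proportion correct on one item with a1 = a2 = a and difficulty 0."""
    n_correct = 0
    for _ in range(n_examinees):
        # Assumption: theta1 and theta2 are independent N(0, 1) in both groups
        t1, t2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        if model == "compensatory":
            p = logistic(a * t1 + a * t2)
        else:
            p = logistic(1.7 * a * t1) * logistic(1.7 * a * t2)
        n_correct += rng.random() < p
    return n_correct / n_examinees

rng = random.Random(2024)
p_ref = simulate_p_value("compensatory", 1.2, 1000, rng)     # reference group
p_foc = simulate_p_value("noncompensatory", 1.2, 1000, rng)  # focal group
print(f"reference p-value: {p_ref:.3f}, focal p-value: {p_foc:.3f}")
```

Even though the ability distributions are identical, the focal group's proportion correct is noticeably lower, which is the kind of difference a DIF analysis would flag.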
45 Raw score frequency distributions for the Reference Group and the Focal Group (axes: raw score, frequency).
46 Table of item statistics by item: p-value and biserial for the Reference Group (compensatory) and the Focal Group (noncompensatory).
49 Item 2 Item 15
50 ETS DIF Classification Categories
Category A: |MH D-DIF| < 1.0. During test assembly: select freely.
Category B: 1.0 < |MH D-DIF| < 1.5. During test assembly: if possible, select an equivalent item with a smaller MH D-DIF.
Category C: |MH D-DIF| > 1.5. During test assembly: select ONLY if essential; independent reviewer required.
Action before score reporting: independent reviewer required.
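For readers who want to see how an item lands in one of these categories, the sketch below computes MH D-DIF from a set of 2 x 2 tables as -2.35 times the natural log of the Mantel-Haenszel common odds ratio and then classifies it by magnitude. The function names and example tables are hypothetical, the cut points of 1.0 and 1.5 are the commonly cited ETS values, and the classification is deliberately simplified: the operational ETS rules also involve significance tests, which are omitted here.

```python
import math

def mh_d_dif(tables):
    """MH D-DIF from (A, B, C, D) tables, one per score category."""
    num = den = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n == 0:
            continue
        num += a * d / n                     # reference correct x focal incorrect
        den += b * c / n                     # reference incorrect x focal correct
    alpha_mh = num / den                     # common odds ratio estimate
    return -2.35 * math.log(alpha_mh)

def ets_category(d_dif):
    """Magnitude-only classification (significance tests omitted)."""
    size = abs(d_dif)
    if size < 1.0:
        return "A"
    if size < 1.5:
        return "B"
    return "C"

example_tables = [(30, 20, 20, 30), (40, 10, 30, 20)]
d_value = mh_d_dif(example_tables)
category = ets_category(d_value)
print(round(d_value, 3), category)
```

On this scale a negative value means the item favored the reference group.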
51 A B C
52 5. Example applications
53 Situations in which compensation differences between subgroups could occur
1. Teaching Literacy
- Phonemic awareness
- Phonics
- Reading fluency, including oral reading skills
- Vocabulary development
- Reading comprehension strategies
Teacher Training: content knowledge vs. pedagogical knowledge (Praxis II)
English Language Learners
54 Situations in which compensation differences between subgroups could occur
2. English Language Learners: students whose first language is not English
3. Teacher Training: content knowledge vs. pedagogical knowledge (Praxis II)
55 6. Conclusion and future directions
56 Conclusions DIF is a very perplexing analysis to perform. Quite often, when we identify items that favor one group or another, we still cannot determine what caused the DIF. By applying multidimensional modeling, we may be able to explain more fully why groups of students perform differentially. Such analyses, especially those involving compensation and lack of compensation, could be very instructive and prescriptive for teachers and could help inform pedagogical practice.
57 Future Work More work needs to be done on how best to represent items in a noncompensatory framework. I am working closely with my doctoral students to look at the ways we feel DIF can occur through lack of compensation. This includes developing items that have distractors representing varying degrees of compensation. One of my students is also looking at latent class mixture models using the compensatory and noncompensatory MIRT models to identify classes of students who lack requisite skills and thus are facing noncompensatory testing scenarios.
58 Short Quiz
59 Being the great psychometricians that you are, which group do you think this ACT item favored: Whites? Blacks? Males? Females? No DIF? BLACK EXAMINEES
60 Which group do you think this ACT item favored, Whites? Blacks? Males? Females? No DIF? A rectangular 8-inch by 10-inch picture is to be framed with a 3-inch border all the way around it. How many more square inches of wall space will be covered by the framed picture than by the picture alone? a) 24 b) 48 c) 54 d) 108 e) 144 WHITE EXAMINEES
61 For questions or comments please me at
62 References
Ackerman, T. A. (1989). Unidimensional IRT calibration of compensatory and noncompensatory multidimensional. Applied Psychological Measurement, 13.
Ackerman, T. A. (1992). An explanation of differential item functioning from a multidimensional perspective. Journal of Educational Measurement, 24.
Ackerman, T. A. (1994). The influence of conditioning scores in performing DIF analyses. Applied Psychological Measurement, 18(4).
Ackerman, T. A., & Evans, J. A. (1992, April). An investigation of the relationship between reliability, power, and the Type I error rate of the Mantel-Haenszel and simultaneous item bias detection procedures. Paper presented at the annual meeting of the National Council on Measurement in Education, San Francisco.
Ackerman, T. A., & Henson, R. A. (2014). Graphical representations of items and tests that are measuring multiple abilities. Proceedings of the Psychometric Society, IMPS 2013.
63 Dorans, N. J., & Holland, P. W. (1993). DIF detection and description: Mantel-Haenszel and standardization. In H. Wainer & P. W. Holland (Eds.), Differential item functioning. Hillsdale, NJ: Erlbaum.
Shin, S. (1992). An empirical investigation of the robustness of the Mantel-Haenszel procedure and sources of differential item functioning. Dissertation Abstracts International, 53A.
Zwick, R., & Ercikan, K. (1989). Analysis of differential item functioning in the NAEP History Assessment. Journal of Educational Measurement, 26.
64 "Thank you"
International Journal of Innovative Management, Information & Production ISME Internationalc2010 ISSN 2185-5439 Volume 1, Number 1, December 2010 PP. 81-89 A COMPARISON OF BAYESIAN MCMC AND MARGINAL MAXIMUM
More informationChapter 11 Multiple Regression
Chapter 11 Multiple Regression PSY 295 Oswald Outline The problem An example Compensatory and Noncompensatory Models More examples Multiple correlation Chapter 11 Multiple Regression 2 Cont. Outline--cont.
More informationLOGISTIC APPROXIMATIONS OF MARGINAL TRACE LINES FOR BIFACTOR ITEM RESPONSE THEORY MODELS. Brian Dale Stucky
LOGISTIC APPROXIMATIONS OF MARGINAL TRACE LINES FOR BIFACTOR ITEM RESPONSE THEORY MODELS Brian Dale Stucky A dissertation submitted to the faculty of the University of North Carolina at Chapel Hill in
More informationConnexion of Item Response Theory to Decision Making in Chess. Presented by Tamal Biswas Research Advised by Dr. Kenneth Regan
Connexion of Item Response Theory to Decision Making in Chess Presented by Tamal Biswas Research Advised by Dr. Kenneth Regan Acknowledgement A few Slides have been taken from the following presentation
More informationInitial Report on the Calibration of Paper and Pencil Forms UCLA/CRESST August 2015
This report describes the procedures used in obtaining parameter estimates for items appearing on the 2014-2015 Smarter Balanced Assessment Consortium (SBAC) summative paper-pencil forms. Among the items
More informationDevelopment, Standardization and Application of
American Journal of Educational Research, 2018, Vol. 6, No. 3, 238-257 Available online at http://pubs.sciepub.com/education/6/3/11 Science and Education Publishing DOI:10.12691/education-6-3-11 Development,
More informationAdaptive Testing With the Multi-Unidimensional Pairwise Preference Model Stephen Stark University of South Florida
Adaptive Testing With the Multi-Unidimensional Pairwise Preference Model Stephen Stark University of South Florida and Oleksandr S. Chernyshenko University of Canterbury Presented at the New CAT Models
More informationCOMBINING SCALING AND CLASSIFICATION: A PSYCHOMETRIC MODEL FOR SCALING ABILITY AND DIAGNOSING MISCONCEPTIONS LAINE P. BRADSHAW
COMBINING SCALING AND CLASSIFICATION: A PSYCHOMETRIC MODEL FOR SCALING ABILITY AND DIAGNOSING MISCONCEPTIONS by LAINE P. BRADSHAW (Under the Direction of Jonathan Templin and Karen Samuelsen) ABSTRACT
More informationEFFECTS OF OUTLIER ITEM PARAMETERS ON IRT CHARACTERISTIC CURVE LINKING METHODS UNDER THE COMMON-ITEM NONEQUIVALENT GROUPS DESIGN
EFFECTS OF OUTLIER ITEM PARAMETERS ON IRT CHARACTERISTIC CURVE LINKING METHODS UNDER THE COMMON-ITEM NONEQUIVALENT GROUPS DESIGN By FRANCISCO ANDRES JIMENEZ A THESIS PRESENTED TO THE GRADUATE SCHOOL OF
More informationABERRANT RESPONSE PATTERNS AS A MULTIDIMENSIONAL PHENOMENON: USING FACTOR-ANALYTIC MODEL COMPARISON TO DETECT CHEATING. John Michael Clark III
ABERRANT RESPONSE PATTERNS AS A MULTIDIMENSIONAL PHENOMENON: USING FACTOR-ANALYTIC MODEL COMPARISON TO DETECT CHEATING BY John Michael Clark III Submitted to the graduate degree program in Psychology and
More informationAcademic Discipline DIF in an English Language Proficiency Test
Journal of English Language Teaching and Learning Year 5, No.7 Academic Discipline DIF in an English Language Proficiency Test Seyyed Mohammad Alavi Associate Professor of TEFL, University of Tehran Abbas
More informationInvestigating the Invariance of Person Parameter Estimates Based on Classical Test and Item Response Theories
Kamla-Raj 010 Int J Edu Sci, (): 107-113 (010) Investigating the Invariance of Person Parameter Estimates Based on Classical Test and Item Response Theories O.O. Adedoyin Department of Educational Foundations,
More informationAnalyzing Teacher Professional Standards as Latent Factors of Assessment Data: The Case of Teacher Test-English in Saudi Arabia
Analyzing Teacher Professional Standards as Latent Factors of Assessment Data: The Case of Teacher Test-English in Saudi Arabia 1 Introduction The Teacher Test-English (TT-E) is administered by the NCA
More informationComparing DIF methods for data with dual dependency
DOI 10.1186/s40536-016-0033-3 METHODOLOGY Open Access Comparing DIF methods for data with dual dependency Ying Jin 1* and Minsoo Kang 2 *Correspondence: ying.jin@mtsu.edu 1 Department of Psychology, Middle
More informationItem-Rest Regressions, Item Response Functions, and the Relation Between Test Forms
Item-Rest Regressions, Item Response Functions, and the Relation Between Test Forms Dato N. M. de Gruijter University of Leiden John H. A. L. de Jong Dutch Institute for Educational Measurement (CITO)
More informationlinking in educational measurement: Taking differential motivation into account 1
Selecting a data collection design for linking in educational measurement: Taking differential motivation into account 1 Abstract In educational measurement, multiple test forms are often constructed to
More informationSESUG '98 Proceedings
Generating Item Responses Based on Multidimensional Item Response Theory Jeffrey D. Kromrey, Cynthia G. Parshall, Walter M. Chason, and Qing Yi University of South Florida ABSTRACT The purpose of this
More informationAn Alternative to the Trend Scoring Method for Adjusting Scoring Shifts. in Mixed-Format Tests. Xuan Tan. Sooyeon Kim. Insu Paek.
An Alternative to the Trend Scoring Method for Adjusting Scoring Shifts in Mixed-Format Tests Xuan Tan Sooyeon Kim Insu Paek Bihua Xiang ETS, Princeton, NJ Paper presented at the annual meeting of the
More informationThank You Acknowledgments
Psychometric Methods For Investigating Potential Item And Scale/Test Bias Bruno D. Zumbo, Ph.D. Professor University of British Columbia Vancouver, Canada Presented at Carleton University, Ottawa, Canada
More informationMath 124: Module 2, Part II
, Part II David Meredith Department of Mathematics San Francisco State University September 15, 2009 What we will do today 1 Explanatory and Response Variables When you study the relationship between two
More informationFighting Bias with Statistics: Detecting Gender Differences in Responses on Items on a Preschool Science Assessment
University of Miami Scholarly Repository Open Access Dissertations Electronic Theses and Dissertations 2010-08-06 Fighting Bias with Statistics: Detecting Gender Differences in Responses on Items on a
More informationMCAS Equating Research Report: An Investigation of FCIP-1, FCIP-2, and Stocking and. Lord Equating Methods 1,2
MCAS Equating Research Report: An Investigation of FCIP-1, FCIP-2, and Stocking and Lord Equating Methods 1,2 Lisa A. Keller, Ronald K. Hambleton, Pauline Parker, Jenna Copella University of Massachusetts
More informationBlending Psychometrics with Bayesian Inference Networks: Measuring Hundreds of Latent Variables Simultaneously
Blending Psychometrics with Bayesian Inference Networks: Measuring Hundreds of Latent Variables Simultaneously Jonathan Templin Department of Educational Psychology Achievement and Assessment Institute
More informationDoes factor indeterminacy matter in multi-dimensional item response theory?
ABSTRACT Paper 957-2017 Does factor indeterminacy matter in multi-dimensional item response theory? Chong Ho Yu, Ph.D., Azusa Pacific University This paper aims to illustrate proper applications of multi-dimensional
More informationA Comparison of Several Goodness-of-Fit Statistics
A Comparison of Several Goodness-of-Fit Statistics Robert L. McKinley The University of Toledo Craig N. Mills Educational Testing Service A study was conducted to evaluate four goodnessof-fit procedures
More informationAssessing the item response theory with covariate (IRT-C) procedure for ascertaining. differential item functioning. Louis Tay
ASSESSING DIF WITH IRT-C 1 Running head: ASSESSING DIF WITH IRT-C Assessing the item response theory with covariate (IRT-C) procedure for ascertaining differential item functioning Louis Tay University
More informationScaling TOWES and Linking to IALS
Scaling TOWES and Linking to IALS Kentaro Yamamoto and Irwin Kirsch March, 2002 In 2000, the Organization for Economic Cooperation and Development (OECD) along with Statistics Canada released Literacy
More informationUCLA UCLA Electronic Theses and Dissertations
UCLA UCLA Electronic Theses and Dissertations Title Detection of Differential Item Functioning in the Generalized Full-Information Item Bifactor Analysis Model Permalink https://escholarship.org/uc/item/3xd6z01r
More informationBayesian Tailored Testing and the Influence
Bayesian Tailored Testing and the Influence of Item Bank Characteristics Carl J. Jensema Gallaudet College Owen s (1969) Bayesian tailored testing method is introduced along with a brief review of its
More informationConstrained Multidimensional Adaptive Testing without intermixing items from different dimensions
Psychological Test and Assessment Modeling, Volume 56, 2014 (4), 348-367 Constrained Multidimensional Adaptive Testing without intermixing items from different dimensions Ulf Kroehne 1, Frank Goldhammer
More informationOn Test Scores (Part 2) How to Properly Use Test Scores in Secondary Analyses. Structural Equation Modeling Lecture #12 April 29, 2015
On Test Scores (Part 2) How to Properly Use Test Scores in Secondary Analyses Structural Equation Modeling Lecture #12 April 29, 2015 PRE 906, SEM: On Test Scores #2--The Proper Use of Scores Today s Class:
More informationUsing Analytical and Psychometric Tools in Medium- and High-Stakes Environments
Using Analytical and Psychometric Tools in Medium- and High-Stakes Environments Greg Pope, Analytics and Psychometrics Manager 2008 Users Conference San Antonio Introduction and purpose of this session
More informationTHE DEVELOPMENT AND VALIDATION OF EFFECT SIZE MEASURES FOR IRT AND CFA STUDIES OF MEASUREMENT EQUIVALENCE CHRISTOPHER DAVID NYE DISSERTATION
THE DEVELOPMENT AND VALIDATION OF EFFECT SIZE MEASURES FOR IRT AND CFA STUDIES OF MEASUREMENT EQUIVALENCE BY CHRISTOPHER DAVID NYE DISSERTATION Submitted in partial fulfillment of the requirements for
More informationBrent Duckor Ph.D. (SJSU) Kip Tellez, Ph.D. (UCSC) BEAR Seminar April 22, 2014
Brent Duckor Ph.D. (SJSU) Kip Tellez, Ph.D. (UCSC) BEAR Seminar April 22, 2014 Studies under review ELA event Mathematics event Duckor, B., Castellano, K., Téllez, K., & Wilson, M. (2013, April). Validating
More information