Differential Item Functioning from a Compensatory-Noncompensatory Perspective

1 Differential Item Functioning from a Compensatory-Noncompensatory Perspective Terry Ackerman, Bruce McCollaum, Gilbert Ngerano University of North Carolina at Greensboro

2 Motivation for my Presentation Differential item functioning (DIF) has become a standard analysis in achievement testing. Its purpose is to ensure test impartiality and to identify items that are unfair, favoring one group of examinees over another. Given the high stakes surrounding many educational tests today, DIF analyses have increased in importance. DIF has the greatest potential to occur when a test is multidimensional, that is, when it contains items that measure, to varying degrees, superfluous, invalid skills that are different from the purported purpose of the test.

3 Motivation for my Presentation If items measure invalid skills, and examinees differ on those skills, DIF is likely to result. Some examinees will end up getting those items right, not because they are competent in the skill or composite of skills being measured by the test, but because they are more able on an unessential skill being measured by a DIF item.

4 Motivation for my Presentation A slightly strange example: me taking an algebra test written in Turkish. While I should be pretty good at algebra, I know close to nothing about the Turkish language. Thus, no matter how good I am at algebra, it cannot compensate for my not understanding Turkish, and my probability of a correct response on these items would be very low. I would be stuck in a noncompensatory situation.

5 Motivation for my Presentation Note that DIF cannot occur if a test is strictly unidimensional, measuring only one skill, factor, or trait. DIF is thought to occur when a test measures invalid skills and groups of examinees differ in their underlying distributions of ability on those skills. Detecting DIF is a relatively easy process; the true challenge is determining what caused it!

6 Goal of my Presentation The purpose of my talk today is to explain a new cause of DIF. It is a situation in which a test measures both valid and invalid skills and the groups of examinees have identical underlying ability distributions but use the information presented in an item differently. That is, for some examinees an item is compensatory, while for other examinees (perhaps because of greater exposure to the requisite information, or because of instructional or pedagogical differences) the item is noncompensatory. To explain this new source of DIF, I'm going to back up a bit and give a brief background in multidimensional IRT modeling and DIF analyses.

7 Presentation Outline
1. Multidimensional Item Response Theory Models: a) Compensatory b) Noncompensatory
2. Item Representation in Multidimensional Item Response Theory
3. A Look at Differential Item Functioning (DIF) from a Multidimensional Perspective
4. DIF: Compensatory Processing versus Noncompensatory Processing
5. Example Applications
6. Conclusion and Future Directions
7. Short quiz

8 1. Multidimensional IRT Models: Compensatory vs. Noncompensatory

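9 The Two-dimensional Compensatory Model The probability of examinee j correctly responding to item i can be expressed as:
\[ P_{ij} = \frac{1.0}{1.0 + e^{-(a_{1i}\theta_{1j} + a_{2i}\theta_{2j} + d_i)}} \]
The model has 2 discrimination parameters (a_{1i}, a_{2i}), 2 latent abilities (\theta_{1j}, \theta_{2j}), and 1 difficulty parameter (d_i).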
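A minimal Python sketch of the compensatory model may help make the idea of compensation concrete. It assumes the usual logistic form with an additive logit; the function name `p_compensatory` and the ability values are illustrative choices, not taken from the slides. The item parameters are those of the equal-composite item shown later in the deck (a1 = 1.0, a2 = 1.0, d = 0.3).

```python
import math

def p_compensatory(theta1, theta2, a1, a2, d):
    """Two-dimensional compensatory model: abilities add up inside the logit."""
    logit = a1 * theta1 + a2 * theta2 + d
    return 1.0 / (1.0 + math.exp(-logit))

# Two examinees with opposite ability profiles (illustrative values).
p_high_low = p_compensatory(2.0, -2.0, a1=1.0, a2=1.0, d=0.3)
p_low_high = p_compensatory(-2.0, 2.0, a1=1.0, a2=1.0, d=0.3)

# With equal discriminations the two logits are identical, so a high level
# on one ability fully offsets a low level on the other.
print(p_high_low, p_low_high)
```

Because only the weighted sum of the abilities enters the exponent, any two profiles with the same sum have the same probability of a correct response.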

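10 The Two-dimensional Noncompensatory Model The probability of examinee j correctly responding to item i can be expressed as:
\[ P_{ij} = \left[ \frac{1.0}{1.0 + e^{-1.7 a_{1i}(\theta_{1j} - b_{1i})}} \right] \left[ \frac{1.0}{1.0 + e^{-1.7 a_{2i}(\theta_{2j} - b_{2i})}} \right] \]
The model has 2 discrimination parameters (a_{1i}, a_{2i}), 2 latent abilities (\theta_{1j}, \theta_{2j}), and 2 difficulty parameters (b_{1i}, b_{2i}).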
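The noncompensatory model can be sketched the same way. This is an illustration under assumptions: it takes the model to be a product of two logistic components, each with the 1.7 scaling constant, and the function name `p_noncompensatory` and the ability values are hypothetical.

```python
import math

def p_noncompensatory(theta1, theta2, a1, a2, b1, b2, scale=1.7):
    """Product of two logistic components, one per ability dimension."""
    def component(theta, a, b):
        return 1.0 / (1.0 + math.exp(-scale * a * (theta - b)))
    return component(theta1, a1, b1) * component(theta2, a2, b2)

# Average on both abilities: each component is 0.5, so the product is 0.25.
p_balanced = p_noncompensatory(0.0, 0.0, 1.0, 1.0, 0.0, 0.0)

# Very high on ability 1 but very low on ability 2 (illustrative values):
# the weak component caps the overall probability.
p_lopsided = p_noncompensatory(3.0, -3.0, 1.0, 1.0, 0.0, 0.0)

print(p_balanced, p_lopsided)
```

Since each component is at most 1, the product can never exceed the smaller of the two components, which is why strength on one ability cannot make up for weakness on the other.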

11 2. Multidimensional IRT: item representation

12 Mathematica Representations

13 Contour Plot of Item Response Surface (a1 = 1.50, a2 = 0.0, d = 0.3; points A, B, and C marked). This item discriminates only between levels of θ1. The steeper the surface, the greater the discrimination and the closer the contours.

14 Contour Plot of Item Response Surface (a1 = 0.0, a2 = 0.8, d = 0.3; points A, B, and C marked). This item discriminates only between levels of θ2. The flatter the surface, the lower the discrimination and the farther apart the contours.

15 Contour Plot of Item Response Surface (a1 = 1.0, a2 = 1.0, d = 0.3). Point A: low θ1, high θ2. Point B: high θ1, low θ2. This item discriminates between levels of an equal composite of θ1 and θ2. Notice that examinees with opposite ability profiles have the same probability of a correct answer (i.e., compensation).

16 Noncompensatory Model: Contour Plot of Item Response Surface (a1 = 1.0, a2 = 1.0, b1 = 0.0, b2 = 0.0; points A, B, and C marked at ability profiles that are high on one ability and low on the other, or low on both). No compensation occurs for being high on only one ability.

17 Perhaps the best representation of two-dimensional items is the vector method. Each item is represented in the latent ability plane as a vector. All vectors lie on lines that pass through the origin. Vectors can lie only in the first and third quadrants because the a-parameters are constrained to be positive when they are estimated. Vectors representing easy items lie in the third quadrant; those representing difficult items lie in the first quadrant.

18 The length of the vector indicates how well an item can discriminate between levels of skill. This value is called MDISC:
\[ MDISC = \sqrt{a_1^2 + a_2^2} \]
The tail of the vector lies on the p = .5 equiprobability contour. The signed distance from the origin to this contour is denoted D:
\[ D = \frac{-d}{MDISC} \]
The angular direction, α, indicates the composite of ability that the item is best measuring:
\[ \alpha = \cos^{-1}\left( \frac{a_1}{MDISC} \right) \]
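As a rough illustration of the vector representation, the sketch below computes MDISC, the signed distance D (taken here as -d / MDISC, the usual convention for the compensatory model), and the angle α for one item, along with the coordinates of the vector's tail. The function name `item_vector` is hypothetical; the parameter values are those of the equal-composite example item (a1 = 1.0, a2 = 1.0, d = 0.3).

```python
import math

def item_vector(a1, a2, d):
    """Return MDISC, signed distance D, angle in degrees, and the vector tail."""
    mdisc = math.hypot(a1, a2)                  # length of the item vector
    distance = -d / mdisc                       # signed distance to the p = .5 contour
    angle_deg = math.degrees(math.acos(a1 / mdisc))
    # The tail sits at distance D from the origin along the item's direction.
    tail = (distance * a1 / mdisc, distance * a2 / mdisc)
    return mdisc, distance, angle_deg, tail

mdisc, distance, angle_deg, tail = item_vector(1.0, 1.0, 0.3)

# At the tail the compensatory logit is zero, i.e. the probability is .5.
logit_at_tail = 1.0 * tail[0] + 1.0 * tail[1] + 0.3
print(mdisc, distance, angle_deg, logit_at_tail)
```

For this item the angle is 45 degrees, reflecting an equal composite of the two abilities, and the negative D places the tail in the third quadrant, consistent with a relatively easy item.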

19 Vectors are actually projections of the direction (i.e., the θ1, θ2 composite) of maximum discrimination, or slope, onto the latent ability plane. (Figure labels: Direction of maximum slope; Response surface; Projected item vector.)

20 Contour Plot of Item Response Surface (a1 = 1.8, a2 = 1.0, d = 0.8). (Figure labels: Item response vector; p = .5 equiprobability contour.)

21 By color coding the vectors to match different content areas we can determine:
Are items from a certain content area more discriminating or more difficult?
Do items from different content areas measure different ability composites?
How similar are the vector profiles for different yet parallel forms?

22 Example of item vectors for the 101-item LSAT. (Figure labels: Difficult items; Easy items.)

23 3. Differential Item Functioning from a multidimensional perspective

24 DIF Analyses DIF is examined in terms of differential performance between two identified groups, usually denoted the Reference Group and the Focal Group. DIF analyses usually focus on one item at a time using conditional analyses, in which intermediate statistics are calculated for each raw score category and then summed. Although there are many types of DIF analyses, today I will focus on two approaches for dichotomously scored items, SIBTEST and Mantel-Haenszel. At the heart of each conditional analysis is a 2 x 2 contingency table.

25

26 Mantel-Haenszel DIF Statistic
\[ MH\chi^2 = \frac{\left( \left| \sum_j A_j - \sum_j E(A_j) \right| - \frac{1}{2} \right)^2}{\sum_j Var(A_j)} \]
where the expected value of the cell A frequency is
\[ E(A_j) = \frac{N_{Rj} N_{1.j}}{N_{..j}} \]
and the variance of the cell A frequency is
\[ Var(A_j) = \frac{N_{Rj} N_{Fj} N_{1.j} N_{0.j}}{N_{..j}^2 (N_{..j} - 1)} \]
2 x 2 Contingency Table for the jth Score Category
Group           Item score 1   Item score 0   Total
Reference (R)   A_j            B_j            N_Rj
Focal (F)       C_j            D_j            N_Fj
Total           N_1.j          N_0.j          N_..j
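The Mantel-Haenszel computation can be sketched in a few lines of Python. This is a minimal illustration, assuming the continuity-corrected chi-square form with cell A as the reference-group correct count; the function name `mantel_haenszel_chi2` and the two contingency tables are made up for the example.

```python
def mantel_haenszel_chi2(tables):
    """tables: one (A, B, C, D) tuple per score category, where
    A/B = reference group correct/incorrect, C/D = focal group correct/incorrect."""
    sum_a = sum_expected = sum_var = 0.0
    for a, b, c, d in tables:
        n_r, n_f = a + b, c + d          # group sizes at this score level
        n_1, n_0 = a + c, b + d          # correct / incorrect totals
        n = n_r + n_f
        if n < 2:
            continue                     # variance undefined for tiny strata
        sum_a += a
        sum_expected += n_r * n_1 / n
        sum_var += n_r * n_f * n_1 * n_0 / (n ** 2 * (n - 1))
    if sum_var == 0:
        raise ValueError("no informative score categories")
    return (abs(sum_a - sum_expected) - 0.5) ** 2 / sum_var

# Hypothetical data: two score categories, 50 examinees per group in each.
example_tables = [(30, 20, 20, 30), (40, 10, 30, 20)]
mh_chi2 = mantel_haenszel_chi2(example_tables)
print(mh_chi2)
```

The statistic is referred to a chi-square distribution with one degree of freedom, so values above roughly 3.84 would be flagged at the .05 level.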

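27 SIBTEST DIF Statistic An estimate of the numerator of the SIBTEST test statistic is
\[ \hat{\beta}_U = \sum_{h=0}^{n} \hat{p}_h \left( \bar{Y}^*_{Rh} - \bar{Y}^*_{Fh} \right) \]
where
\[ \hat{p}_h = \frac{G_{Rh} + G_{Fh}}{\sum_{j=0}^{n} \left( G_{Rj} + G_{Fj} \right)} \]
and G_{Rh} and G_{Fh} are the number of examinees in the reference and focal groups at valid score X = h. The SIBTEST test statistic is calculated as
\[ B = \frac{\hat{\beta}_U}{\hat{\sigma}(\hat{\beta}_U)} \]
where
\[ \hat{\sigma}(\hat{\beta}_U) = \left[ \sum_{h=0}^{n} \hat{p}_h^2 \left( \frac{1}{G_{Rh}} \hat{\sigma}^2(Y \mid h, R) + \frac{1}{G_{Fh}} \hat{\sigma}^2(Y \mid h, F) \right) \right]^{1/2} \]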
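A simplified sketch of the SIBTEST calculation is given below. Two assumptions should be kept in mind: it uses the observed mean item scores at each valid-score level in place of the regression-corrected means (the starred Y-bar terms), and it skips levels with fewer than two examinees in either group so that sample variances are defined. The function name `sibtest_uncorrected` and the data are hypothetical.

```python
import math
from statistics import mean, variance

def sibtest_uncorrected(ref_by_score, foc_by_score):
    """Each argument maps a valid score h to the list of studied-item scores
    (0/1) of the examinees in that group at that score level."""
    levels = [h for h in ref_by_score
              if h in foc_by_score
              and len(ref_by_score[h]) >= 2 and len(foc_by_score[h]) >= 2]
    total = sum(len(ref_by_score[h]) + len(foc_by_score[h]) for h in levels)
    beta = 0.0
    var_beta = 0.0
    for h in levels:
        ref, foc = ref_by_score[h], foc_by_score[h]
        p_h = (len(ref) + len(foc)) / total          # weight for score level h
        beta += p_h * (mean(ref) - mean(foc))        # weighted mean difference
        var_beta += p_h ** 2 * (variance(ref) / len(ref) + variance(foc) / len(foc))
    b_stat = beta / math.sqrt(var_beta) if var_beta > 0 else float("nan")
    return beta, b_stat

# Hypothetical item scores at two valid-score levels.
ref_scores = {0: [1, 1, 0, 1], 1: [1, 1, 1, 0]}
foc_scores = {0: [0, 0, 1, 0], 1: [1, 0, 0, 1]}
beta_hat, b_stat = sibtest_uncorrected(ref_scores, foc_scores)
print(beta_hat, b_stat)
```

A positive estimate indicates that, among examinees matched on the valid score, the reference group outperformed the focal group on the studied item.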

28 3. Differential Item Functioning from a multidimensional perspective

29 Key Ingredient in DIF analyses: the Conditioning Variable DIF occurs because the conditioning variable does not capture all of the skills (complete latent space) that the groups of examinees utilized in responding to the test items. Several studies have looked at conditioning scores and how to account for all the skills examinees have used in responding to items on a test.

30 Shin (1992); Zwick & Ercikan (1989). (Figure panels: Condition on Skill 2; Condition on Skill 1.)

31 Ackerman & Evans 1994 DIF Study. (Figure panels: Generated Ability Distributions; Generated Items.)

32 Conditioning on θ2. (Figure labels: Valid Skill; Valid Composite Direction; DIF Items; Invalid Skill.)

33 Conditioning on θ1. (Figure labels: DIF Items; Invalid Skill; Valid Composite Direction; Valid Skill.)

34 Conditioning on raw score. (Figure labels: DIF Items; Invalid Skill; DIF Items; Invalid Skill.)

35 Conditioning on θ1 and θ2. All items (composites) are valid. (Figure labels: Valid Skill; Valid Skill.)

36

37

38 4. DIF: compensatory processing versus noncompensatory processing

39 Identical Generating Distributions, N = 1000. (Table: Mean, Std Dev, and θ1-θ2 correlation of Theta 1 and Theta 2 for the REF and FOC groups.)

40 Vectors of Generated Items n = 30

41 Compensatory Item 13: a1 = 0.4, a2 = 0.4, d = 0.0. Noncompensatory Item 13: a1 = 1.2, a2 = 1.2, b1 = 0.0, b2 = 0.0.

42 Compensatory Item 14: a1 = 0.8, a2 = 0.8, d = 0.0. Noncompensatory Item 14: a1 = 0.8, a2 = 0.8, b1 = 0.0, b2 = 0.0.

43 Compensatory Item 15: a1 = 1.2, a2 = 1.2, d = 0.0. Noncompensatory Item 15: a1 = 1.2, a2 = 1.2, b1 = 0.0, b2 = 0.0.

44 Compensatory Item 16: a1 = 1.6, a2 = 1.6, d = 0.0. Noncompensatory Item 16: a1 = 1.6, a2 = 1.6, b1 = 0.0, b2 = 0.0.
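The following sketch simulates the kind of setup these slides describe: two groups of 1000 examinees drawn from identical ability distributions, answering the same item under different response processes. Several choices are assumptions made only for illustration: abilities are drawn as independent standard normals, the reference group is the one that responds compensatorily, the Item 15 parameters are used, and the helper names are hypothetical.

```python
import math
import random

def p_comp(t1, t2, a1, a2, d):
    """Compensatory response probability."""
    return 1.0 / (1.0 + math.exp(-(a1 * t1 + a2 * t2 + d)))

def p_noncomp(t1, t2, a1, a2, b1, b2, scale=1.7):
    """Noncompensatory response probability (product of two components)."""
    f1 = 1.0 / (1.0 + math.exp(-scale * a1 * (t1 - b1)))
    f2 = 1.0 / (1.0 + math.exp(-scale * a2 * (t2 - b2)))
    return f1 * f2

def proportion_correct(prob_fn, n, rng):
    """Simulate n examinees from N(0, 1) abilities and score one item."""
    correct = 0
    for _ in range(n):
        t1, t2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        if rng.random() < prob_fn(t1, t2):
            correct += 1
    return correct / n

rng = random.Random(2024)  # fixed seed so the run is reproducible
# Same ability distribution for both groups; only the response process differs.
p_ref = proportion_correct(lambda t1, t2: p_comp(t1, t2, 1.2, 1.2, 0.0), 1000, rng)
p_foc = proportion_correct(lambda t1, t2: p_noncomp(t1, t2, 1.2, 1.2, 0.0, 0.0), 1000, rng)
print(p_ref, p_foc)
```

Even though the two groups do not differ in ability, the item is noticeably harder for the group that processes it noncompensatorily, which is the pattern a DIF analysis would pick up.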

45 (Figure: raw score frequency distributions for the Reference Group and the Focal Group.)

46 (Table: p-value and biserial for each item, by item type (Compensatory, Noncompensatory), for the Reference Group and the Focal Group.)

47

48

49 Item 2 Item 15

50 ETS DIF Classification Categories
Category   MH D-DIF value            During test assembly
A          |MH D-DIF| < 1.0          Select freely
B          1.0 < |MH D-DIF| < 1.5    If possible, select an equivalent item with smaller |MH D-DIF|
C          |MH D-DIF| > 1.5          Select ONLY if essential; independent reviewer required
Action before score reporting: independent reviewer required
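As a small illustration of how the categories could be applied, the sketch below computes the Mantel-Haenszel common odds ratio from the 2 x 2 tables, converts it to the MH D-DIF scale with the standard transformation MH D-DIF = -2.35 ln(alpha), and assigns a category by magnitude alone. The 1.0 and 1.5 cut points are the commonly cited ETS values; the operational ETS rules also involve significance tests, which are omitted here. Function names and data are hypothetical.

```python
import math

def mh_common_odds_ratio(tables):
    """tables: (A, B, C, D) per score category, as in the 2 x 2 table
    (reference correct, reference incorrect, focal correct, focal incorrect)."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator

def mh_d_dif(alpha_mh):
    """Put the odds ratio on the ETS delta scale; negative values favor the reference group."""
    return -2.35 * math.log(alpha_mh)

def ets_category(d_dif):
    """Classify by magnitude only (significance tests are not applied in this sketch)."""
    size = abs(d_dif)
    if size < 1.0:
        return "A"
    if size < 1.5:
        return "B"
    return "C"

example_tables = [(30, 20, 20, 30), (40, 10, 30, 20)]  # made-up counts
alpha = mh_common_odds_ratio(example_tables)
category = ets_category(mh_d_dif(alpha))
print(alpha, mh_d_dif(alpha), category)
```

An odds ratio of 1 maps to an MH D-DIF of 0, so an item with no DIF falls in category A.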

51 A B C

52 5. Example applications

53 Situations in which compensation differences between subgroups could occur
1. Teaching Literacy
Phonemic awareness
Phonics
Reading fluency, including oral reading skills
Vocabulary development
Reading comprehension strategies

54 Situations in which compensation differences between subgroups could occur
2. English Language Learners: students whose first language is not English
3. Teacher Training: content knowledge vs. pedagogical knowledge (Praxis II)

55 6. Conclusion and future directions

56 Conclusions DIF is a very perplexing analysis to perform. Quite often, when we identify items that favor one group or another, we still cannot determine what caused the DIF. By applying multidimensional modeling, we may be able to explain more fully why groups of students perform differentially. Such analyses, especially those involving compensation and lack of compensation, could be very instructive and prescriptive for teachers and could help inform pedagogical practice.

57 Future Work More work needs to be done on how best to represent items in a noncompensatory framework. I am working closely with my doctoral students to look at the ways we feel DIF can occur through lack of compensation. This includes developing items that have distractors representing varying degrees of compensation. One of my students is also looking at latent class mixture models using the compensatory and noncompensatory MIRT models to identify classes of students who lack requisite skills and thus are facing noncompensatory testing scenarios.

58 Short quiz

59 Being the great psychometricians that you are, which group do you think this ACT item favored: Whites? Blacks? Males? Females? No DIF? BLACK EXAMINEES

60 Which group do you think this ACT item favored: Whites? Blacks? Males? Females? No DIF? A rectangular 8-inch by 10-inch picture is to be framed with a 3-inch border all the way around it. How many more square inches of wall space will be covered by the framed picture than by the picture alone? a) 24 b) 48 c) 54 d) 108 e) 144 WHITE EXAMINEES

61 For questions or comments please me at

62 References
Ackerman, T. A. (1989). Unidimensional IRT calibration of compensatory and noncompensatory multidimensional items. Applied Psychological Measurement, 13.
Ackerman, T. A. (1992). An explanation of differential item functioning from a multidimensional perspective. Journal of Educational Measurement, 24.
Ackerman, T. A., & Evans, J. A. (1994). The influence of conditioning scores in performing DIF analyses. Applied Psychological Measurement, 18(4).
Ackerman, T. A., & Evans, J. A. (1992, April). An investigation of the relationship between reliability, power, and the Type I error rate of the Mantel-Haenszel and simultaneous item bias detection procedures. Paper presented at the annual meeting of the National Council on Measurement in Education, San Francisco.
Ackerman, T. A., & Henson, R. A. (2014). Graphical representations of items and tests that are measuring multiple abilities. Proceedings of the Psychometric Society, IMPS 2013.

63 References
Dorans, N. J., & Holland, P. W. (1993). DIF detection and description: Mantel-Haenszel and standardization. In H. Wainer & P. W. Holland (Eds.), Differential item functioning. Hillsdale, NJ: Erlbaum.
Shin, S. (1992). An empirical investigation of the robustness of the Mantel-Haenszel procedure and sources of differential item functioning. Dissertation Abstracts International, 53A.
Zwick, R., & Ercikan, K. (1989). Analysis of differential item functioning in the NAEP History Assessment. Journal of Educational Measurement, 26.

64 Thank you


More information

A Bayesian Nonparametric Model Fit statistic of Item Response Models

A Bayesian Nonparametric Model Fit statistic of Item Response Models A Bayesian Nonparametric Model Fit statistic of Item Response Models Purpose As more and more states move to use the computer adaptive test for their assessments, item response theory (IRT) has been widely

More information

Computerized Mastery Testing

Computerized Mastery Testing Computerized Mastery Testing With Nonequivalent Testlets Kathleen Sheehan and Charles Lewis Educational Testing Service A procedure for determining the effect of testlet nonequivalence on the operating

More information

Center for Advanced Studies in Measurement and Assessment. CASMA Research Report

Center for Advanced Studies in Measurement and Assessment. CASMA Research Report Center for Advanced Studies in Measurement and Assessment CASMA Research Report Number 39 Evaluation of Comparability of Scores and Passing Decisions for Different Item Pools of Computerized Adaptive Examinations

More information

Maike Krannich, Odin Jost, Theresa Rohm, Ingrid Koller, Steffi Pohl, Kerstin Haberkorn, Claus H. Carstensen, Luise Fischer, and Timo Gnambs

Maike Krannich, Odin Jost, Theresa Rohm, Ingrid Koller, Steffi Pohl, Kerstin Haberkorn, Claus H. Carstensen, Luise Fischer, and Timo Gnambs neps Survey papers Maike Krannich, Odin Jost, Theresa Rohm, Ingrid Koller, Steffi Pohl, Kerstin Haberkorn, Claus H. Carstensen, Luise Fischer, and Timo Gnambs NEPS Technical Report for reading: Scaling

More information

A Comparison of Methods of Estimating Subscale Scores for Mixed-Format Tests

A Comparison of Methods of Estimating Subscale Scores for Mixed-Format Tests A Comparison of Methods of Estimating Subscale Scores for Mixed-Format Tests David Shin Pearson Educational Measurement May 007 rr0701 Using assessment and research to promote learning Pearson Educational

More information

Decision consistency and accuracy indices for the bifactor and testlet response theory models

Decision consistency and accuracy indices for the bifactor and testlet response theory models University of Iowa Iowa Research Online Theses and Dissertations Summer 2014 Decision consistency and accuracy indices for the bifactor and testlet response theory models Lee James LaFond University of

More information

On indirect measurement of health based on survey data. Responses to health related questions (items) Y 1,..,Y k A unidimensional latent health state

On indirect measurement of health based on survey data. Responses to health related questions (items) Y 1,..,Y k A unidimensional latent health state On indirect measurement of health based on survey data Responses to health related questions (items) Y 1,..,Y k A unidimensional latent health state A scaling model: P(Y 1,..,Y k ;α, ) α = item difficulties

More information

The Effects of Controlling for Distributional Differences on the Mantel-Haenszel Procedure. Daniel F. Bowen. Chapel Hill 2011

The Effects of Controlling for Distributional Differences on the Mantel-Haenszel Procedure. Daniel F. Bowen. Chapel Hill 2011 The Effects of Controlling for Distributional Differences on the Mantel-Haenszel Procedure Daniel F. Bowen A thesis submitted to the faculty of the University of North Carolina at Chapel Hill in partial

More information

Multidimensional Modeling of Learning Progression-based Vertical Scales 1

Multidimensional Modeling of Learning Progression-based Vertical Scales 1 Multidimensional Modeling of Learning Progression-based Vertical Scales 1 Nina Deng deng.nina@measuredprogress.org Louis Roussos roussos.louis@measuredprogress.org Lee LaFond leelafond74@gmail.com 1 This

More information

Linking Errors in Trend Estimation in Large-Scale Surveys: A Case Study

Linking Errors in Trend Estimation in Large-Scale Surveys: A Case Study Research Report Linking Errors in Trend Estimation in Large-Scale Surveys: A Case Study Xueli Xu Matthias von Davier April 2010 ETS RR-10-10 Listening. Learning. Leading. Linking Errors in Trend Estimation

More information

Differential Performance of Test Items by Geographical Regions. Konstantin E. Augemberg Fordham University. Deanna L. Morgan The College Board

Differential Performance of Test Items by Geographical Regions. Konstantin E. Augemberg Fordham University. Deanna L. Morgan The College Board Differential Performance of Test Items by Geographical Regions Konstantin E. Augemberg Fordham University Deanna L. Morgan The College Board Paper presented at the annual meeting of the National Council

More information

María Verónica Santelices 1 and Mark Wilson 2

María Verónica Santelices 1 and Mark Wilson 2 On the Relationship Between Differential Item Functioning and Item Difficulty: An Issue of Methods? Item Response Theory Approach to Differential Item Functioning Educational and Psychological Measurement

More information

Re-Examining the Role of Individual Differences in Educational Assessment

Re-Examining the Role of Individual Differences in Educational Assessment Re-Examining the Role of Individual Differences in Educational Assesent Rebecca Kopriva David Wiley Phoebe Winter University of Maryland College Park Paper presented at the Annual Conference of the National

More information

THE APPLICATION OF ORDINAL LOGISTIC HEIRARCHICAL LINEAR MODELING IN ITEM RESPONSE THEORY FOR THE PURPOSES OF DIFFERENTIAL ITEM FUNCTIONING DETECTION

THE APPLICATION OF ORDINAL LOGISTIC HEIRARCHICAL LINEAR MODELING IN ITEM RESPONSE THEORY FOR THE PURPOSES OF DIFFERENTIAL ITEM FUNCTIONING DETECTION THE APPLICATION OF ORDINAL LOGISTIC HEIRARCHICAL LINEAR MODELING IN ITEM RESPONSE THEORY FOR THE PURPOSES OF DIFFERENTIAL ITEM FUNCTIONING DETECTION Timothy Olsen HLM II Dr. Gagne ABSTRACT Recent advances

More information

USING MULTIDIMENSIONAL ITEM RESPONSE THEORY TO REPORT SUBSCORES ACROSS MULTIPLE TEST FORMS. Jing-Ru Xu

USING MULTIDIMENSIONAL ITEM RESPONSE THEORY TO REPORT SUBSCORES ACROSS MULTIPLE TEST FORMS. Jing-Ru Xu USING MULTIDIMENSIONAL ITEM RESPONSE THEORY TO REPORT SUBSCORES ACROSS MULTIPLE TEST FORMS By Jing-Ru Xu A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements

More information

A COMPARISON OF BAYESIAN MCMC AND MARGINAL MAXIMUM LIKELIHOOD METHODS IN ESTIMATING THE ITEM PARAMETERS FOR THE 2PL IRT MODEL

A COMPARISON OF BAYESIAN MCMC AND MARGINAL MAXIMUM LIKELIHOOD METHODS IN ESTIMATING THE ITEM PARAMETERS FOR THE 2PL IRT MODEL International Journal of Innovative Management, Information & Production ISME Internationalc2010 ISSN 2185-5439 Volume 1, Number 1, December 2010 PP. 81-89 A COMPARISON OF BAYESIAN MCMC AND MARGINAL MAXIMUM

More information

Chapter 11 Multiple Regression

Chapter 11 Multiple Regression Chapter 11 Multiple Regression PSY 295 Oswald Outline The problem An example Compensatory and Noncompensatory Models More examples Multiple correlation Chapter 11 Multiple Regression 2 Cont. Outline--cont.

More information

LOGISTIC APPROXIMATIONS OF MARGINAL TRACE LINES FOR BIFACTOR ITEM RESPONSE THEORY MODELS. Brian Dale Stucky

LOGISTIC APPROXIMATIONS OF MARGINAL TRACE LINES FOR BIFACTOR ITEM RESPONSE THEORY MODELS. Brian Dale Stucky LOGISTIC APPROXIMATIONS OF MARGINAL TRACE LINES FOR BIFACTOR ITEM RESPONSE THEORY MODELS Brian Dale Stucky A dissertation submitted to the faculty of the University of North Carolina at Chapel Hill in

More information

Connexion of Item Response Theory to Decision Making in Chess. Presented by Tamal Biswas Research Advised by Dr. Kenneth Regan

Connexion of Item Response Theory to Decision Making in Chess. Presented by Tamal Biswas Research Advised by Dr. Kenneth Regan Connexion of Item Response Theory to Decision Making in Chess Presented by Tamal Biswas Research Advised by Dr. Kenneth Regan Acknowledgement A few Slides have been taken from the following presentation

More information

Initial Report on the Calibration of Paper and Pencil Forms UCLA/CRESST August 2015

Initial Report on the Calibration of Paper and Pencil Forms UCLA/CRESST August 2015 This report describes the procedures used in obtaining parameter estimates for items appearing on the 2014-2015 Smarter Balanced Assessment Consortium (SBAC) summative paper-pencil forms. Among the items

More information

Development, Standardization and Application of

Development, Standardization and Application of American Journal of Educational Research, 2018, Vol. 6, No. 3, 238-257 Available online at http://pubs.sciepub.com/education/6/3/11 Science and Education Publishing DOI:10.12691/education-6-3-11 Development,

More information

Adaptive Testing With the Multi-Unidimensional Pairwise Preference Model Stephen Stark University of South Florida

Adaptive Testing With the Multi-Unidimensional Pairwise Preference Model Stephen Stark University of South Florida Adaptive Testing With the Multi-Unidimensional Pairwise Preference Model Stephen Stark University of South Florida and Oleksandr S. Chernyshenko University of Canterbury Presented at the New CAT Models

More information

COMBINING SCALING AND CLASSIFICATION: A PSYCHOMETRIC MODEL FOR SCALING ABILITY AND DIAGNOSING MISCONCEPTIONS LAINE P. BRADSHAW

COMBINING SCALING AND CLASSIFICATION: A PSYCHOMETRIC MODEL FOR SCALING ABILITY AND DIAGNOSING MISCONCEPTIONS LAINE P. BRADSHAW COMBINING SCALING AND CLASSIFICATION: A PSYCHOMETRIC MODEL FOR SCALING ABILITY AND DIAGNOSING MISCONCEPTIONS by LAINE P. BRADSHAW (Under the Direction of Jonathan Templin and Karen Samuelsen) ABSTRACT

More information

EFFECTS OF OUTLIER ITEM PARAMETERS ON IRT CHARACTERISTIC CURVE LINKING METHODS UNDER THE COMMON-ITEM NONEQUIVALENT GROUPS DESIGN

EFFECTS OF OUTLIER ITEM PARAMETERS ON IRT CHARACTERISTIC CURVE LINKING METHODS UNDER THE COMMON-ITEM NONEQUIVALENT GROUPS DESIGN EFFECTS OF OUTLIER ITEM PARAMETERS ON IRT CHARACTERISTIC CURVE LINKING METHODS UNDER THE COMMON-ITEM NONEQUIVALENT GROUPS DESIGN By FRANCISCO ANDRES JIMENEZ A THESIS PRESENTED TO THE GRADUATE SCHOOL OF

More information

ABERRANT RESPONSE PATTERNS AS A MULTIDIMENSIONAL PHENOMENON: USING FACTOR-ANALYTIC MODEL COMPARISON TO DETECT CHEATING. John Michael Clark III

ABERRANT RESPONSE PATTERNS AS A MULTIDIMENSIONAL PHENOMENON: USING FACTOR-ANALYTIC MODEL COMPARISON TO DETECT CHEATING. John Michael Clark III ABERRANT RESPONSE PATTERNS AS A MULTIDIMENSIONAL PHENOMENON: USING FACTOR-ANALYTIC MODEL COMPARISON TO DETECT CHEATING BY John Michael Clark III Submitted to the graduate degree program in Psychology and

More information

Academic Discipline DIF in an English Language Proficiency Test

Academic Discipline DIF in an English Language Proficiency Test Journal of English Language Teaching and Learning Year 5, No.7 Academic Discipline DIF in an English Language Proficiency Test Seyyed Mohammad Alavi Associate Professor of TEFL, University of Tehran Abbas

More information

Investigating the Invariance of Person Parameter Estimates Based on Classical Test and Item Response Theories

Investigating the Invariance of Person Parameter Estimates Based on Classical Test and Item Response Theories Kamla-Raj 010 Int J Edu Sci, (): 107-113 (010) Investigating the Invariance of Person Parameter Estimates Based on Classical Test and Item Response Theories O.O. Adedoyin Department of Educational Foundations,

More information

Analyzing Teacher Professional Standards as Latent Factors of Assessment Data: The Case of Teacher Test-English in Saudi Arabia

Analyzing Teacher Professional Standards as Latent Factors of Assessment Data: The Case of Teacher Test-English in Saudi Arabia Analyzing Teacher Professional Standards as Latent Factors of Assessment Data: The Case of Teacher Test-English in Saudi Arabia 1 Introduction The Teacher Test-English (TT-E) is administered by the NCA

More information

Comparing DIF methods for data with dual dependency

Comparing DIF methods for data with dual dependency DOI 10.1186/s40536-016-0033-3 METHODOLOGY Open Access Comparing DIF methods for data with dual dependency Ying Jin 1* and Minsoo Kang 2 *Correspondence: ying.jin@mtsu.edu 1 Department of Psychology, Middle

More information

Item-Rest Regressions, Item Response Functions, and the Relation Between Test Forms

Item-Rest Regressions, Item Response Functions, and the Relation Between Test Forms Item-Rest Regressions, Item Response Functions, and the Relation Between Test Forms Dato N. M. de Gruijter University of Leiden John H. A. L. de Jong Dutch Institute for Educational Measurement (CITO)

More information

linking in educational measurement: Taking differential motivation into account 1

linking in educational measurement: Taking differential motivation into account 1 Selecting a data collection design for linking in educational measurement: Taking differential motivation into account 1 Abstract In educational measurement, multiple test forms are often constructed to

More information

SESUG '98 Proceedings

SESUG '98 Proceedings Generating Item Responses Based on Multidimensional Item Response Theory Jeffrey D. Kromrey, Cynthia G. Parshall, Walter M. Chason, and Qing Yi University of South Florida ABSTRACT The purpose of this

More information

An Alternative to the Trend Scoring Method for Adjusting Scoring Shifts. in Mixed-Format Tests. Xuan Tan. Sooyeon Kim. Insu Paek.

An Alternative to the Trend Scoring Method for Adjusting Scoring Shifts. in Mixed-Format Tests. Xuan Tan. Sooyeon Kim. Insu Paek. An Alternative to the Trend Scoring Method for Adjusting Scoring Shifts in Mixed-Format Tests Xuan Tan Sooyeon Kim Insu Paek Bihua Xiang ETS, Princeton, NJ Paper presented at the annual meeting of the

More information

Thank You Acknowledgments

Thank You Acknowledgments Psychometric Methods For Investigating Potential Item And Scale/Test Bias Bruno D. Zumbo, Ph.D. Professor University of British Columbia Vancouver, Canada Presented at Carleton University, Ottawa, Canada

More information

Math 124: Module 2, Part II

Math 124: Module 2, Part II , Part II David Meredith Department of Mathematics San Francisco State University September 15, 2009 What we will do today 1 Explanatory and Response Variables When you study the relationship between two

More information

Fighting Bias with Statistics: Detecting Gender Differences in Responses on Items on a Preschool Science Assessment

Fighting Bias with Statistics: Detecting Gender Differences in Responses on Items on a Preschool Science Assessment University of Miami Scholarly Repository Open Access Dissertations Electronic Theses and Dissertations 2010-08-06 Fighting Bias with Statistics: Detecting Gender Differences in Responses on Items on a

More information

MCAS Equating Research Report: An Investigation of FCIP-1, FCIP-2, and Stocking and. Lord Equating Methods 1,2

MCAS Equating Research Report: An Investigation of FCIP-1, FCIP-2, and Stocking and. Lord Equating Methods 1,2 MCAS Equating Research Report: An Investigation of FCIP-1, FCIP-2, and Stocking and Lord Equating Methods 1,2 Lisa A. Keller, Ronald K. Hambleton, Pauline Parker, Jenna Copella University of Massachusetts

More information

Blending Psychometrics with Bayesian Inference Networks: Measuring Hundreds of Latent Variables Simultaneously

Blending Psychometrics with Bayesian Inference Networks: Measuring Hundreds of Latent Variables Simultaneously Blending Psychometrics with Bayesian Inference Networks: Measuring Hundreds of Latent Variables Simultaneously Jonathan Templin Department of Educational Psychology Achievement and Assessment Institute

More information

Does factor indeterminacy matter in multi-dimensional item response theory?

Does factor indeterminacy matter in multi-dimensional item response theory? ABSTRACT Paper 957-2017 Does factor indeterminacy matter in multi-dimensional item response theory? Chong Ho Yu, Ph.D., Azusa Pacific University This paper aims to illustrate proper applications of multi-dimensional

More information

A Comparison of Several Goodness-of-Fit Statistics

A Comparison of Several Goodness-of-Fit Statistics A Comparison of Several Goodness-of-Fit Statistics Robert L. McKinley The University of Toledo Craig N. Mills Educational Testing Service A study was conducted to evaluate four goodnessof-fit procedures

More information

Assessing the item response theory with covariate (IRT-C) procedure for ascertaining. differential item functioning. Louis Tay

Assessing the item response theory with covariate (IRT-C) procedure for ascertaining. differential item functioning. Louis Tay ASSESSING DIF WITH IRT-C 1 Running head: ASSESSING DIF WITH IRT-C Assessing the item response theory with covariate (IRT-C) procedure for ascertaining differential item functioning Louis Tay University

More information

Scaling TOWES and Linking to IALS

Scaling TOWES and Linking to IALS Scaling TOWES and Linking to IALS Kentaro Yamamoto and Irwin Kirsch March, 2002 In 2000, the Organization for Economic Cooperation and Development (OECD) along with Statistics Canada released Literacy

More information

UCLA UCLA Electronic Theses and Dissertations

UCLA UCLA Electronic Theses and Dissertations UCLA UCLA Electronic Theses and Dissertations Title Detection of Differential Item Functioning in the Generalized Full-Information Item Bifactor Analysis Model Permalink https://escholarship.org/uc/item/3xd6z01r

More information

Bayesian Tailored Testing and the Influence

Bayesian Tailored Testing and the Influence Bayesian Tailored Testing and the Influence of Item Bank Characteristics Carl J. Jensema Gallaudet College Owen s (1969) Bayesian tailored testing method is introduced along with a brief review of its

More information

Constrained Multidimensional Adaptive Testing without intermixing items from different dimensions

Constrained Multidimensional Adaptive Testing without intermixing items from different dimensions Psychological Test and Assessment Modeling, Volume 56, 2014 (4), 348-367 Constrained Multidimensional Adaptive Testing without intermixing items from different dimensions Ulf Kroehne 1, Frank Goldhammer

More information

On Test Scores (Part 2) How to Properly Use Test Scores in Secondary Analyses. Structural Equation Modeling Lecture #12 April 29, 2015

On Test Scores (Part 2) How to Properly Use Test Scores in Secondary Analyses. Structural Equation Modeling Lecture #12 April 29, 2015 On Test Scores (Part 2) How to Properly Use Test Scores in Secondary Analyses Structural Equation Modeling Lecture #12 April 29, 2015 PRE 906, SEM: On Test Scores #2--The Proper Use of Scores Today s Class:

More information

Using Analytical and Psychometric Tools in Medium- and High-Stakes Environments

Using Analytical and Psychometric Tools in Medium- and High-Stakes Environments Using Analytical and Psychometric Tools in Medium- and High-Stakes Environments Greg Pope, Analytics and Psychometrics Manager 2008 Users Conference San Antonio Introduction and purpose of this session

More information

THE DEVELOPMENT AND VALIDATION OF EFFECT SIZE MEASURES FOR IRT AND CFA STUDIES OF MEASUREMENT EQUIVALENCE CHRISTOPHER DAVID NYE DISSERTATION

THE DEVELOPMENT AND VALIDATION OF EFFECT SIZE MEASURES FOR IRT AND CFA STUDIES OF MEASUREMENT EQUIVALENCE CHRISTOPHER DAVID NYE DISSERTATION THE DEVELOPMENT AND VALIDATION OF EFFECT SIZE MEASURES FOR IRT AND CFA STUDIES OF MEASUREMENT EQUIVALENCE BY CHRISTOPHER DAVID NYE DISSERTATION Submitted in partial fulfillment of the requirements for

More information

Brent Duckor Ph.D. (SJSU) Kip Tellez, Ph.D. (UCSC) BEAR Seminar April 22, 2014

Brent Duckor Ph.D. (SJSU) Kip Tellez, Ph.D. (UCSC) BEAR Seminar April 22, 2014 Brent Duckor Ph.D. (SJSU) Kip Tellez, Ph.D. (UCSC) BEAR Seminar April 22, 2014 Studies under review ELA event Mathematics event Duckor, B., Castellano, K., Téllez, K., & Wilson, M. (2013, April). Validating

More information