Having your cake and eating it too: multiple dimensions and a composite
1 Having your cake and eating it too: multiple dimensions and a composite. Perman Gochyyev and Mark Wilson, UC Berkeley. BEAR Seminar, October 2018.
2 Outline
- Motivating example
- Different modeling approaches
- Composite model
- Reliability
- Plausible values
- Empirical example
3 Micro- and macro-level
- individual dimensions
- summative combination of those multiple dimensions: the composite
- three main modeling options:
  - the uni- and multidimensional pair of models
  - the bifactor model
  - the higher-order model
4 Micro- and macro-level
- Mathematics Achievement: Algebra, Geometry, Statistics
- Administrators: What is the mathematics achievement of students?
- Teachers: Which topic needs closer attention?
5 Classical Test Theory
6 Item Response Theory
7 Bifactor model
- a serious limitation for interpretation
- for this context, not useful for practitioners
8 Bifactor model
"Perhaps the methodologists who are promoting this model know some secret unknown to the authors, but we have no conceptualization what such things ('Algebra uncorrelated with Mathematics Achievement', 'Geometry uncorrelated with Mathematics Achievement' and 'Statistics uncorrelated with Mathematics Achievement') might be, and/or how they could be interpreted." (Wilson & Gochyyev, forthcoming, p. 7)
9 Second-order (higher-order) model
- the lower-order estimates are a linear function of the higher-order estimate
- if the relationship is linear: each person has only one estimate (the higher-order one), and the lower-order ones are all determined by that
10 Composite model
Assumptions:
- The sub-test level (the "parts") is the main focus for measurement
- The sum-total level (the "whole") is needed for other pragmatic uses
Two parts:
1. a multidimensional model for the sub-tests
2. a predictive model for a composite of the latent variables based on each sub-test
11
12 Composite model: hybrid of two measurement traditions
- reflective measurement (the dominant trend): the latent variable is seen as being the source of the responses to the items
- formative measurement: the items are seen as being the source of the general variable
13 Composite model
- Howell, Breivik & Wilcox (2007, p. 205): "formative measurement is not an equally attractive alternative to reflective measurement and that whenever possible, in developing new measures or choosing among alternative existing measures, researchers should opt for reflective measurement."
- we agree
- the key: which level of the measurement should be optimized?
- in the educational context, the level of the sub-tests should be optimized: reflective measurement at the sub-test level
14 Estimation
15 Weighting schemes
- Weighting by the number of items ("item-frequency weighting")
  - not ideal: confounded by design-related decisions
  - implicitly encoded in the unidimensional modeling approach
- Reliability weighting: the more reliable the score for a dimension, the higher the weight it gets
  - affected by the number of items for that dimension
16 Weighting schemes
- Weighting by mean item difficulty ("item-difficulty weighting")
  - if a dimension's items are more difficult, that dimension should have a higher weight in the composite
  - one should use either the proportion correct or the IRT difficulties obtained from the unidimensional model
  - if the dimension-specific difficulty means differ substantially, this may hint at possible design flaws
  - as good practice in instrument design, one should aim to have the items from each dimension span the ability continuum
17 Weighting schemes
- Weighting by intended use ("consequential weighting")
  - not all strands are created equal
  - depending on the grade level, some topics or content areas dominate the school year compared to others
  - adjusting the weights accordingly, by giving more weight to topics that are covered more, might be useful for one important reason: reflecting in the test the apparent amount of a topic in the curriculum
  - particularly relevant in educational achievement testing
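As a rough illustration of the weighting schemes on these slides, the sketch below normalizes raw weights so they sum to one and forms a weighted composite. The item counts are the ones reported for the ADM example in this talk; the reliabilities and person estimates are hypothetical values made up for the example, and `normalize` and `composite_score` are illustrative helper names, not part of any package used by the authors.

```python
import numpy as np

def normalize(raw):
    """Scale non-negative raw weights so that they sum to 1."""
    raw = np.asarray(raw, dtype=float)
    return raw / raw.sum()

def composite_score(theta, weights):
    """Weighted sum of dimension-level estimates (rows = persons)."""
    return np.asarray(theta, dtype=float) @ np.asarray(weights, dtype=float)

dimensions = ["DAD", "COS", "CHA", "MOV"]

# Item counts per dimension, as reported for the ADM example.
n_items = np.array([11, 8, 3, 3])

# HYPOTHETICAL dimension reliabilities, for illustration only.
reliability = np.array([0.80, 0.75, 0.60, 0.55])

w_equal = normalize(np.ones(len(dimensions)))   # equal weights
w_item_frequency = normalize(n_items)           # item-frequency weighting
w_reliability = normalize(reliability)          # reliability weighting

# HYPOTHETICAL dimension-level estimates (logits) for two persons.
theta = np.array([
    [0.5, 0.2, -0.1, 0.3],
    [-0.4, 0.0, 0.6, -0.2],
])

for name, w in [("equal", w_equal),
                ("item-frequency", w_item_frequency),
                ("reliability", w_reliability)]:
    print(name, np.round(w, 3), np.round(composite_score(theta, w), 3))
```

Consequential weights would be supplied by judgment (for example, curriculum coverage) rather than computed from the data, so they would simply replace the raw weights passed to `normalize`.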
18 Common scale across dimensions
- often overlooked, regardless of how insensible it sounds
- justifies the combination of these dimensions into a single summary score (the composite score)
- option 1: construction of composite scores after aligning the different dimensions
- option 2: implement this alignment within the estimation routine itself; the dimensions will be forced into a common metric
19 Reliability of the composite
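20 EAP reliability
- EAP: the mean of the posterior distribution
- The variance of the posterior is used to represent uncertainty
- Mislevy, Beaton, Kaplan & Sheehan (1992): reliability can be viewed as the amount by which the measurement process has reduced uncertainty in the prediction of each individual's ability

$$R_{EAP} = 1 - \frac{\bar{\sigma}_p^2}{\sigma_\theta^2} = \frac{\mathrm{var}(\mathrm{EAP})}{\sigma_\theta^2}$$

where $\bar{\sigma}_p^2$ is the average posterior variance and $\sigma_\theta^2$ is the variance of the latent variable.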
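The EAP reliability described on this slide can be sketched numerically as follows. This is a minimal illustration, not the authors' estimation code: it assumes each person's posterior mean (EAP) and posterior variance are already available, and it estimates the latent variance as the variance of the EAPs plus the mean posterior variance (an assumption; the model-estimated latent variance could be used instead). The function name `eap_reliability` is hypothetical.

```python
import numpy as np

def eap_reliability(eap, posterior_variance):
    """Share of latent variance retained by the EAP estimates.

    eap                : posterior means, one per person
    posterior_variance : posterior variances, one per person
    """
    eap = np.asarray(eap, dtype=float)
    posterior_variance = np.asarray(posterior_variance, dtype=float)

    var_eap = eap.var()                       # variance of the EAP estimates
    mean_post_var = posterior_variance.mean() # average remaining uncertainty

    # ASSUMPTION: latent variance = var(EAP) + mean posterior variance.
    latent_variance = var_eap + mean_post_var

    # Equivalent to 1 - mean_post_var / latent_variance.
    return var_eap / latent_variance

# Tiny hand-checkable example: var(EAP) = 2/3, mean posterior variance = 1/3.
r = eap_reliability([-1.0, 0.0, 1.0], [1 / 3, 1 / 3, 1 / 3])
print(r)
```

In this toy case the posterior removes two thirds of the total uncertainty about ability, which is the sense of "reduction in uncertainty" quoted on the slide.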
21 Variance and reliability for the composite
To construct this model-based variance estimate for the composite, we use plausible values (PVs; Mislevy et al., 1992):
(1) randomly generate 5 PVs for each person and for each dimension
(2) obtain the composite score resulting from each draw (using the weights)
(3) estimate the variance of each of the 5 composite distributions
(4) average the variance across the five draws
To obtain the EAP reliability, divide the observed variance of the composite (obtained from the dimension-specific EAP scores) by the variance obtained from the above steps.
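The four steps and the final ratio on this slide can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the EAPs and plausible values are simulated here (independent dimensions, normal posteriors with a common variance of 0.2), whereas in practice they would come from the fitted multidimensional model. The function name and the array layout are hypothetical choices.

```python
import numpy as np

def composite_eap_reliability(eap, pvs, weights):
    """EAP reliability of a weighted composite.

    eap     : array (n_persons, n_dims) of dimension-specific EAP scores
    pvs     : array (n_draws, n_persons, n_dims) of plausible values
    weights : array (n_dims,) of composite weights
    """
    eap = np.asarray(eap, dtype=float)
    pvs = np.asarray(pvs, dtype=float)
    w = np.asarray(weights, dtype=float)

    composite_eap = eap @ w   # observed composite, shape (n_persons,)
    composite_pvs = pvs @ w   # step 2: one composite per draw, (n_draws, n_persons)

    # Steps 3 and 4: variance within each draw, then the average over draws.
    pv_variance = composite_pvs.var(axis=1, ddof=1).mean()

    # Ratio of observed (EAP-based) variance to model-based (PV-based) variance.
    return composite_eap.var(ddof=1) / pv_variance

# SIMULATED inputs, for illustration only.
rng = np.random.default_rng(0)
n_persons, n_dims, n_draws = 2000, 4, 5
eap = rng.normal(0.0, np.sqrt(0.8), size=(n_persons, n_dims))
# Step 1: five plausible values per person and dimension.
pvs = eap + rng.normal(0.0, np.sqrt(0.2), size=(n_draws, n_persons, n_dims))

weights = np.full(n_dims, 1.0 / n_dims)  # equal weights
reliability = composite_eap_reliability(eap, pvs, weights)
print(round(reliability, 2))
```

With these simulation settings the ratio should land near 0.8, because the composite EAP variance is about 0.8 of the composite PV variance by construction.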
22 Alternative reliability for the composite
- Reliability coefficient (Spearman, 1910): "The correlation between one half and the other half of several measures of the same thing"
- classical formulation of reliability: the correlation between two random measurements of the composite
- using PVs as above, obtain the correlations between each pair of the 5 composite distributions, and calculate the mean over the 10 possible pairings (i.e., 5!/(3! 2!) = 10)
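The pairwise-correlation version of the composite reliability can be sketched in the same way. As before, the plausible values are simulated purely for illustration (independent dimensions, a common posterior variance), and the function name is a hypothetical choice rather than anything from the authors' software.

```python
from itertools import combinations

import numpy as np

def pv_correlation_reliability(pvs, weights):
    """Mean correlation between all pairs of composite PV distributions.

    pvs     : array (n_draws, n_persons, n_dims) of plausible values
    weights : array (n_dims,) of composite weights
    Returns (mean correlation, number of pairs).
    """
    composite_pvs = np.asarray(pvs, dtype=float) @ np.asarray(weights, dtype=float)
    pairs = list(combinations(range(composite_pvs.shape[0]), 2))
    correlations = [
        np.corrcoef(composite_pvs[i], composite_pvs[j])[0, 1]
        for i, j in pairs
    ]
    return float(np.mean(correlations)), len(pairs)

# SIMULATED plausible values, for illustration only.
rng = np.random.default_rng(1)
n_persons, n_dims, n_draws = 2000, 4, 5
posterior_mean = rng.normal(0.0, np.sqrt(0.8), size=(n_persons, n_dims))
pvs = posterior_mean + rng.normal(
    0.0, np.sqrt(0.2), size=(n_draws, n_persons, n_dims)
)

weights = np.full(n_dims, 1.0 / n_dims)
mean_r, n_pairs = pv_correlation_reliability(pvs, weights)
print(n_pairs, round(mean_r, 2))
```

Each pair of draws plays the role of "two random measurements of the composite", so five draws give the 10 pairings mentioned on the slide.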
23 Example: ADM
- Data Modeling curriculum designed to improve middle school students' statistical reasoning
- schools were randomly assigned to treatment or control
- pre- and post-test; we used data from the post-test
- five sub-dimensions (domains): Data Display (DAD), Models of Variability (MOV), Chance (CHA), Concepts of Statistics (COS), Informal Inference (INI)
- due to the very high correlation between the DAD and INI dimensions, we combined these two dimensions
- 25 items: DAD (11); COS (8); CHA (3); MOV (3)
24 Example: multidimensional Rasch model
- unidimensional Rasch: variance: (0.024); EAP reliability of 0.89; Cronbach's alpha of 0.87
25 Example: multidimensional Rasch model
26 Example: naïve correlations
- overestimated due to the correlated bivariate priors used when computing the EAP estimates
- EAP estimates are shrunken towards each other, and the amount of shrinkage depends (inversely) on their reliabilities
27 Example: Bifactor model the latent variable correlation between the common and the unidimensional latent variable is estimated at calculated using plausible values for the unidimensional latent variable, and using the reliability of the common factor to correct for the overestimation of the EAP correlations
28 Example: Bifactor model naïve correlations
29 Example: Second-order model the latent variable correlation between the common and the unidimensional latent variable is estimated at (calculated using plausible values for the unidimensional latent variable, and using the reliability of the overall factor to correct for the overestimation)
30 Example: Second-order model Correlations between latent variables Naïve correlations (between EAP estimates)
31 Example: Composite model with equal weights The latent variable correlation between the composite and the unidimensional latent variable: 0.84
32 Example: Composite model with reliability weights The latent variable correlation between the composite and the unidimensional latent variable: 0.85
33 Conclusion
- inherently multidimensional contexts (the "parts") nevertheless also include a certain level of interest in the overarching combination of those multiple dimensions (the "whole")
- using the uni- and multidimensional pair of modeling techniques can give both perspectives
- to bring them together under a single analytic umbrella, the composite model offers some very useful advantages
- we see it as being readily useful quite broadly to address a very long-standing measurement problem
34 thank you questions?
Blending Psychometrics with Bayesian Inference Networks: Measuring Hundreds of Latent Variables Simultaneously Jonathan Templin Department of Educational Psychology Achievement and Assessment Institute
More informationTech Talk: Using the Lafayette ESS Report Generator
Raymond Nelson Included in LXSoftware is a fully featured manual score sheet that can be used with any validated comparison question test format. Included in the manual score sheet utility of LXSoftware
More informationOn the Targets of Latent Variable Model Estimation
On the Targets of Latent Variable Model Estimation Karen Bandeen-Roche Department of Biostatistics Johns Hopkins University Department of Mathematics and Statistics Miami University December 8, 2005 With
More informationMeta-Analysis. Zifei Liu. Biological and Agricultural Engineering
Meta-Analysis Zifei Liu What is a meta-analysis; why perform a metaanalysis? How a meta-analysis work some basic concepts and principles Steps of Meta-analysis Cautions on meta-analysis 2 What is Meta-analysis
More informationHow Does Analysis of Competing Hypotheses (ACH) Improve Intelligence Analysis?
How Does Analysis of Competing Hypotheses (ACH) Improve Intelligence Analysis? Richards J. Heuer, Jr. Version 1.2, October 16, 2005 This document is from a collection of works by Richards J. Heuer, Jr.
More informationThe Regression-Discontinuity Design
Page 1 of 10 Home» Design» Quasi-Experimental Design» The Regression-Discontinuity Design The regression-discontinuity design. What a terrible name! In everyday language both parts of the term have connotations
More informationTECHNICAL REPORT. The Added Value of Multidimensional IRT Models. Robert D. Gibbons, Jason C. Immekus, and R. Darrell Bock
1 TECHNICAL REPORT The Added Value of Multidimensional IRT Models Robert D. Gibbons, Jason C. Immekus, and R. Darrell Bock Center for Health Statistics, University of Illinois at Chicago Corresponding
More informationcommentary Time is a jailer: what do alpha and its alternatives tell us about reliability?
commentary Time is a jailer: what do alpha and its alternatives tell us about reliability? Rik Psychologists do not have it easy, but the article by Peters Maastricht University (2014) paves the way for
More informationDaniel Boduszek University of Huddersfield
Daniel Boduszek University of Huddersfield d.boduszek@hud.ac.uk Introduction to Correlation SPSS procedure for Pearson r Interpretation of SPSS output Presenting results Partial Correlation Correlation
More informationSWITCH Trial. A Sequential Multiple Adaptive Randomization Trial
SWITCH Trial A Sequential Multiple Adaptive Randomization Trial Background Cigarette Smoking (CDC Estimates) Morbidity Smoking caused diseases >16 million Americans (1 in 30) Mortality 480,000 deaths per
More informationCFPB Financial Well-Being Scale
May 2017 CFPB Financial Well-Being Scale Scale development technical report Table of contents Table of contents... 1 1. Introduction... 3 2. Defining financial well-being... 6 3. Overview of a typical
More informationSmiley Faces: Scales Measurement for Children Assessment
Smiley Faces: Scales Measurement for Children Assessment Wan Ahmad Jaafar Wan Yahaya and Sobihatun Nur Abdul Salam Universiti Sains Malaysia and Universiti Utara Malaysia wajwy@usm.my, sobihatun@uum.edu.my
More information