Impact and adjustment of selection bias in the assessment of measurement equivalence
Thomas Klausch, Joop Hox, & Barry Schouten

Working Paper, Utrecht, December 2012

Corresponding author: Thomas Klausch, Utrecht University, Faculty for Social and Behavioural Sciences, Department of Methods and Statistics, PO Box 80140, 3508 TC Utrecht, The Netherlands.

Abstract

Selection bias is a threat to valid causal inference in designs with incomplete randomization, e.g. in observational studies and quasi-experiments. When measurement models, such as CFA or IRT, need to be assessed for equivalence, selection bias might lead analysts to draw wrong conclusions. Whether this threat is real and how to adjust for it is assessed in the present study by means of a Monte-Carlo simulation. Selection bias between a treatment and a control group was simulated, where measurement non-equivalence was introduced on qualitative covariates that were causally related to the assignment mechanism. Our results indicate that unadjusted tests falsely reject the hypothesis of measurement equivalence using RMSEA and CFI fit criteria. Inverse propensity score weighting performed best in adjustment, whereas simple ANCOVA adjustment on the latent factor proved insufficient in removing all selectivity in the treatment assignment.
1. Introduction

Latent variables are important quantities in social research, assisting researchers in measuring concepts that cannot be surveyed by single direct questions alone. Measurement models, such as confirmatory factor analysis (CFA) or item response theory (IRT), are used in the estimation of latent variables and additionally help to control for measurement error in the observable indicators (e.g. Bollen, 1989; Alwin, 2007). These indicators are typically available from multiple-item scales. Multiple-group models can be used to assess construct equivalence (hereafter: equivalence) across study groups about which analysts wish to draw inferences or make comparisons, for example population strata defined by gender, age, nationality or race. The methods necessary to assess such questions have been well developed and documented (Steenkamp & Baumgartner, 1998; Vandenberg & Lance, 2000; Millsap, 2011).

The focus of interest does not always lie, however, in comparing naturally occurring population strata. When researchers seek to draw causal inferences about latent constructs in experimental designs, or about the effect of an experimentally manipulated factor on other parameters of the measurement model, it needs to be assessed whether measurement instruments are invariant across experimental groups. If randomization was successful, these analyses will be unbiased. In many situations, however, full randomization of subjects to treatment ("intervention") and control groups is not possible, in particular in observational and quasi-experimental studies (Rosenbaum, 2002; Morgan & Winship, 2007). In such situations selection bias is known to occur, threatening valid causal inference about equivalence in measurement models. Although selection bias and the methods to adjust for it have received a lot of attention in the literature (cf. Schafer & Kang, 2008), the effect of selection bias in latent variable models and on equivalence assessment has, to the authors' knowledge, not been discussed.
Furthermore, adjusting for selection bias in latent variable models has received little systematic attention in the literature. The present study addresses this research gap. Using a Monte-Carlo simulation, we illustrate the effects that selection bias can have in categorical CFA measurement models (also known as polytomous IRT models) when testing measurement equivalence (Muthén & Asparouhov, 2002; Millsap, 2011). This class of models is appropriate when questions using Likert scales with a small number of answer categories need to be scaled, a situation close to research practice. Subsequently, we compare the performance of methods to adjust for selection bias. In particular, we consider three popular methods used to adjust for selection bias when directly observed outcome variables are studied: ANCOVA adjustment on covariates as suggested by Sörbom (1978), exact stratification (Rosenbaum, 2002), and propensity score weighting (Rosenbaum & Rubin, 1983).
This paper is structured as follows. First, we present the CFA model used in the simulation and discuss how selection bias is introduced. Second, we discuss how to adjust for selection bias. Finally, results are presented and discussed.

2. Simulating Selection Bias in an Ordered Categorical CFA Model

2.1 The CFA Model

In the simulation we consider the ordered categorical factor model with, for simplicity, one factor and four indicators (cf. Muthén & Asparouhov, 2002; Millsap, 2011):

$$X_j^* = \nu_j + \lambda_j W + \varepsilon_j \tag{1}$$

We model $j = 1, \dots, 4$ latent response variables $X_j^*$ by variable-specific intercepts $\nu_j$, a source $W$ of common variance ("true" or latent scores) with loading $\lambda_j$, and a random error $\varepsilon_j$ with variance $\theta_j$. For identification, however, we fix $\nu_j = 0$ for all $j$. We further set $W \sim N(0, 1)$. Unit variance of $W$ leads to the definition of the reliability of measure $j$:

$$\mathrm{Rel}(X_j^*) = \frac{\lambda_j^2}{\lambda_j^2 + \theta_j} = \lambda_j^2 \tag{2}$$

The second equality follows after standardizing by $\mathrm{Var}(X_j^*) = 1$. Accordingly, error variances depend on $\lambda_j$:

$$\theta_j = 1 - \lambda_j^2 \tag{3}$$

The $X_j^*$ cannot be observed directly, but are mapped without error onto the observed ordered categorical indicator $X_j$, with $C = 4$ defining five categories ($c = 0, \dots, C$, where $\tau_{j,0} = -\infty$ and $\tau_{j,C+1} = +\infty$), by the mapping function:

$$P(X_j = c) = P\left(\tau_{j,c} < X_j^* \le \tau_{j,c+1}\right) \tag{4}$$

where the $\tau_{j,c}$ denote the threshold parameters for the latent response variable $j$.

2.2 Introducing Covariates as Causes of Selection Bias

We introduce selection bias by two multinomial stratification variables, $S_1$ and $S_2$, dividing the sample space into 3 x 2 strata (population proportions are {.3, .3, .4} and {.5, .5}, respectively). In practice many more variables might be available to the analyst, but the results are expected to generalize to such situations. Note that, at this point, assignment to treatment and control groups has not yet taken place. However, in observational studies the probability of assignment is affected by the levels of covariates, which are represented here by $S_1$ and $S_2$.
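The measurement model of Section 2.1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' R code: the function name `simulate_items`, the loadings and the threshold values used in the usage note are hypothetical, and the thresholds are taken to be the same for all four items.

```python
import random
from bisect import bisect_left


def simulate_items(n, loadings, thresholds, seed=0):
    """Simulate n respondents on ordered-categorical items from a one-factor model.

    loadings   -- one loading (lambda_j) per item; hypothetical values
    thresholds -- increasing list tau_1 < ... < tau_C, shared by all items (assumption)
    Returns a list of rows; each row holds one observed category (0..C) per item.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        w = rng.gauss(0.0, 1.0)                      # latent score W ~ N(0, 1)
        row = []
        for lam in loadings:
            theta = 1.0 - lam ** 2                   # error variance, equation (3)
            x_star = lam * w + rng.gauss(0.0, theta ** 0.5)  # equation (1), intercept fixed at 0
            # Observed category = number of thresholds strictly below x_star,
            # i.e. tau_c < x_star <= tau_{c+1} as in equation (4).
            row.append(bisect_left(thresholds, x_star))
        data.append(row)
    return data
```

For instance, `simulate_items(3000, [0.8, 0.8, 0.8, 0.8], [-1.5, -0.5, 0.5, 1.5], seed=1)` returns 3000 rows of four responses coded 0 to 4. Because each latent response has unit variance by construction, the thresholds can be read directly on the standard normal scale.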
In the simulation we vary reliability and threshold parameters across these two stratification variables. In this way we account for the fact that, in reality, it is not the experimental factor that causes measurement non-equivalence, but rather the underlying characteristics S. For example, S might be the nationality of subjects, but S is not equally distributed across treatment and control due to selectivity. This imbalance will be introduced below. First, let membership in stratum combination $S_1 = k$ and $S_2 = l$ have a fixed effect on response reliability:

$$\left(\lambda_j^{(k,l)}\right)^2 = \lambda_j^2 + \lambda_{j,k}^2 + \lambda_{j,l}^2 \le 1 \tag{5}$$

as well as on the threshold parameters:

$$\tau_{j,c}^{(k,l)} = \tau_{j,c} + \tau_{j,k,c} + \tau_{j,l,c} \tag{6}$$

The order implied by equation (4) must still hold for the $\tau_{j,c}^{(k,l)}$. See Tables 1 and 2 for an overview of the exact parameterization used, which introduces measurement non-equivalence between all strata. Equations (5) and (6) generalize model (1) to a multi-group CFA model:

$$X_j^{*(k,l)} = \lambda_j^{(k,l)} W + \varepsilon_j^{(k,l)} \tag{7}$$

$$P(X_j = c \mid S_1 = k, S_2 = l) = P\left(\tau_{j,c}^{(k,l)} < X_j^* \le \tau_{j,c+1}^{(k,l)}\right) \tag{8}$$

where we additionally assume that $W$ is independent of $S_1$ and $S_2$.

Table 1: Differential item functioning (reliabilities) in sub-groups of $S_1$ and $S_2$
Columns: $j$, $\lambda_j^2$, $\lambda_{j,k=1}^2$, $\lambda_{j,k=2}^2$, $\lambda_{j,k=3}^2$, $\lambda_{j,l=1}^2$, $\lambda_{j,l=2}^2$

Table 2: Differential item functioning (thresholds) in sub-groups of $S_1$ and $S_2$
Columns: $c$, $\tau_{j,c}$, $\tau_{j,k=1,c}$, $\tau_{j,k=2,c}$, $\tau_{j,k=3,c}$, $\tau_{j,l=1,c}$, $\tau_{j,l=2,c}$
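As a rough sketch of how equations (5) and (6) combine baseline parameters with stratum effects, the following Python helper computes the loading, error variance and thresholds for one item in one stratum combination and checks the two constraints stated in the text. The function name `stratum_parameters` and all numeric values in the usage note are hypothetical; they are not the values of Tables 1 and 2.

```python
def stratum_parameters(rel_base, rel_k, rel_l, tau_base, tau_k, tau_l):
    """Return (loading, error variance, thresholds) for one item in stratum (k, l).

    rel_base, rel_k, rel_l -- baseline reliability and additive stratum effects, eq. (5)
    tau_base, tau_k, tau_l -- baseline thresholds and additive stratum effects, eq. (6)
    All inputs are illustrative placeholders, not the paper's parameter values.
    """
    reliability = rel_base + rel_k + rel_l
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("stratum reliability must lie in [0, 1]")

    thresholds = [t + dk + dl for t, dk, dl in zip(tau_base, tau_k, tau_l)]
    # The ordering required by the mapping function (4) must survive the shifts.
    if any(lower >= upper for lower, upper in zip(thresholds, thresholds[1:])):
        raise ValueError("shifted thresholds are no longer strictly increasing")

    loading = reliability ** 0.5          # reliability equals the squared loading, eq. (2)
    error_variance = 1.0 - reliability    # eq. (3) applied within the stratum
    return loading, error_variance, thresholds
```

With a baseline reliability of .60, effects of +.10 and -.20, baseline thresholds (-1.5, -0.5, 0.5, 1.5) and small threshold shifts, the helper returns a stratum reliability of .50 and a shifted but still ordered set of thresholds; a shift that reverses two thresholds raises an error instead of silently producing an invalid model.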
For example, if $S_2$ is a nationality indicator, we would assume that the reliability of answers given by people with nationality l = a is smaller by .20 than the reliability of answers provided by subjects with nationality l = b. Furthermore, the threshold at which a particular answer would be given also differs across groups. In addition, the factor means (and variances) of subjects might differ across strata. This possibility is neglected in order to keep the present simulation straightforward.

2.3 Introducing selection bias on S when randomizing treatment and control groups

Now let population members select into a treatment and a control group using a simple probit selection model. For this purpose we transform $S_1$ and $S_2$ into dummy indicators: $\{D_{11}, D_{12}, D_{13}\}$ and $\{D_{21}, D_{22}\}$. For individual $i$ we define a latent selection variable (see also Table 3):

$$M_i^* = \beta_0 + \beta_{12} D_{12i} + \beta_{13} D_{13i} + \dots + u_i \tag{9}$$

where $u_i \sim N(0, 1)$, and define the treatment indicator $M$ as:

$$M_i = \begin{cases} 0 & \text{if } M_i^* \le 0 \\ 1 & \text{if } M_i^* > 0 \end{cases} \tag{10}$$

Model (9)-(10) implies that randomization is not perfect. Rather, the background characteristics S are the known causes of selection bias.

Table 3: Parameters of the selection model

Parameter        Value
$\beta_0$        0
$\beta_{12}$     1
$\beta_{13}$     2
…

An analyst interested in assessing measurement equivalence over $M$ can do so by means of a multi-group model (e.g. Millsap & Yun-Tein, 2004; Millsap, 2011):

$$X_j^{*(m)} = \lambda_j^{(m)} W + \varepsilon_j^{(m)} \tag{11}$$

$$P(X_j = c \mid M = m) = P\left(\tau_{j,c}^{(m)} < X_j^* \le \tau_{j,c+1}^{(m)}\right) \tag{12}$$
In doing so, the hypotheses

$$H_{01}: \lambda_j^{(m=0)} = \lambda_j^{(m=1)} \quad \text{for all } j \tag{13}$$

$$H_{02}: \tau_{j,c}^{(m=0)} = \tau_{j,c}^{(m=1)} \quad \text{for all } j \text{ and } c \tag{14}$$

are evaluated jointly by imposing equality constraints on the parameters and evaluating global model fit. Model (11)-(12) can be estimated by mean- and variance-adjusted weighted least squares (WLSMV) as described in Muthén (1984) and Muthén, du Toit, & Spisic (1997). The treatment indicator $M$ has no differential impact on the measurement model. That is why $H_{01}$ and $H_{02}$ should not be rejected. However, selection on $S_1$ and $S_2$ might introduce measurement non-equivalence, because under selection model (9)-(10) the distribution of the selection variables is not equal across the groups defined by $M$ (referred to below as mode groups). Hence we ask whether the test of measurement equivalence (13)-(14) can be improved (or adjusted) by applying techniques that balance the mode groups with respect to $S_1$ and $S_2$. How to do this is discussed in the next section.

3. Evaluation of three adjustment methods

We evaluate the performance of three possible adjustment methods:

1. ANCOVA adjustment of the latent factor ("Covariate adjustment"),
2. Exact stratification on $S_1$ and $S_2$,
3. Inverse propensity score weighting ("IPW"),

against the case of ignoring selection ("Simple model").

Figure 1: Illustration of (a) a path model in the ANCOVA tradition and (b) exact stratification on all levels of $S_1$ by $S_2$ by a stratified multi-group model
Methods 1 and 2 are illustrated in Figure 1. ANCOVA adjustment is a classical way to control for group heterogeneity in incompletely randomized groups (e.g. Schafer & Kang, 2008). In the context of CFA models, modeling direct effects of the stratification variables on the latent factor (method 1, case a in Figure 1) seeks to balance group heterogeneity, as suggested by Sörbom (1978; see Heerwegh & Loosveldt, 2011, for an application):

$$W_i = \gamma_0 + \gamma_{12} D_{12i} + \gamma_{13} D_{13i} + \dots + \zeta_i \tag{15}$$

Second, model stratification on the exact strata defined by $S_1$ and $S_2$ (method 2, case b in Figure 1) implies estimating all multiple-group model parameters conditional on every combination of S (e.g. Rosenbaum, 2002; Morgan & Winship, 2007). In the present simulation this implies estimating one multiple-group model for each of the $s = 1, \dots, 6$ strata defined by $S_1$ and $S_2$:

$$X_j^{*(m,s)} = \lambda_j^{(m,s)} W + \varepsilon_j^{(m,s)} \tag{16}$$

$$P(X_j = c \mid M = m, S = s) = P\left(\tau_{j,c}^{(m,s)} < X_j^* \le \tau_{j,c+1}^{(m,s)}\right) \tag{17}$$

It is concluded that measurement equivalence holds conditional on $S_1$ and $S_2$ if $H_{01}$ and $H_{02}$ cannot be rejected in any of the strata defined by the combinations of both S variables.

Finally, inverse propensity score weighting (method 3) can be used to adjust for unequal selection probabilities of individual $i$ to treatment (or control). Propensity scores are estimated from a probit model that follows the true model (9) (Rosenbaum & Rubin, 1983; Morgan & Winship, 2007; Guo & Fraser, 2010):

$$P(M_i = 1 \mid S_{1i}, S_{2i}) = \Phi\left(\beta_0 + \beta_{12} D_{12i} + \beta_{13} D_{13i} + \dots\right) \tag{18}$$

where $\Phi$ denotes the standard normal distribution function. Let $\hat{\pi}_i$ be a propensity score estimate from (18); then individual weights are defined as:

$$\hat{w}_i = \begin{cases} \hat{\pi}_i^{-1} & \text{if } M_i = 1 \\ (1 - \hat{\pi}_i)^{-1} & \text{if } M_i = 0 \end{cases} \;=\; M_i \hat{\pi}_i^{-1} + (1 - M_i)(1 - \hat{\pi}_i)^{-1} \tag{19}$$

How such weights enter the estimation of model (11)-(12) with WLSMV is described in Asparouhov (2005).

4. Results from a Monte-Carlo Simulation

A Monte-Carlo simulation with 1000 replications and a sample size of n = 3000 was conducted.
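To make the weighting step of equations (18)-(19) concrete, the following Python sketch simulates selection into treatment and computes inverse propensity weights. It is an illustration under explicit assumptions, not the authors' implementation: only $\beta_0 = 0$, $\beta_{12} = 1$ and $\beta_{13} = 2$ come from the text, while the coefficient `b22` for the second stratification variable is a hypothetical placeholder. In place of a fitted probit model, the propensity score is estimated by the share of treated cases within each $S_1$-by-$S_2$ cell, which is a swapped-in simplification that is only sensible with a few categorical covariates. The weighted WLSMV estimation itself is not shown.

```python
import random


def simulate_selection(n, beta, seed=0):
    """Draw (S1, S2, M) with M generated by a probit selection model as in (9)-(10)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        s1 = rng.choices([1, 2, 3], weights=[0.3, 0.3, 0.4])[0]
        s2 = rng.choices([1, 2], weights=[0.5, 0.5])[0]
        m_star = (beta["b0"] + beta["b12"] * (s1 == 2) + beta["b13"] * (s1 == 3)
                  + beta["b22"] * (s2 == 2)          # hypothetical term, not from the paper
                  + rng.gauss(0.0, 1.0))
        rows.append((s1, s2, 1 if m_star > 0 else 0))
    return rows


def ipw_weights(rows):
    """Inverse propensity weights as in (19), with cell shares as propensity estimates."""
    counts = {}
    for s1, s2, m in rows:
        n_cell, n_treated = counts.get((s1, s2), (0, 0))
        counts[(s1, s2)] = (n_cell + 1, n_treated + m)
    weights = []
    for s1, s2, m in rows:
        n_cell, n_treated = counts[(s1, s2)]
        pi_hat = n_treated / n_cell
        if pi_hat in (0.0, 1.0):
            raise ValueError("a cell contains only one group; weights are undefined")
        weights.append(1.0 / pi_hat if m == 1 else 1.0 / (1.0 - pi_hat))
    return weights


def weighted_share(rows, weights, group, s1_level):
    """Weighted share of cases with S1 == s1_level within group M == group."""
    num = sum(w for (s1, _, m), w in zip(rows, weights) if m == group and s1 == s1_level)
    den = sum(w for (_, _, m), w in zip(rows, weights) if m == group)
    return num / den
```

Calling `simulate_selection(3000, {"b0": 0.0, "b12": 1.0, "b13": 2.0, "b22": -0.5}, seed=1)` and comparing `weighted_share` with unit weights and with the output of `ipw_weights` shows the intended effect: before weighting, the level $S_1 = 3$ is over-represented in the treatment group, and after weighting its share is the same in both groups.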
Data were simulated in the statistical programming environment R. Models were estimated in the statistical software Mplus 6, run from R using the package MplusAutomation. To evaluate $H_{01}$ and $H_{02}$ jointly, the parameters $\lambda_j^{(m)}$, $\tau_{j,c}^{(m)}$, and $\theta_j^{(m)}$ were constrained to be equal across $M$. Fit was evaluated based on the RMSEA criterion (root mean square error of approximation):

$$\text{Reject } H_{01} \text{ and } H_{02} \text{ if } \mathrm{RMSEA} \ge .05 \tag{20}$$

and the CFI criterion (comparative fit index):

$$\text{Reject } H_{01} \text{ and } H_{02} \text{ if } \mathrm{CFI} \le .95 \tag{21}$$

We found that all unadjusted ("simple") models have insufficient fit, leading to false rejection of the measurement equivalence (measurement invariance, MI) hypothesis with respect to mode groups $M$ in all replications (Table 4). Covariate adjustment on the latent factor improves model fit (mean RMSEA = .053, CFI = .916) but still leads to rejection of the MI hypotheses in 71.6% of cases based on RMSEA and in all cases based on the CFI criterion. Exact stratification on all six strata, with separate evaluation of multi-group models in each stratum, is more effective than covariate adjustment. However, during estimation the conditioning technique posed new problems due to data sparseness in some of the combinations of mode group and stratum.

Table 4: MC distribution of CFA model fit statistics with results of hypothesis tests (in %)

                         Simple        Covariate     Stratification   IPW
RMSEA (mean/sd)          .129 (.008)   .053 (.004)   .014 (.018)*     .013 (.009)
% MI not rejected
CFI (mean/sd)            .864 (.017)   .916 (.014)   .994 (.031)*     .993 (.007)
% MI not rejected
Successful estimations

* over all successful replications

To see this, consider Table 5, which shows the performance of the hypothesis tests in all six strata. The parameterization of selection model (9) evidently causes the mode distribution in the strata with $S_1 = 3$ to be very skewed ($\beta_{13} = 2$), i.e. there are few observations with M = 0. While the adjustment method works well in strata with a high number of successful replications (i.e. those with sufficient group sizes), it performs badly in the two strata associated with $S_1 = 3$. Furthermore, it was postulated (cf. Section 3) that adjustment for selection would only be considered successful if equivalence was found in all six strata.
This was, however, only found for 83.8% of replications based on RMSEA and a mere 0.5% of replications based on CFI (only successful model fits were used in these two statistics). This suggests a multiple testing problem, because taken separately for each stratum (Table 5) the conditioning technique works satisfactorily if strata are of sufficient size. In sum, the conditioning technique may suffer from data sparseness and multiple testing problems.

Table 5: Fit statistics per stratum for the stratification adjustment method (in %)

                         S1=1, S2=1   S1=2, S2=1   S1=3, S2=1   S1=1, S2=2   S1=2, S2=2   S1=3, S2=2
RMSEA (mean/sd)          (.018)       (.018)       (.018)       (.020)       (.018)       n/a
% MI not rejected
CFI (mean/sd)            (.003)       (.020)       (.151)       (.001)       (.006)       (n/a)
% MI not rejected
Successful estimations

Finally, consider the performance of the inverse propensity score weighting (IPW) adjustment technique (Table 4). Mean RMSEA and CFI suggest good fit. Measurement equivalence is not rejected in any of the replications; that is, RMSEA < .05 and CFI > .95 in all replications after weighting with inverse propensity scores. Note that this finding holds despite the small stratum sizes discussed for the conditioning adjustment technique. Since IPW, furthermore, requires only one statistical test, it appears to be superior to stratification in the current simulation.

5. Conclusions and Outlook

Our simulation demonstrated that working with non-adjusted CFA models when testing for measurement equivalence in experimental groups is prone to false conclusions under two conditions. First, observed or unobserved covariates determine individual selection into treatment and control groups. Second, there is measurement non-equivalence across the classes of these variables. In the presence of selection bias in measurement equivalence tests, adjusting for bias on observed covariates is a necessity. Our results demonstrate, however, that the methods available from the literature do not perform equally well. In particular, an ANCOVA adjustment on the latent trait performed very weakly in the present simulation and is therefore not recommended. The reasons for this weak performance are related to the locus of non-equivalence in the present data.
We assumed that strata of the stratification variables did not differ in the means and variances of the latent factor, but rather with respect to thresholds and item reliabilities. The ANCOVA adjustment, however, works only on the expectation and variance of the latent factor and does not control for the true sources of non-equivalence. These are taken into account by exact stratification and propensity score weighting. Given the sample size of the present simulation (n = 3000) and six strata, cell sparseness in a few strata coincided with false test results. In situations with even more stratification variables or fewer observations, this problem is likely to become even more serious. Therefore exact stratification is not recommended in these situations. This observation is generally known from the literature on adjustment by exact stratification (e.g. Rosenbaum, 2002; Morgan & Winship, 2007). The propensity score combines information on all stratification variables into a single vector, thereby addressing cell sparseness problems effectively. Consequently, inverse propensity weights performed exceedingly well in adjusting for selection bias in the present simulation.

From the present results we conclude that IPW is the method of choice. This conclusion has to be considered against the specific limitations of the present simulation design (e.g. parameterization, sample size, number of stratification variables) as well as further options to adjust for selection bias, which were not considered. These include in particular other methods based on the propensity score, such as propensity matching and stratification. Furthermore, the present study assumed that all bias is overt, that is, full information was available on both stratification covariates. In practical situations, there might be bias caused by hidden covariates. Propensity score models have been shown to be misleading in this situation. Doubly robust methods using both covariate and propensity adjustment might prove beneficial. These aspects should be assessed in further simulations.
6. References

Alwin, D. F. (2007). Margins of Error. Hoboken: Wiley.
Asparouhov, T. (2005). Sampling Weights in Latent Variable Modeling. Structural Equation Modeling: A Multidisciplinary Journal, 12(3). doi: /s sem1203_4
Bollen, K. A. (1989). Structural Equations with Latent Variables. New York: Wiley.
Guo, S., & Fraser, M. W. (2010). Propensity Score Analysis. Thousand Oaks: Sage.
Heerwegh, D., & Loosveldt, G. (2011). Assessing Mode Effects in a National Crime Victimization Survey using Structural Equation Models: Social Desirability Bias and Acquiescence. Journal of Official Statistics, 27(1).
Millsap, R. E. (2011). Statistical Approaches to Measurement Equivalence. New York: Routledge.
Millsap, R. E., & Yun-Tein, J. (2004). Assessing Factorial Equivalence in Ordered-Categorical Measures. Multivariate Behavioral Research, 39(3).
Morgan, S. L., & Winship, C. (2007). Counterfactuals and Causal Inference. Cambridge: Cambridge University Press.
Muthén, B. (1984). A general structural equation model with dichotomous, ordered categorical, and continuous latent variable indicators. Psychometrika, 49(1).
Muthén, B., Du Toit, S. H. C., & Spisic, D. (1997). Robust inference using weighted least squares and quadratic estimating equations in latent variable modeling with categorical and continuous outcomes. Retrieved from
Muthén, B. O., & Asparouhov, T. (2002). Latent Variable Analysis With Categorical Outcomes: Multiple-Group And Growth Modeling In Mplus. Muthén & Muthén. Retrieved from
Rosenbaum, P. R. (2002). Observational Studies (2nd ed.). New York: Springer.
Rosenbaum, P. R., & Rubin, D. B. (1983). The Central Role of the Propensity Score in Observational Studies for Causal Effects. Biometrika, 70(1).
Schafer, J. L., & Kang, J. (2008). Average causal effects from nonrandomized studies: A practical guide and simulated example. Psychological Methods, 13(4).
Sörbom, D. (1978). An alternative to the methodology for analysis of covariance. Psychometrika, 43(3).
Steenkamp, J.-B. E. M., & Baumgartner, H. (1998). Assessing Measurement Equivalence in Cross-National Consumer Research. Journal of Consumer Research, 25.
Vandenberg, R. J., & Lance, C. E. (2000). A Review and Synthesis of the Measurement Equivalence Literature: Suggestions, Practices, and Recommendations for Organizational Research. Organizational Research Methods, 3(1).
More informationDoing Quantitative Research 26E02900, 6 ECTS Lecture 6: Structural Equations Modeling. Olli-Pekka Kauppila Daria Kautto
Doing Quantitative Research 26E02900, 6 ECTS Lecture 6: Structural Equations Modeling Olli-Pekka Kauppila Daria Kautto Session VI, September 20 2017 Learning objectives 1. Get familiar with the basic idea
More informationRunning head: NESTED FACTOR ANALYTIC MODEL COMPARISON 1. John M. Clark III. Pearson. Author Note
Running head: NESTED FACTOR ANALYTIC MODEL COMPARISON 1 Nested Factor Analytic Model Comparison as a Means to Detect Aberrant Response Patterns John M. Clark III Pearson Author Note John M. Clark III,
More informationSection on Survey Research Methods JSM 2009
Missing Data and Complex Samples: The Impact of Listwise Deletion vs. Subpopulation Analysis on Statistical Bias and Hypothesis Test Results when Data are MCAR and MAR Bethany A. Bell, Jeffrey D. Kromrey
More informationJSM Survey Research Methods Section
Methods and Issues in Trimming Extreme Weights in Sample Surveys Frank Potter and Yuhong Zheng Mathematica Policy Research, P.O. Box 393, Princeton, NJ 08543 Abstract In survey sampling practice, unequal
More informationLec 02: Estimation & Hypothesis Testing in Animal Ecology
Lec 02: Estimation & Hypothesis Testing in Animal Ecology Parameter Estimation from Samples Samples We typically observe systems incompletely, i.e., we sample according to a designed protocol. We then
More informationCausal Methods for Observational Data Amanda Stevenson, University of Texas at Austin Population Research Center, Austin, TX
Causal Methods for Observational Data Amanda Stevenson, University of Texas at Austin Population Research Center, Austin, TX ABSTRACT Comparative effectiveness research often uses non-experimental observational
More informationPsychology, 2010, 1: doi: /psych Published Online August 2010 (
Psychology, 2010, 1: 194-198 doi:10.4236/psych.2010.13026 Published Online August 2010 (http://www.scirp.org/journal/psych) Using Generalizability Theory to Evaluate the Applicability of a Serial Bayes
More informationOn Test Scores (Part 2) How to Properly Use Test Scores in Secondary Analyses. Structural Equation Modeling Lecture #12 April 29, 2015
On Test Scores (Part 2) How to Properly Use Test Scores in Secondary Analyses Structural Equation Modeling Lecture #12 April 29, 2015 PRE 906, SEM: On Test Scores #2--The Proper Use of Scores Today s Class:
More informationImpact of Violation of the Missing-at-Random Assumption on Full-Information Maximum Likelihood Method in Multidimensional Adaptive Testing
A peer-reviewed electronic journal. Copyright is retained by the first or sole author, who grants right of first publication to Practical Assessment, Research & Evaluation. Permission is granted to distribute
More informationDetection of Unknown Confounders. by Bayesian Confirmatory Factor Analysis
Advanced Studies in Medical Sciences, Vol. 1, 2013, no. 3, 143-156 HIKARI Ltd, www.m-hikari.com Detection of Unknown Confounders by Bayesian Confirmatory Factor Analysis Emil Kupek Department of Public
More informationRunning head: CFA OF TDI AND STICSA 1. p Factor or Negative Emotionality? Joint CFA of Internalizing Symptomology
Running head: CFA OF TDI AND STICSA 1 p Factor or Negative Emotionality? Joint CFA of Internalizing Symptomology Caspi et al. (2014) reported that CFA results supported a general psychopathology factor,
More informationAdjustments for Rater Effects in
Adjustments for Rater Effects in Performance Assessment Walter M. Houston, Mark R. Raymond, and Joseph C. Svec American College Testing Alternative methods to correct for rater leniency/stringency effects
More informationRunning head: CFA OF STICSA 1. Model-Based Factor Reliability and Replicability of the STICSA
Running head: CFA OF STICSA 1 Model-Based Factor Reliability and Replicability of the STICSA The State-Trait Inventory of Cognitive and Somatic Anxiety (STICSA; Ree et al., 2008) is a new measure of anxiety
More informationAbstract. Introduction A SIMULATION STUDY OF ESTIMATORS FOR RATES OF CHANGES IN LONGITUDINAL STUDIES WITH ATTRITION
A SIMULATION STUDY OF ESTIMATORS FOR RATES OF CHANGES IN LONGITUDINAL STUDIES WITH ATTRITION Fong Wang, Genentech Inc. Mary Lange, Immunex Corp. Abstract Many longitudinal studies and clinical trials are
More informationMultifactor Confirmatory Factor Analysis
Multifactor Confirmatory Factor Analysis Latent Trait Measurement and Structural Equation Models Lecture #9 March 13, 2013 PSYC 948: Lecture #9 Today s Class Confirmatory Factor Analysis with more than
More informationKnown-Groups Validity 2017 FSSE Measurement Invariance
Known-Groups Validity 2017 FSSE Measurement Invariance A key assumption of any latent measure (any questionnaire trying to assess an unobservable construct) is that it functions equally across all different
More informationDiscussion. Ralf T. Münnich Variance Estimation in the Presence of Nonresponse
Journal of Official Statistics, Vol. 23, No. 4, 2007, pp. 455 461 Discussion Ralf T. Münnich 1 1. Variance Estimation in the Presence of Nonresponse Professor Bjørnstad addresses a new approach to an extremely
More informationInclusive Strategy with Confirmatory Factor Analysis, Multiple Imputation, and. All Incomplete Variables. Jin Eun Yoo, Brian French, Susan Maller
Inclusive strategy with CFA/MI 1 Running head: CFA AND MULTIPLE IMPUTATION Inclusive Strategy with Confirmatory Factor Analysis, Multiple Imputation, and All Incomplete Variables Jin Eun Yoo, Brian French,
More informationABSTRACT. Professor Gregory R. Hancock, Department of Measurement, Statistics and Evaluation
ABSTRACT Title: FACTOR MIXTURE MODELS WITH ORDERED CATEGORICAL OUTCOMES: THE MATHEMATICAL RELATION TO MIXTURE ITEM RESPONSE THEORY MODELS AND A COMPARISON OF MAXIMUM LIKELIHOOD AND BAYESIAN MODEL PARAMETER
More informationComplier Average Causal Effect (CACE)
Complier Average Causal Effect (CACE) Booil Jo Stanford University Methodological Advancement Meeting Innovative Directions in Estimating Impact Office of Planning, Research & Evaluation Administration
More informationQuestionnaire Construct Validation in the International Civic and Citizenship Education Study
Questionnaire Construct Validation in the International Civic and Citizenship Education Study Wolfram Schulz, Australian Council for Educational Research, Email: schulz@acer.edu.au Abstract International
More informationSimultaneous Equation and Instrumental Variable Models for Sexiness and Power/Status
Simultaneous Equation and Instrumental Variable Models for Seiness and Power/Status We would like ideally to determine whether power is indeed sey, or whether seiness is powerful. We here describe the
More informationUnderstanding and Applying Multilevel Models in Maternal and Child Health Epidemiology and Public Health
Understanding and Applying Multilevel Models in Maternal and Child Health Epidemiology and Public Health Adam C. Carle, M.A., Ph.D. adam.carle@cchmc.org Division of Health Policy and Clinical Effectiveness
More informationA Comparison of Item Response Theory and Confirmatory Factor Analytic Methodologies for Establishing Measurement Equivalence/Invariance
10.1177/1094428104268027 ORGANIZATIONAL Meade, Lautenschlager RESEARCH / COMP ARISON METHODS OF IRT AND CFA A Comparison of Item Response Theory and Confirmatory Factor Analytic Methodologies for Establishing
More informationResearch Design. Beyond Randomized Control Trials. Jody Worley, Ph.D. College of Arts & Sciences Human Relations
Research Design Beyond Randomized Control Trials Jody Worley, Ph.D. College of Arts & Sciences Human Relations Introduction to the series Day 1: Nonrandomized Designs Day 2: Sampling Strategies Day 3:
More informationBIOSTATISTICAL METHODS
BIOSTATISTICAL METHODS FOR TRANSLATIONAL & CLINICAL RESEARCH PROPENSITY SCORE Confounding Definition: A situation in which the effect or association between an exposure (a predictor or risk factor) and
More informationCurrent Directions in Mediation Analysis David P. MacKinnon 1 and Amanda J. Fairchild 2
CURRENT DIRECTIONS IN PSYCHOLOGICAL SCIENCE Current Directions in Mediation Analysis David P. MacKinnon 1 and Amanda J. Fairchild 2 1 Arizona State University and 2 University of South Carolina ABSTRACT
More informationSupplement 2. Use of Directed Acyclic Graphs (DAGs)
Supplement 2. Use of Directed Acyclic Graphs (DAGs) Abstract This supplement describes how counterfactual theory is used to define causal effects and the conditions in which observed data can be used to
More informationTechnical Appendix: Methods and Results of Growth Mixture Modelling
s1 Technical Appendix: Methods and Results of Growth Mixture Modelling (Supplement to: Trajectories of change in depression severity during treatment with antidepressants) Rudolf Uher, Bengt Muthén, Daniel
More informationConstruct Validity of the MBTI in Management Development: A Test of Two Interpretations. Robert B. Kaiser & S. Bartholomew Craig
Construct Validity of the MBTI in Management Development: A Test of Two Interpretations Robert B. Kaiser & S. Bartholomew Craig Myers-Briggs Type Indicator (MBTI) Derived from an explicit theory Scales
More informationOverview of Perspectives on Causal Inference: Campbell and Rubin. Stephen G. West Arizona State University Freie Universität Berlin, Germany
Overview of Perspectives on Causal Inference: Campbell and Rubin Stephen G. West Arizona State University Freie Universität Berlin, Germany 1 Randomized Experiment (RE) Sir Ronald Fisher E(X Treatment
More informationConstruct Invariance of the Survey of Knowledge of Internet Risk and Internet Behavior Knowledge Scale
University of Connecticut DigitalCommons@UConn NERA Conference Proceedings 2010 Northeastern Educational Research Association (NERA) Annual Conference Fall 10-20-2010 Construct Invariance of the Survey
More informationUnit 1 Exploring and Understanding Data
Unit 1 Exploring and Understanding Data Area Principle Bar Chart Boxplot Conditional Distribution Dotplot Empirical Rule Five Number Summary Frequency Distribution Frequency Polygon Histogram Interquartile
More informationIntroduction to Meta-Analysis
Introduction to Meta-Analysis Nazım Ço galtay and Engin Karada g Abstract As a means to synthesize the results of multiple studies, the chronological development of the meta-analysis method was in parallel
More informationStrategies for handling missing data in randomised trials
Strategies for handling missing data in randomised trials NIHR statistical meeting London, 13th February 2012 Ian White MRC Biostatistics Unit, Cambridge, UK Plan 1. Why do missing data matter? 2. Popular
More informationDoctoral Dissertation Boot Camp Quantitative Methods Kamiar Kouzekanani, PhD January 27, The Scientific Method of Problem Solving
Doctoral Dissertation Boot Camp Quantitative Methods Kamiar Kouzekanani, PhD January 27, 2018 The Scientific Method of Problem Solving The conceptual phase Reviewing the literature, stating the problem,
More informationIs Random Sampling Necessary? Dan Hedlin Department of Statistics, Stockholm University
Is Random Sampling Necessary? Dan Hedlin Department of Statistics, Stockholm University Focus on official statistics Trust is paramount (Holt 2008) Very wide group of users Official statistics is official
More informationApplications of Structural Equation Modeling (SEM) in Humanities and Science Researches
Applications of Structural Equation Modeling (SEM) in Humanities and Science Researches Dr. Ayed Al Muala Department of Marketing, Applied Science University aied_muala@yahoo.com Dr. Mamdouh AL Ziadat
More informationInvestigating the Invariance of Person Parameter Estimates Based on Classical Test and Item Response Theories
Kamla-Raj 010 Int J Edu Sci, (): 107-113 (010) Investigating the Invariance of Person Parameter Estimates Based on Classical Test and Item Response Theories O.O. Adedoyin Department of Educational Foundations,
More informationInternational Journal of Education and Research Vol. 5 No. 5 May 2017
International Journal of Education and Research Vol. 5 No. 5 May 2017 EFFECT OF SAMPLE SIZE, ABILITY DISTRIBUTION AND TEST LENGTH ON DETECTION OF DIFFERENTIAL ITEM FUNCTIONING USING MANTEL-HAENSZEL STATISTIC
More informationNon-Normal Growth Mixture Modeling
Non-Normal Growth Mixture Modeling Bengt Muthén & Tihomir Asparouhov Mplus www.statmodel.com bmuthen@statmodel.com Presentation to PSMG May 6, 2014 Bengt Muthén Non-Normal Growth Mixture Modeling 1/ 50
More informationProof. Revised. Chapter 12 General and Specific Factors in Selection Modeling Introduction. Bengt Muthén
Chapter 12 General and Specific Factors in Selection Modeling Bengt Muthén Abstract This chapter shows how analysis of data on selective subgroups can be used to draw inference to the full, unselected
More informationItem Parameter Recovery for the Two-Parameter Testlet Model with Different. Estimation Methods. Abstract
Item Parameter Recovery for the Two-Parameter Testlet Model with Different Estimation Methods Yong Luo National Center for Assessment in Saudi Arabia Abstract The testlet model is a popular statistical
More informationS Imputation of Categorical Missing Data: A comparison of Multivariate Normal and. Multinomial Methods. Holmes Finch.
S05-2008 Imputation of Categorical Missing Data: A comparison of Multivariate Normal and Abstract Multinomial Methods Holmes Finch Matt Margraf Ball State University Procedures for the imputation of missing
More informationASSESSING THE UNIDIMENSIONALITY, RELIABILITY, VALIDITY AND FITNESS OF INFLUENTIAL FACTORS OF 8 TH GRADES STUDENT S MATHEMATICS ACHIEVEMENT IN MALAYSIA
1 International Journal of Advance Research, IJOAR.org Volume 1, Issue 2, MAY 2013, Online: ASSESSING THE UNIDIMENSIONALITY, RELIABILITY, VALIDITY AND FITNESS OF INFLUENTIAL FACTORS OF 8 TH GRADES STUDENT
More informationBias in regression coefficient estimates when assumptions for handling missing data are violated: a simulation study
STATISTICAL METHODS Epidemiology Biostatistics and Public Health - 2016, Volume 13, Number 1 Bias in regression coefficient estimates when assumptions for handling missing data are violated: a simulation
More informationThe Multidimensionality of Revised Developmental Work Personality Scale
The Multidimensionality of Revised Developmental Work Personality Scale Work personality has been found to be vital in developing the foundation for effective vocational and career behavior (Bolton, 1992;
More informationComprehensive Statistical Analysis of a Mathematics Placement Test
Comprehensive Statistical Analysis of a Mathematics Placement Test Robert J. Hall Department of Educational Psychology Texas A&M University, USA (bobhall@tamu.edu) Eunju Jung Department of Educational
More informationLogistic Regression with Missing Data: A Comparison of Handling Methods, and Effects of Percent Missing Values
Logistic Regression with Missing Data: A Comparison of Handling Methods, and Effects of Percent Missing Values Sutthipong Meeyai School of Transportation Engineering, Suranaree University of Technology,
More informationAnalysis of the Reliability and Validity of an Edgenuity Algebra I Quiz
Analysis of the Reliability and Validity of an Edgenuity Algebra I Quiz This study presents the steps Edgenuity uses to evaluate the reliability and validity of its quizzes, topic tests, and cumulative
More informationA structural equation modeling approach for examining position effects in large scale assessments
DOI 10.1186/s40536-017-0042-x METHODOLOGY Open Access A structural equation modeling approach for examining position effects in large scale assessments Okan Bulut *, Qi Quo and Mark J. Gierl *Correspondence:
More informationUN Handbook Ch. 7 'Managing sources of non-sampling error': recommendations on response rates
JOINT EU/OECD WORKSHOP ON RECENT DEVELOPMENTS IN BUSINESS AND CONSUMER SURVEYS Methodological session II: Task Force & UN Handbook on conduct of surveys response rates, weighting and accuracy UN Handbook
More informationUsing directed acyclic graphs to guide analyses of neighbourhood health effects: an introduction
University of Michigan, Ann Arbor, Michigan, USA Correspondence to: Dr A V Diez Roux, Center for Social Epidemiology and Population Health, 3rd Floor SPH Tower, 109 Observatory St, Ann Arbor, MI 48109-2029,
More informationThe Psychometric Properties of Dispositional Flow Scale-2 in Internet Gaming
Curr Psychol (2009) 28:194 201 DOI 10.1007/s12144-009-9058-x The Psychometric Properties of Dispositional Flow Scale-2 in Internet Gaming C. K. John Wang & W. C. Liu & A. Khoo Published online: 27 May
More informationTHE APPLICATION OF ORDINAL LOGISTIC HEIRARCHICAL LINEAR MODELING IN ITEM RESPONSE THEORY FOR THE PURPOSES OF DIFFERENTIAL ITEM FUNCTIONING DETECTION
THE APPLICATION OF ORDINAL LOGISTIC HEIRARCHICAL LINEAR MODELING IN ITEM RESPONSE THEORY FOR THE PURPOSES OF DIFFERENTIAL ITEM FUNCTIONING DETECTION Timothy Olsen HLM II Dr. Gagne ABSTRACT Recent advances
More informationThe Impact of Relative Standards on the Propensity to Disclose. Alessandro Acquisti, Leslie K. John, George Loewenstein WEB APPENDIX
The Impact of Relative Standards on the Propensity to Disclose Alessandro Acquisti, Leslie K. John, George Loewenstein WEB APPENDIX 2 Web Appendix A: Panel data estimation approach As noted in the main
More informationEffects of propensity score overlap on the estimates of treatment effects. Yating Zheng & Laura Stapleton
Effects of propensity score overlap on the estimates of treatment effects Yating Zheng & Laura Stapleton Introduction Recent years have seen remarkable development in estimating average treatment effects
More information