Small Sample Bayesian Factor Analysis. PhUSE 2014 Paper SP03 Dirk Heerwegh


Overview
- Factor analysis
  - Maximum likelihood
  - Bayes
- Simulation studies
  - Design
  - Results
- Conclusions

Factor Analysis (FA)
- Explains the correlations between observed variables in terms of common causes (factors)
- Exploratory vs. confirmatory FA

Factor Analysis (FA): maximum likelihood estimation (MLE)
- Large-sample technique (e.g. N > 200)
- If N is too low:
  - Model non-convergence
  - Negative residual variances (Heywood cases)
  - Other problems? (statistical power, accuracy of estimates)

Bayesian Statistics
- Does not rely on large-sample theory
  - Better performance in small samples is expected
  - But the prior distribution is more influential in smaller samples
- Incorporation of prior knowledge
  - Available before seeing the data, based on theory or previous studies
  - Captured in the prior distribution
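The point that a prior weighs more heavily in small samples can be made concrete with a much simpler model than a CFA. The sketch below uses the textbook conjugate update for a normal mean with known sampling variance. It is a scalar analogue only, not the estimator Mplus uses, and the function names and the sampling variance of 1.0 are assumptions made for illustration.

```python
def prior_weight(prior_var, sampling_var, n):
    """Share of the posterior mean that comes from the prior mean.

    Conjugate normal model with known sampling variance: the posterior mean
    is a precision-weighted average of the prior mean and the sample mean.
    """
    prior_precision = 1.0 / prior_var
    data_precision = n / sampling_var
    return prior_precision / (prior_precision + data_precision)


def posterior_mean(prior_mean, prior_var, sample_mean, sampling_var, n):
    """Posterior mean under the same conjugate normal model."""
    w = prior_weight(prior_var, sampling_var, n)
    return w * prior_mean + (1.0 - w) * sample_mean


if __name__ == "__main__":
    # Hypothetical setting: prior N(0.4, 0.05), sample mean 0.8, unit sampling variance.
    for n in (200, 100, 50, 25):
        w = prior_weight(prior_var=0.05, sampling_var=1.0, n=n)
        m = posterior_mean(0.4, 0.05, 0.8, 1.0, n)
        print(f"N={n:3d}  prior weight={w:.3f}  posterior mean={m:.3f}")
```

In this toy setting the prior weight grows as N shrinks, and a tighter prior (smaller variance) grows it further, which is the qualitative pattern the slide describes.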

Simulation studies
- General research questions
  - How does Bayesian CFA compare to ML CFA at moderate to very small sample sizes?
  - What is the influence of priors in Bayesian CFA?
- Method
  - Non-informative versus informative priors
  - Correctly versus incorrectly specified priors
  - Monte Carlo simulations in Mplus 7.0

Simulation study 1
- 1-factor CFA model with 4 observed variables
- 48 conditions in the simulation study (4 sample sizes x 3 factor loadings x 4 estimator/prior settings):
  - Sample sizes: 200, 100, 50, 25
  - Standardized factor loadings: .80, .60, .40
  - ML or BAYES estimator
  - If BAYES, prior distribution for the factor loadings:
    - Non-informative (mean = 0, variance = infinity)
    - Informative: mean = 0.40 and variance = .05 or .01
- Note that the informative prior is misspecified for the conditions in which the true factor loadings are .80 and .60
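The design can be enumerated programmatically to check the count of 48 conditions. This is a minimal sketch: the dictionary keys, the estimator labels, and the ordering of the conditions are assumptions and need not match the condition numbering used in the paper.

```python
from itertools import product

SAMPLE_SIZES = (200, 100, 50, 25)
LOADINGS = (0.80, 0.60, 0.40)
# One ML setting plus three BAYES settings; informative priors are (mean, variance).
ESTIMATORS = (
    ("ML", None),
    ("BAYES", "non-informative"),
    ("BAYES", (0.40, 0.05)),
    ("BAYES", (0.40, 0.01)),
)


def build_conditions():
    """Return one dict per cell of the fully crossed design."""
    conditions = []
    for n, loading, (estimator, prior) in product(SAMPLE_SIZES, LOADINGS, ESTIMATORS):
        informative = isinstance(prior, tuple)
        conditions.append({
            "n": n,
            "loading": loading,
            "estimator": estimator,
            "prior": prior,
            # Standardized indicators: residual variance = 1 - loading^2.
            "residual_variance": round(1.0 - loading ** 2, 2),
            # The informative prior mean (0.40) is wrong unless the true loading is 0.40.
            "prior_misspecified": informative and prior[0] != loading,
        })
    return conditions


if __name__ == "__main__":
    conditions = build_conditions()
    print(len(conditions), "conditions")
    print(sum(c["prior_misspecified"] for c in conditions), "with a misspecified prior")
```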

TITLE: BAYES CFA Monte Carlo Simulation study for Phuse
    ! SIMULATION PARAMETERS
    ! ---------------------
    ! Condition 3
    ! Estimator: BAYES
    ! Sample size: 200
    ! Lambda: .8
    ! Residual variance: 0.36
    ! Prior for the lambdas:
    ! Normal distribution with mean = 0.4 and variance = 0.05
MONTECARLO:
    NAMES ARE x1-x4;
    NOBSERVATIONS = 200;
    NREPS = 1000;
    SEED = 12345;
MODEL POPULATION:
    f1 BY x1-x4*.8;
    x1-x4*0.36;
    f1@1;
ANALYSIS:
    estimator = bayes;
    proc = 2;
    fbiter = 10000;
MODEL:
    f1 BY x1-x4*.8 (a1-a4);
    x1-x4*0.36;
    f1@1;
model priors:
    a1-a4 ~ N(0.4,0.05);
OUTPUT: TECH9;
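For readers without Mplus, the data-generating step of the MODEL POPULATION section can be imitated in a few lines. The sketch below draws one replication from a 1-factor model with equal loadings, factor variance 1 and residual variance 1 - loading^2. The function names are hypothetical, and Python's random generator will not reproduce Mplus's draws even with the same seed value.

```python
import random


def simulate_one_factor(n, loading, n_items=4, seed=12345):
    """Draw n observations of n_items indicators: x = loading * f + e."""
    rng = random.Random(seed)
    resid_sd = (1.0 - loading ** 2) ** 0.5  # e.g. sqrt(0.36) when loading = 0.8
    data = []
    for _ in range(n):
        f = rng.gauss(0.0, 1.0)  # factor score, variance fixed at 1
        data.append([loading * f + rng.gauss(0.0, resid_sd) for _ in range(n_items)])
    return data


def sample_cov(data, i, j):
    """Unbiased sample covariance between columns i and j."""
    n = len(data)
    mean_i = sum(row[i] for row in data) / n
    mean_j = sum(row[j] for row in data) / n
    return sum((row[i] - mean_i) * (row[j] - mean_j) for row in data) / (n - 1)


if __name__ == "__main__":
    data = simulate_one_factor(n=200, loading=0.8)
    # Population values: variance 1.0, covariance loading^2 = 0.64.
    print("var(x1)     =", round(sample_cov(data, 0, 0), 3))
    print("cov(x1, x2) =", round(sample_cov(data, 0, 1), 3))
```

At N = 200 the sample moments scatter noticeably around their population values, and that sampling noise grows at the smaller sample sizes in the design.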

Simulation study 1: criteria
- Percentage parameter bias: 100 x (mean estimated value - population value) / population value
  - Averaged over the 4 observed variables in the simulation study
- MSE (mean squared error): equal to the variance of the estimates across the replications plus the square of the bias
  - Averaged over the 4 observed variables in the simulation study
- Statistical power to detect a significant factor loading
- Non-convergence of the models or errors in the model estimation (residual covariance matrix not positive definite, untrustworthy standard errors)
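The first two criteria are simple to compute from the per-replication estimates of one parameter. The sketch below is illustrative only: the function names and the four example estimates are made up and are not results from the paper. Note that the identity MSE = variance + bias^2 holds exactly when the variance is taken with divisor R (the number of replications), which is what `pvariance` does.

```python
from statistics import fmean, pvariance


def percent_bias(estimates, true_value):
    """100 x (mean estimate - population value) / population value."""
    return 100.0 * (fmean(estimates) - true_value) / true_value


def mse(estimates, true_value):
    """Mean squared deviation of the estimates from the population value."""
    return fmean([(e - true_value) ** 2 for e in estimates])


def mse_from_decomposition(estimates, true_value):
    """Variance across replications (divisor R) plus squared bias."""
    bias = fmean(estimates) - true_value
    return pvariance(estimates) + bias ** 2


if __name__ == "__main__":
    # Hypothetical loading estimates from four replications; true loading 0.8.
    estimates = [0.70, 0.74, 0.78, 0.82]
    print("percent bias:", round(percent_bias(estimates, 0.8), 2))
    print("MSE (direct):", round(mse(estimates, 0.8), 4))
    print("MSE (var + bias^2):", round(mse_from_decomposition(estimates, 0.8), 4))
```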

Results: % parameter bias

Results: MSE

Results: Power

Results on error-free runs

Conclusions study 1
- Main driving force: factor loadings
- ML CFA performs well at high factor loadings
  - Few model convergence errors, high power, low bias
- As factor loadings become smaller, Bayesian CFA becomes more appealing
  - Fewer model convergence problems, less biased estimates (non-informative or correctly specified informative priors)
  - But power is slightly lower than with MLE if non-informative priors are used
- Misspecified priors bias the results, as we would expect

Simulation study 2
- 2-factor CFA model with 3 observed variables per factor and no cross-loadings ("simple structure")
- 96 conditions in the simulation study (4 sample sizes x 3 factor loadings x 2 factor correlations x 4 estimator/prior settings):
  - Sample sizes: 200, 100, 50, 25
  - Standardized factor loadings: .80, .60, .40
  - Factor correlations of .25 and .40
  - ML or BAYES estimator
  - If BAYES, prior distribution for the factor loadings:
    - Non-informative (mean = 0, variance = infinity)
    - Informative: mean = 0.40 and variance = .05 or .01
  - If BAYES: non-informative prior on the factor covariance
- Note that the informative prior is misspecified for the conditions in which the true factor loadings are .80 and .60
- Method: Monte Carlo simulations in Mplus 7
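Under this design the population covariance matrix of the six indicators follows directly from the loadings and the factor correlation. The sketch below builds it element by element, assuming equal standardized loadings, unit factor variances and residual variances of 1 - loading^2. The function name and argument names are hypothetical.

```python
def implied_covariance(loading, factor_corr, items_per_factor=3):
    """Model-implied covariance matrix for a 2-factor simple-structure CFA."""
    n_items = 2 * items_per_factor
    sigma = [[0.0] * n_items for _ in range(n_items)]
    for i in range(n_items):
        for j in range(n_items):
            same_factor = (i // items_per_factor) == (j // items_per_factor)
            if i == j:
                # loading^2 + residual variance (1 - loading^2)
                sigma[i][j] = 1.0
            elif same_factor:
                sigma[i][j] = loading ** 2
            else:
                # Items on different factors covary only through the factor correlation.
                sigma[i][j] = loading ** 2 * factor_corr
    return sigma


if __name__ == "__main__":
    for row in implied_covariance(loading=0.8, factor_corr=0.25):
        print("  ".join(f"{value:.2f}" for value in row))
```

The cross-factor entries are the product of two loadings and the factor correlation, so the data inform the factor covariance only jointly with the loadings.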

Results
- See the paper for the full description
- Focus here on bias in the factor covariance
- Recall: a non-informative prior was used for the factor covariance

Results: % bias on factor covariance

Results: % bias on factor covariance

Conclusions study 2 (selection)
- Misspecified priors on the factor loadings lead to bias in the estimated factor covariance
- In a CFA model, all parameters work together in a system
  - Introducing an error in one part of the model will influence other parts of the model
  - Important if you're working with complex models!
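One plausible way to see why an error in the loadings can spill over into the factor covariance is the identity cov(x_i, x_j) = loading_i x loading_j x factor correlation for items on different factors. The sketch below is back-of-the-envelope arithmetic under that identity, not a reproduction of the paper's results. The shrunken loading of 0.6 is a hypothetical value, and the function names are made up.

```python
def cross_factor_cov(loading_a, loading_b, factor_corr):
    """Covariance between two standardized items that load on different factors."""
    return loading_a * loading_b * factor_corr


def factor_corr_needed(cross_cov, loading_a, loading_b):
    """Factor correlation required to reproduce cross_cov at the given loadings."""
    return cross_cov / (loading_a * loading_b)


if __name__ == "__main__":
    # Population: loadings 0.8, factor correlation 0.25.
    cov = cross_factor_cov(0.8, 0.8, 0.25)
    # Hypothetical: a prior centred on 0.4 pulls both loading estimates down to 0.6.
    print("observed cross-factor covariance:", round(cov, 3))
    print("factor correlation needed at loadings of 0.6:",
          round(factor_corr_needed(cov, 0.6, 0.6), 3))
```

In this toy calculation, underestimated loadings have to be compensated by a larger factor correlation to fit the same observed covariances. Whether the bias reported in the paper follows this pattern has to be checked against the paper's figures.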

In conclusion
- Bayesian CFA certainly delivers on some aspects:
  - It can be used in (very) small samples and in measurement models with low factor loadings, where ML CFA either cannot estimate the model parameters or does so with bias
- But it is important to select the prior judiciously
  - This should be done based on theory, but keep in mind the possibility of spill-over effects!
- Further (simulation) studies are needed to gain a fuller understanding of Bayesian CFA models as well as Bayesian SEM models

In conclusion: additional appeals of Bayesian CFA
- Near-zero cross-loadings and/or error covariances can be included in the model
  - Not possible in ML CFA because of model non-identification
- Benefits: may be more in line with theory, improves model fit, can aid in recovering a theorized simple structure
- See the reference in the paper to a study on the Hospital Anxiety and Depression Scale