THE GOOD, THE BAD, & THE UGLY: WHAT WE KNOW TODAY ABOUT LCA WITH DISTAL OUTCOMES. Bethany C. Bray, Ph.D.

THE GOOD, THE BAD, & THE UGLY: WHAT WE KNOW TODAY ABOUT LCA WITH DISTAL OUTCOMES Bethany C. Bray, Ph.D. bcbray@psu.edu

WHAT ARE WE HERE TO TALK ABOUT TODAY?
- Behavioral scientists increasingly are using latent class analysis (LCA) to identify subgroups of individuals based on unique patterns of
  - Behavior
  - Risk exposure
  - Mental health symptoms
  - Other characteristics
- LCA is a powerful and intuitive tool for studying heterogeneity in these characteristics

WHAT ARE WE HERE TO TALK ABOUT TODAY?
- But new methods are needed to address the next generation of complex questions
- How is subgroup membership embedded in developmental pathways?
- For example, how subgroup membership is linked to later outcomes
  - Do patterns of early risk exposure during childhood predict later binge drinking during adolescence?
  - Do patterns of depression symptoms during adolescence predict later academic achievement?

WHAT ARE WE HERE TO TALK ABOUT TODAY?
- LCA with a distal outcome poses interesting methodological challenges
- Literature has included a rapidly increasing number of publications proposing competing approaches to address these challenges
- Summarize three state-of-the-art approaches to LCA with distal outcomes
- Focus on the simplest case of a latent class predictor and an observed distal outcome

OUTLINE OF TODAY'S TALK
- Why latent class analysis (LCA)?
- A brief introduction to LCA
- Why are distal outcomes so troublesome?
- Traditional approaches to LCA with distal outcomes
- Contemporary approaches to LCA with distal outcomes
- How are researchers supposed to choose?
- The good, the bad, and the ugly
- Take-home messages

WHY LCA?
- Statistical tool that behavioral scientists are turning to with increasing frequency
- Can be used to explain population heterogeneity by identifying underlying subgroups of individuals
- Subgroups (classes) are comprised of individuals who are similar in their responses to a set of observed variables
- Class membership is inferred from responses to the observed variables

A (VERY) BRIEF INTRODUCTION
- LCA posits a mutually exclusive and exhaustive underlying set of latent classes
- Classes and class membership are inferred from multiple categorical observed variables
- In the traditional model, interested in two sets of parameters
  - Rhos: item-response probabilities
  - Gammas: latent class membership probabilities
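For reference, the two sets of parameters can be placed in the usual latent class model formula. The notation here is an assumption chosen to match the slide: $x = (x_1, \dots, x_J)$ is one response pattern on the $J$ observed variables, $K$ is the number of latent classes, $R_j$ is the number of response categories of item $j$, $\gamma_c = P(C = c)$, and $\rho_{j,r|c} = P(X_j = r \mid C = c)$. Items are assumed independent within a class (local independence).

```latex
P(X = x) \;=\; \sum_{c=1}^{K} \gamma_c \prod_{j=1}^{J} \prod_{r=1}^{R_j} \rho_{j,r|c}^{\,I(x_j = r)}
```

Here $I(x_j = r)$ equals 1 when the response to item $j$ is $r$ and 0 otherwise, so each item contributes exactly one rho to the product.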

A (VERY) BRIEF INTRODUCTION
[Figure: path diagram in which the latent class variable C (latent class membership probabilities) points to the observed indicators X1, X2, ..., XJ (item-response probabilities)]

A (VERY) BRIEF INTRODUCTION
- Often interested in understanding what predicts latent class membership
  - What are the important predictors of patterns of early risk exposure during childhood?
  - What are the important predictors of patterns of depression symptoms during adolescence?
- More concrete: do adolescents' friendship goals (i.e., a risk factor) predict substance use patterns (i.e., a latent class variable)?

A (VERY) BRIEF INTRODUCTION
- Mathematical model for predicting class membership from a covariate is well understood
- Estimating the association between a latent class predictor and distal outcome presents a more difficult methodological problem
- Solving this problem is a hot topic in the methodological literature right now
- Three competing state-of-the-art approaches to LCA with distal outcomes

WHY LCA WITH DISTAL OUTCOMES?
- What if we are interested in
  - Predicting later binge drinking from early risk exposure
  - Predicting later academic achievement from depression subtypes
- One modeling option is to use latent class analysis (LCA) with a distal outcome
  - Latent class variable is risk exposure
  - Distal outcome is observed binge drinking

WHY LCA WITH DISTAL OUTCOMES?
- When using latent class membership to predict a distal outcome, interested in effect of C on Y
- Let's think about this graphically

WHY LCA WITH DISTAL OUTCOMES?
[Figure: path diagram in which the latent class variable C (latent class membership probabilities) points to the indicators X1, X2, ..., XJ (item-response probabilities) and, across a break (//), to the distal outcome Y (effect of C on Y)]

WHY LCA WITH DISTAL OUTCOMES?
- For example, what is the effect of risk exposure latent class membership on binge drinking?
- Classes of individuals
  - Low Risk
  - Peer Risk
  - Economic Risk
  - Household & Peer Risk
  - Multi-Risk

WHY LCA WITH DISTAL OUTCOMES?
- For example, what is the effect of risk exposure latent class membership on binge drinking?
- Classes of LOW RISK and HIGH RISK individuals
- Does prevalence of BINGE DRINKING differ between low risk and high risk individuals?
- To address this question, we need to know the conditional distribution of Y given C
- That is, the probabilities of binge drinking for both low risk and high risk individuals

WHY LCA WITH DISTAL OUTCOMES?
[Figure: the latent class variable C split into Latent Class 1 = Low Risk and Latent Class 2 = High Risk, each with its own distribution of binge drinking (e.g., proportion)]

WHY ARE DISTAL OUTCOMES SO TROUBLESOME?
- LCA with a distal outcome poses interesting methodological challenges
- But why?

WHY ARE DISTAL OUTCOMES SO TROUBLESOME?
[Figure: path diagram of C, the indicators X1, X2, ..., XJ, and the distal outcome Y (C // Y), shown with the question "Now what does it look like?"]

WHY ARE DISTAL OUTCOMES SO TROUBLESOME?
- How do we get Y | C?

WHAT IS THE SOLUTION?
- Historically, classify-analyze strategies have been used to solve this problem
  - Individuals are assigned to classes using some rule based on posterior probabilities
  - Then an outcome analysis is performed treating class membership as known
  - For example, regressing the outcome on a set of dummy-coded predictors for class assignment
- This approach, however, is known to cause substantial attenuation in effect estimates

WHAT IS THE SOLUTION?
- Instead, there are three general categories of state-of-the-art approaches to LCA with distal outcomes
- Each has been shown to work well under certain conditions in simulation studies
- How do scientists make informed decisions about which to choose?

TRADITIONAL APPROACHES
- These approaches are based on a simple idea
  - Finding Y | C is difficult because C is unobserved
  - So, make C observed and then find Y | C
- These approaches are often called classify-analyze or 3-step approaches

TRADITIONAL APPROACHES
- All classify-analyze approaches rely on posterior probabilities
- Each individual has a posterior probability of membership in each latent class: P(C = c | Y = y), where Y = y here stands for the individual's observed data
- Use posterior probabilities as the basis for classify-analyze approaches
  - Classification step: classify individuals to latent classes based on probabilities
  - Analysis step: treat latent class membership as known in the analysis model

TRADITIONAL APPROACH #1
- Modal or maximum-probability assignment
  - Fit and compare competing LCAs to select optimal model
  - Calculate posterior probabilities for each individual, for each latent class
  - Assign individuals to latent class with highest posterior probability
  - Conduct analysis by regressing distal outcome on latent class membership
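The posterior-probability and assignment steps can be sketched in a few lines. This is a minimal illustration, assuming binary (0/1) indicators and a fitted two-class model; the function names and all parameter values (`gammas`, `rhos`) are hypothetical and not taken from the talk.

```python
from math import prod

def posterior_probs(x, gammas, rhos):
    """Posterior class probabilities for one response pattern x of 0/1 items.

    gammas[c] is the class prevalence P(C = c);
    rhos[c][j] is the item-response probability P(X_j = 1 | C = c).
    Both are assumed to come from an already fitted LCA.
    """
    joint = [
        g * prod(r if xj == 1 else 1.0 - r for xj, r in zip(x, rho_c))
        for g, rho_c in zip(gammas, rhos)
    ]
    total = sum(joint)
    return [p / total for p in joint]

def modal_class(post):
    """Index of the class with the highest posterior probability."""
    return max(range(len(post)), key=post.__getitem__)

# Hypothetical two-class model with three binary indicators.
gammas = [0.6, 0.4]
rhos = [[0.1, 0.2, 0.1],   # class 0
        [0.8, 0.7, 0.9]]   # class 1
post = posterior_probs((1, 1, 0), gammas, rhos)
assigned = modal_class(post)
```

In this toy case the individual is assigned to class 1 even though roughly a third of the posterior mass sits on class 0; discarding that uncertainty is the source of the attenuation discussed on the following slides.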

TRADITIONAL APPROACH #2
- Proportional assignment
  - Like modal assignment
  - But, partially assign individuals based on their posterior probability distributions
  - Conduct analysis by regressing distal outcome on latent class membership
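One way to picture proportional assignment is as a weighted analysis in which each individual counts toward every class in proportion to their posterior probability. The sketch below computes class-specific outcome means that way; the function name and the toy data are hypothetical, and a weighted mean is used as a simple stand-in for the regression step.

```python
def proportional_class_means(posteriors, y):
    """Class-specific mean of y under proportional assignment.

    posteriors[i][c] is individual i's posterior probability of class c;
    each individual contributes to class c with that probability as weight.
    """
    n_classes = len(posteriors[0])
    means = []
    for c in range(n_classes):
        weight = sum(p[c] for p in posteriors)
        weighted_sum = sum(p[c] * yi for p, yi in zip(posteriors, y))
        means.append(weighted_sum / weight)
    return means

# Hypothetical posteriors for three individuals and a binary outcome.
posteriors = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
y = [0, 1, 1]
means = proportional_class_means(posteriors, y)
```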

TRADITIONAL APPROACH #3
- Multiple pseudo-class draws
  1. Select optimal model
  2. Calculate posterior probabilities
  3. Assign individuals based on distribution of posterior probabilities
  4. Conduct analysis
  5. Repeat steps 3 & 4 multiple (e.g., 20) times
  6. Combine results using rules from multiple imputation
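The six steps can be sketched as follows, assuming posterior probabilities are already available (steps 1 and 2). The analysis here is simply the mean of the outcome in one class, and the combining step uses the standard multiple-imputation (Rubin's) rules for a scalar estimate; function names, the seed, and the toy data are all hypothetical.

```python
import random
from statistics import mean, variance

def pseudo_class_draw(posteriors, rng):
    """Step 3: draw one class per individual from their posterior distribution."""
    return [rng.choices(range(len(p)), weights=p)[0] for p in posteriors]

def class_mean_and_var(classes, y, c):
    """Step 4: mean of y in class c and the squared standard error of that mean.

    Needs at least two individuals drawn into class c.
    """
    ys = [yi for ci, yi in zip(classes, y) if ci == c]
    return mean(ys), variance(ys) / len(ys)

def rubin_combine(estimates, variances):
    """Step 6: pool estimates and variances across draws."""
    m = len(estimates)
    within = mean(variances)
    between = variance(estimates) if m > 1 else 0.0
    return mean(estimates), within + (1 + 1 / m) * between

# Toy data: 100 individuals with hypothetical posteriors and a binary outcome.
rng = random.Random(2015)  # arbitrary seed for reproducibility
posteriors = [[0.9, 0.1], [0.2, 0.8]] * 50
y = [0, 1] * 50

estimates, variances = [], []
for _ in range(20):  # step 5: repeat steps 3 and 4
    classes = pseudo_class_draw(posteriors, rng)
    est, var = class_mean_and_var(classes, y, c=1)
    estimates.append(est)
    variances.append(var)

pooled, pooled_var = rubin_combine(estimates, variances)
```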

TRADITIONAL APPROACHES
- The regression model gives us Y | C
- Again, this conditional distribution is what we care about
- Does the distribution of BINGE DRINKING differ across latent classes?
- But, numerous simulation studies have shown that these approaches severely attenuate the estimated relation between C and Y

HOW BAD IS BAD?
Effect Size   Max-Prob        Max-Prob     Pseudo-class    Pseudo-class
              Non-Inclusive   Inclusive    Non-Inclusive   Inclusive
Large         -.156           .041         -.191           .001
Medium        -.083           .028         -.103           .006
Small         -.031           .009         -.039           .001
No effect      .000           .000          .000           .000

CONTEMPORARY APPROACHES
- Contemporary approaches to LCA with distal outcomes are grouped into three main types
  - Weighting by classification error
  - Bayes' theorem based approach
  - Posterior probability improvement

CONTEMPORARY APPROACHES
- Each approach has been shown to work well under certain conditions in recently published simulation studies
- However, there has been no comprehensive overview summarizing the approaches and their assumptions, or integration of take-home messages across simulation studies
- To further complicate matters, not all approaches are implemented in all LCA software packages, and the availability of high-quality standard errors depends on the combination of approach and software package selected

WEIGHTING BY CLASSIFICATION ERROR
- Assign individuals to classes based on responses to indicators only
- Assignment typically uses modal or proportional assignment
- Retain information about the classification error rate
- Treat assignments as known in a subsequent analysis model weighted by the classification error rate
Bakk, Z., & Vermunt, J. K. (in press). Robustness of stepwise latent class modeling with continuous distal outcomes. Structural Equation Modeling: A Multidisciplinary Journal.
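A rough sketch of the weighting idea, under stated assumptions: individuals are modally assigned, the classification error matrix D (rows: true class, columns: assigned class) is estimated from the posterior probabilities, and each individual's contribution to class t is weighted by an entry of the inverse of D. This is one common form of the correction, shown for class-specific outcome means only; it is not necessarily the exact estimator in the cited paper, and every function name and data value is hypothetical.

```python
def classification_error_matrix(posteriors, assigned):
    """D[t][s]: estimated P(assigned to class s | true class t)."""
    k = len(posteriors[0])
    d = [[0.0] * k for _ in range(k)]
    for p, s in zip(posteriors, assigned):
        for t in range(k):
            d[t][s] += p[t]
    return [[v / sum(row) for v in row] for row in d]

def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting for a small square matrix."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def corrected_class_means(posteriors, assigned, y):
    """Class-specific means of y, weighted by the inverse error matrix."""
    k = len(posteriors[0])
    d_inv = invert(classification_error_matrix(posteriors, assigned))
    means = []
    for t in range(k):
        # Weight of individual i for class t: entry (assigned_i, t) of D^-1.
        w = [d_inv[s][t] for s in assigned]
        means.append(sum(wi * yi for wi, yi in zip(w, y)) / sum(w))
    return means

# Toy data constructed so the true class means of y are exactly 0 and 1.
posteriors = [[0.9, 0.1]] * 100 + [[0.2, 0.8]] * 100
assigned = [0] * 100 + [1] * 100                    # modal assignment
y = [1] * 10 + [0] * 90 + [1] * 80 + [0] * 20
naive = [sum(y[:100]) / 100, sum(y[100:]) / 100]    # attenuated: 0.1 and 0.8
corrected = corrected_class_means(posteriors, assigned, y)
```

In this constructed example the naive modal-assignment means (0.1 and 0.8) are pulled toward each other, while the weighted means recover 0 and 1. Real data will not be this clean, and the weights can be negative, which is one reason standard errors need care.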

BAYES' THEOREM BASED APPROACH
- Fit latent class model and include distal outcome as a covariate
- Use Bayes' theorem to reverse the direction of the effect and empirically derive the distribution of the distal outcome given class membership
- Distribution based directly on the observed data or using a smoothed or un-smoothed kernel density estimator
Lanza, S. T., Tan, X., & Bray, B. C. (2013). Latent class analysis with distal outcomes: A flexible model-based approach. Structural Equation Modeling: A Multidisciplinary Journal, 20(1), 1-26.
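For a categorical outcome, the reversal step is a direct application of Bayes' theorem: P(Y = y | C = c) = P(C = c | Y = y) P(Y = y) / P(C = c). The sketch below assumes the class probabilities given Y have already been estimated from the covariate model and that P(Y = y) comes from the observed data; the function name and the numbers are hypothetical.

```python
def outcome_given_class(class_given_y, p_y):
    """Reverse P(C | Y) into P(Y | C) with Bayes' theorem.

    class_given_y[y][c] = P(C = c | Y = y), e.g. from an LCA with Y as covariate;
    p_y[y] = P(Y = y), e.g. the observed proportion.
    Returns result[c][y] = P(Y = y | C = c).
    """
    n_classes = len(class_given_y[0])
    result = []
    for c in range(n_classes):
        joint = [class_given_y[y][c] * p_y[y] for y in range(len(p_y))]
        p_c = sum(joint)  # marginal P(C = c)
        result.append([j / p_c for j in joint])
    return result

# Hypothetical binary outcome (0 = no binge drinking, 1 = binge drinking).
class_given_y = [[0.7, 0.3],   # P(C = c | Y = 0)
                 [0.4, 0.6]]   # P(C = c | Y = 1)
p_y = [0.75, 0.25]
y_given_c = outcome_given_class(class_given_y, p_y)
```

With these made-up inputs, the probability of the outcome is 0.16 in class 0 and 0.40 in class 1.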

POSTERIOR PROBABILITY IMPROVEMENT
- Assign individuals to classes based on responses to indicators and distal outcome
- Assignment typically uses modal assignment or multiple pseudo-class draws
- Then treat assignments as known in subsequent analysis
Bray, B. C., Lanza, S. T., & Tan, X. (2015). Eliminating bias in classify-analyze approaches for latent class analysis. Structural Equation Modeling: A Multidisciplinary Journal, 22(1), 1-11.
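A sketch of what an "inclusive" posterior probability might look like, assuming binary indicators, a categorical distal outcome, and conditional independence of the indicators and the outcome within a class. Under those assumptions the posterior combines the class probabilities given Y with the item-response probabilities. The function name and parameter values are hypothetical.

```python
from math import prod

def inclusive_posterior(x, y, class_given_y, rhos):
    """Posterior class probabilities given indicators x and distal outcome y.

    class_given_y[y][c] = P(C = c | Y = y), from a model that includes Y;
    rhos[c][j] = P(X_j = 1 | C = c).
    """
    joint = [
        g * prod(r if xj == 1 else 1.0 - r for xj, r in zip(x, rho_c))
        for g, rho_c in zip(class_given_y[y], rhos)
    ]
    total = sum(joint)
    return [p / total for p in joint]

# Hypothetical parameters for two classes and three binary indicators.
rhos = [[0.1, 0.2, 0.1], [0.8, 0.7, 0.9]]
class_given_y = [[0.7, 0.3], [0.4, 0.6]]

post_y0 = inclusive_posterior((1, 1, 0), 0, class_given_y, rhos)
post_y1 = inclusive_posterior((1, 1, 0), 1, class_given_y, rhos)
```

Two individuals with the same indicator responses now receive different posterior probabilities when their outcomes differ, so information about Y is carried into the classification step.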

HOW GOOD IS GOOD?
Effect Size   Max-Prob        Max-Prob     Pseudo-class    Pseudo-class
              Non-Inclusive   Inclusive    Non-Inclusive   Inclusive
Large         -.156           .041         -.191           .001
Medium        -.083           .028         -.103           .006
Small         -.031           .009         -.039           .001
No effect      .000           .000          .000           .000

WHAT CAN THIS LOOK LIKE IN REAL LIFE?
Approach                  Low Risk   Peer Risk   Economic Risk   H.Hold & Peer Risk   Multi-Risk
Max-Prob Non-Inc.         .16        .39         .18             .38                  .44
Pseudo-class Non-Inc.     .16        .37         .17             .39                  .41
Max-Prob Inclusive        .11        .42         .12             .60                  .36
Pseudo-class Inclusive    .11        .41         .12             .62                  .36

THE GOOD

THE GOOD
- All contemporary approaches substantially reduce attenuation and can result in unbiased estimates
- Software of some kind is available for all contemporary approaches
  - Latent Gold
  - Mplus
  - SAS

THE BAD

THE BAD
- NOT all approaches work with all types of outcomes
- No single software package accommodates all of the approaches
- All approaches seem to be highly sensitive to violations of model assumptions
- Standard errors for all approaches need work
  - But, bootstrapping seems most promising

THE UGLY

PROS TO APPROACH
[Table: approaches (rows) by criteria (columns); only the text entries below survive, and their column positions are not recoverable]
Criteria:
- Reduces bias in estimates
- High-quality std. errs. readily available
- Can implement in any LCA software
- Measurement model does not change across analyses
- Does not require assigning individuals
- Robust to violations of analysis model assumptions
- Complexity of analysis model unlimited
Entries by approach:
- Traditional: No; No; No
- #1: Weighting: No; No; Latent Gold + Mplus; No; No
- #2: Bayes: No; No; SAS + Mplus; No; No but can be improved; No
- #3: Post Probs: No; No; No; No but can be improved

TAKE-HOME MESSAGES
1. LCA with distal outcomes is a hot topic right now
2. There is a universal best approach, but it can be difficult to implement
3. Assumptions and standard errors are tricky things

ACKNOWLEDGEMENTS
Collaborators: John Dziak, Stephanie Lanza, Michael Russell, Xianming Tan, Jie-Ting Zhang

ACKNOWLEDGEMENTS
Funding: The project described was supported by Award Number P50 DA010075 from the National Institute on Drug Abuse. Content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute on Drug Abuse or the National Institutes of Health.
Data: The project described used data from Add Health, a program project directed by Kathleen Mullan Harris and designed by J. Richard Udry, Peter S. Bearman, and Kathleen Mullan Harris at the University of North Carolina at Chapel Hill, and funded by grant P01-HD31921 from the Eunice Kennedy Shriver National Institute of Child Health and Human Development, with cooperative funding from 23 other federal agencies and foundations. Special acknowledgment is due Ronald R. Rindfuss and Barbara Entwisle for assistance in the original design. Information on how to obtain the Add Health data files is available on the Add Health website (www.cpc.unc.edu/addhealth). No direct support was received from grant P01-HD31921 for this analysis.

POST-DOC OPPORTUNITIES
- Prevention and Methodology Training (PAMT)
  - Integrating prevention science and innovative methodology
  - NIDA-funded T32
- The Methodology Center
- Department of Biobehavioral Health, Penn State
  - Working with Stephanie Lanza
- Institute of Social Research, University of Michigan
  - Working with Megan Patrick

THANK YOU!!
Bethany C. Bray, Ph.D.
Research Assistant Professor
Outreach Director, The Methodology Center
Training Director, Prevention and Methodology Training (PAMT) Program
bcbray@psu.edu
bethanycbray.wordpress.com
methodology.psu.edu/people/bbray

The Methodology Center
204 East Calder Way, Suite 400
State College, PA 16801
mchelpdesk@psu.edu
methodology.psu.edu