George B. Ploubidis. The role of sensitivity analysis in the estimation of causal pathways from observational data. Improving health worldwide

Similar documents
Objective: To describe a new approach to neighborhood effects studies based on residential mobility and demonstrate this approach in the context of

Regression Discontinuity Designs: An Approach to Causal Inference Using Observational Data

Detection of Unknown Confounders. by Bayesian Confirmatory Factor Analysis

Modelling Double-Moderated-Mediation & Confounder Effects Using Bayesian Statistics

Introduction to Multilevel Models for Longitudinal and Repeated Measures Data

A NEW TRIAL DESIGN FULLY INTEGRATING BIOMARKER INFORMATION FOR THE EVALUATION OF TREATMENT-EFFECT MECHANISMS IN PERSONALISED MEDICINE

MS&E 226: Small Data

Current Directions in Mediation Analysis David P. MacKinnon 1 and Amanda J. Fairchild 2

Assessing the impact of unmeasured confounding: confounding functions for causal inference

Recent advances in non-experimental comparison group designs

Identification of population average treatment effects using nonlinear instrumental variables estimators : another cautionary note

Missing Data and Imputation

Identifying Mechanisms behind Policy Interventions via Causal Mediation Analysis

How should the propensity score be estimated when some confounders are partially observed?

Catherine A. Welch 1*, Séverine Sabia 1,2, Eric Brunner 1, Mika Kivimäki 1 and Martin J. Shipley 1

EPI 200C Final, June 4 th, 2009 This exam includes 24 questions.

Mediation Analysis With Principal Stratification

Estimands. EFPIA webinar Rob Hemmings, Frank Bretz October 2017

In this module I provide a few illustrations of options within lavaan for handling various situations.

Sensitivity Analysis in Observational Research: Introducing the E-value

You must answer question 1.

Propensity Score Methods for Estimating Causality in the Absence of Random Assignment: Applications for Child Care Policy Research

Epidemiologic Methods I & II Epidem 201AB Winter & Spring 2002

Data harmonization tutorial:teaser for FH2019

Introduction to Multilevel Models for Longitudinal and Repeated Measures Data

Citation for published version (APA): Ebbes, P. (2004). Latent instrumental variables: a new approach to solve for endogeneity s.n.

Comparing treatments evaluated in studies forming disconnected networks of evidence: A review of methods

1 Introduction. st0020. The Stata Journal (2002) 2, Number 3, pp

Fundamental Clinical Trial Design

Module 14: Missing Data Concepts

Trial designs fully integrating biomarker information for the evaluation of treatment-effect mechanisms in stratified medicine

HPS301 Exam Notes- Contents

2.75: 84% 2.5: 80% 2.25: 78% 2: 74% 1.75: 70% 1.5: 66% 1.25: 64% 1.0: 60% 0.5: 50% 0.25: 25% 0: 0%

Complier Average Causal Effect (CACE)

Causal Mediation Analysis with the CAUSALMED Procedure

Studying the effect of change on change : a different viewpoint

Effects of propensity score overlap on the estimates of treatment effects. Yating Zheng & Laura Stapleton

Causal mediation analysis of observational, population-based cancer survival data

MULTIPLE LINEAR REGRESSION 24.1 INTRODUCTION AND OBJECTIVES OBJECTIVES

16:35 17:20 Alexander Luedtke (Fred Hutchinson Cancer Research Center)

A COMPARISON OF IMPUTATION METHODS FOR MISSING DATA IN A MULTI-CENTER RANDOMIZED CLINICAL TRIAL: THE IMPACT STUDY

University of California, Berkeley

Structural Equation Modeling (SEM)

Instrumental Variables Estimation: An Introduction

Ec331: Research in Applied Economics Spring term, Panel Data: brief outlines

Bias in regression coefficient estimates when assumptions for handling missing data are violated: a simulation study

On the Targets of Latent Variable Model Estimation

Internal Validity and Experimental Design

Kelvin Chan Feb 10, 2015

Donna L. Coffman Joint Prevention Methodology Seminar

Applied Medical. Statistics Using SAS. Geoff Der. Brian S. Everitt. CRC Press. Taylor Si Francis Croup. Taylor & Francis Croup, an informa business

Lecture II: Difference in Difference. Causality is difficult to Show from cross

Moving beyond regression toward causality:

Unit 1 Exploring and Understanding Data

Mediation analysis a tool to move from estimating treatment effect to understanding treatment mechanism

Bayesian and Frequentist Approaches

Estimating drug effects in the presence of placebo response: Causal inference using growth mixture modeling

Logistic regression: Why we often can do what we think we can do 1.

How to analyze correlated and longitudinal data?

P E R S P E C T I V E S

Lecture Outline. Biost 590: Statistical Consulting. Stages of Scientific Studies. Scientific Method

Methods for Addressing Selection Bias in Observational Studies

Political Science 15, Winter 2014 Final Review

MEA DISCUSSION PAPERS

Generalized Estimating Equations for Depression Dose Regimes

What is Multilevel Modelling Vs Fixed Effects. Will Cook Social Statistics

An Introduction to Multiple Imputation for Missing Items in Complex Surveys

DRAFT (Final) Concept Paper On choosing appropriate estimands and defining sensitivity analyses in confirmatory clinical trials

Methods to control for confounding - Introduction & Overview - Nicolle M Gatto 18 February 2015

The Effects of Autocorrelated Noise and Biased HRF in fmri Analysis Error Rates

Help! Statistics! Missing data. An introduction

Treatment effect estimates adjusted for small-study effects via a limit meta-analysis

Identifying Susceptibility in Epidemiology Studies: Implications for Risk Assessment. Joel Schwartz Harvard TH Chan School of Public Health

Stepwise method Modern Model Selection Methods Quantile-Quantile plot and tests for normality

Dan Byrd UC Office of the President

Bayesian graphical models for combining multiple data sources, with applications in environmental epidemiology

The Pretest! Pretest! Pretest! Assignment (Example 2)

Lecture 4: Research Approaches

Russian Journal of Agricultural and Socio-Economic Sciences, 3(15)

investigate. educate. inform.

Class 7 Everything is Related

EPSE 594: Meta-Analysis: Quantitative Research Synthesis

NORTH SOUTH UNIVERSITY TUTORIAL 2

Multilevel analysis quantifies variation in the experimental effect while optimizing power and preventing false positives

Author's response to reviews

Supplement 2. Use of Directed Acyclic Graphs (DAGs)

Measures of Dispersion. Range. Variance. Standard deviation. Measures of Relationship. Range. Variance. Standard deviation.

Exploring the Influence of Particle Filter Parameters on Order Effects in Causal Learning

Examining Relationships Least-squares regression. Sections 2.3

11/18/2013. Correlational Research. Correlational Designs. Why Use a Correlational Design? CORRELATIONAL RESEARCH STUDIES

Lecture Outline. Biost 517 Applied Biostatistics I. Purpose of Descriptive Statistics. Purpose of Descriptive Statistics

Meta-analysis using individual participant data: one-stage and two-stage approaches, and why they may differ

Quantitative Methods. Lonnie Berger. Research Training Policy Practice

Package mediation. September 18, 2009

Bayesian Logistic Regression Modelling via Markov Chain Monte Carlo Algorithm

School of Population and Public Health SPPH 503 Epidemiologic methods II January to April 2019

Exploring the Impact of Missing Data in Multiple Regression

COMMITTEE FOR PROPRIETARY MEDICINAL PRODUCTS (CPMP) POINTS TO CONSIDER ON MISSING DATA

Confounding by indication developments in matching, and instrumental variable methods. Richard Grieve London School of Hygiene and Tropical Medicine

Transcription:

George B. Ploubidis The role of sensitivity analysis in the estimation of causal pathways from observational data Improving health worldwide www.lshtm.ac.uk

Outline Sensitivity analysis Causal Mediation Two examples Advantages Limitations An empirical example Summary

Causal inference Causal inference with observational data is a nearly alchemic task Estimates depend on the model being correctly specified no unmeasured confounders Sequential Ignorability Can t be directly tested Things become more complicated when mediation is of interest

A simple idea Sensitivity analysis is an effective method for probing the plausibility of a nonrefutable assumption (sequential ignorability) The goal of sensitivity analysis is to quantify the degree to which the key assumption of no unmeasured confounders (sequential ignorability) must be violated for a researcher s original conclusion to be reversed

If an inference is sensitive, a slight violation of the assumption may lead to substantively different conclusions Given the importance of sequential ignorability, it has been argued that when observational data are employed some kind of sensitivity analysis should always be carried out Simply put: What happens to my estimated parameters if I simulate the effect of unmeasured confounders?

If only a direct effect is of interest Long history for exposure outcome confounding (as early as the late 50 s) Many techniques for different outcome types, exposure types, methods of primary analysis have been proposed These were generalised and unified by VanderWeele and Arah (Epidemiology, 2011) Stata Episens However, the difference with causal mediation is that indirect effects need to be estimated

Three general scenarios Mediator outcome confounders Exposure mediator confounders Exposure mediator outcome confounders Formal approaches available for the first scenario, but model specific approaches available for the remaining two

Mediator - Outcome U

Exposure mediator U

Exposure mediator - outcome U

When it s about the exposure No formal approach thus far But under certain assumptions we can challenge our parameter estimates We can capitalise on the properties of latent variable measurement models Latent variables capture unobserved heterogeneity Unmeasured confounders can be thought of as sources of unobserved heterogeneity

When the exposure is involved Include latent variable U to represent unmeasured confounder(s) U ~ N (0,1)

The latent variable(s) can represent the effect of one or more confounders The goal is to find out what happens to our estimates under several scenarios that involve latent U It can be shown that under certain assumptions latent variables can imitate the effect of observed confounders

A (relatively) simple example U

A simple LSEM All variables continuous and normally distributed No other confounders other than U Linear associations No interactions (although they could be accommodated) Estimation with MLR

First the parameter estimates without the confounder 0.396 0.208 0.589 0.192 Y on X via W = 0.113 ( 0.124 0.101)

Here comes the (observed) confounder! 0.396 0.208 0.433 0.121 0.369 0.260 0.228 OC Y on X via W = 0.052 ( 0.063 0.044)

Can a latent variable do the same? It can be shown that if we fix the intercepts, slopes (loadings) and variance of the latent variable according to the estimated parameters we can obtain the estimates from the previous model X = Ax + λxu + ex W = AW + λwu + ew Y = AY + λyu + ey

Estimates with the latent confounder 0.396 0.208 0.432 0.124 0.369 0.260 0.228 U Y on X via W = 0.053 ( 0.064 0.042)

Two possibilities a) The researcher suspects a set of unknown confounders b) A well known confounder, or a set of well known confounders have not been measured

Frequentist approach By specifying values for the effect of the confounder(s), the researcher will be able to test several scenarios of weak/moderate/strong confounding An iterative process The results of the trials can be quantified

Weak/No confounding 0.396 0.208 0.522 0.181 0.050 0.050 0.050 U Y on X via W = 0.094 ( 0.108 0.086)

Strong confounding 0.396 0.207 0.064 0.031 0.700 0.700 0.700 U Y on X via W = 0.001 ( 0.011 0.022)

A scree plot Finding the tipping point 0,12 Y on X via W 0,1 0,08 0,06 0,04 0,02 0 0,1 6E 16 0,1 0,2 0,3 0,4 0,5 0,6 0,7 U

Let s go Bayesian U is a well known confounder It s associations with X,W and Y have been quantified in the existing literature Hence, we use informative priors for the parameters that link U with X,W and Y UX ~ N (0.37, 0.01) UW ~N (0.26, 0.01) UY ~ N ( 0.23, 0.01)

0.396 0.207 0.476 0.173 0.433 0.244 0.277 U Y on X via W = 0.082 ( 0.099 0.014)

Limitations Not non parametrically identified (i.e. results depend on the distribution of the latent confounder) No stopping rule can t be falsified Latent confounder can only be normally distributed Discrete latent variables possible, but a lot more work is needed

Advantages Properties of LVMs are well known Software availability DAG theory can be used to inform the sensitivity analyses

Mediator Outcome Confounding Medsens (Stata, R, Mplus) Employs the correlation (Rho) between the residual variances (errors) of the models for the mediator and outcome Effects are computed given different fixed values of the residual covariance. The proposed sensitivity analysis asks the question of how large does Rho have to be for the mediation effect (Average Causal Mediation Effect ACME) to disappear

Medsens example U 0.34 0.42 0.063 X to Y via M = 0.14 (0.11 0.17)

Medsens Results ACME( ) Average mediation effect -.5 0.5 1-1 -.5 0.5 1 Sensitivity parameter: 95% Conf. Interval

Limitations Assumes all confounding due to Rho Only available for mediator outcome associations Accommodates continuous mediator and continuous/ binary outcome, and binary mediator and continuous outcome Assumes normal distribution of error terms

Advantages No distributional assumption for the unmeasured confounder Can accommodate binary, ordinal outcomes Quintile regression (for the outcome model) also available (only in R) Easy to use software

Summary Under certain assumptions latent variables have the potential to imitate the effect of unmeasured confounders Medsens is a very useful tool to test the effects of mediator outcome confounders Both approaches mostly effective in research areas (like the study of health inequalities) with strong theoretical underpinnings that can inform parameter specification/interpretation

Early life SEP An empirical example Later life SEP Later life Health Early Life Health

Early life SEP Chains of risk Later life SEP Later life Health

Early life SEP Later life Health Childhood/Early life

Early life SEP Accumulation Later life SEP Later life Health

Social drift Later life SEP Later life Health Early Life Health

Aims The major aim of the present study is to test the relative contribution of these hypotheses to later life health inequalities A first step in understanding the mechanism that underlies the association between SEP and health We need to assign reliable parameters to all these hypotheses it s all about mediation!!

Sample We used data from the English Longitudinal Study of Ageing (ELSA), a nationally representative multi purpose sample of the population aged 50 and over living in England We analysed a partially incomplete dataset (N = 7758), in which participants were included if they had at least one non missing observation in early life SEP indicators (ELSA Life history interview) Stratified by gender and age group (50 64, 65 74, 75+)

Statistical modelling The specification of each of the latent dimensions was carried out with models appropriate for combinations of binary, ordinal and continuous indicators A LSEM was then estimated in order to jointly model the predictors, mediators and health outcomes (adjusted for confounders) Preliminary results showed no evidence of interactions Missing data on Wave 4 mediators and health outcomes handled with FIML assuming a MAR mechanism Estimation with MLR in Mplus 6.12

Early life SEP LSEM Parameterisation a b Later life SEP c Later life health d Early Life Health Childhood/Early life = a Chains of risk = b*c Accumulation = a + c Social drift = d*c Total effect of Early life SEP = a + (b*c)

Results - Physical Health

Sensitivity analysis Cool parameter estimates, but sequential ignorability implies no unmeasured confounders Sufficiently approximated* for the later life SEP health association, but not for other parts of the diagram No data on parental characteristics such as cognitive ability and health status Results could reflect the effect of unmeasured parental characteristics

Sensitivity Analysis Strong confounding scenario Early life SEP 0.15 0.3 U 0.2 Later life SEP Physical health Fibrinogen 0.3 Early Life Health

Physical health sensitivity analysis

*Sufficiently approximated? We have included possible confounders of the later life SEP physical health/fibrinogen associations, but how about running some more sensitivity analyses?

Medsens Results ACME( ) Average mediation effect -.5 0.5 1-1 -.5 0.5 1 Sensitivity parameter: 95% Conf. Interval

So? Observational data Causal inference a nearly alchemic task However, sensitivity analyses where confounders were simulated supported our results but bias due to unknown unmeasured confounders cannot be ruled out

Summary Sensitivity analysis not a substitute for randomisation Be aware of the assumptions and limitations of all sensitivity analysis approaches But especially when estimation of indirect effects (mediation) is required... Always carry out sensitivity analysis!!

Thank you!