How should the propensity score be estimated when some confounders are partially observed?

Similar documents
Module 14: Missing Data Concepts

Strategies for handling missing data in randomised trials

ethnicity recording in primary care

Dr Elizabeth Williamson. Medical Statistics Department Faculty of Epidemiology & Population Health. CSM Big Data Symposium, 2017

Analysis of TB prevalence surveys

Bayesian approaches to handling missing data: Practical Exercises

Propensity Score Methods to Adjust for Bias in Observational Data SAS HEALTH USERS GROUP APRIL 6, 2018

Assessing the impact of unmeasured confounding: confounding functions for causal inference

Donna L. Coffman Joint Prevention Methodology Seminar

Bias in regression coefficient estimates when assumptions for handling missing data are violated: a simulation study

Imputation approaches for potential outcomes in causal inference

Help! Statistics! Missing data. An introduction

Logistic Regression with Missing Data: A Comparison of Handling Methods, and Effects of Percent Missing Values

George B. Ploubidis. The role of sensitivity analysis in the estimation of causal pathways from observational data. Improving health worldwide

What to do with missing data in clinical registry analysis?

Exploring the Impact of Missing Data in Multiple Regression

Multiple imputation for handling missing outcome data when estimating the relative risk

PubH 7405: REGRESSION ANALYSIS. Propensity Score

Propensity Score Methods for Estimating Causality in the Absence of Random Assignment: Applications for Child Care Policy Research

Using Ensemble-Based Methods for Directly Estimating Causal Effects: An Investigation of Tree-Based G-Computation

Bayesian graphical models for combining multiple data sources, with applications in environmental epidemiology

Missing Data and Imputation

Analysis Strategies for Clinical Trials with Treatment Non-Adherence Bohdana Ratitch, PhD

Catherine A. Welch 1*, Séverine Sabia 1,2, Eric Brunner 1, Mika Kivimäki 1 and Martin J. Shipley 1

Propensity Scores; Generalising Results from RCT s

Sensitivity Analysis in Observational Research: Introducing the E-value

Methods to control for confounding - Introduction & Overview - Nicolle M Gatto 18 February 2015

Estimating Direct Effects of New HIV Prevention Methods. Focus: the MIRA Trial

Alternative indicators for the risk of non-response bias

Effects of propensity score overlap on the estimates of treatment effects. Yating Zheng & Laura Stapleton

Master thesis Department of Statistics

Dylan Small Department of Statistics, Wharton School, University of Pennsylvania. Based on joint work with Paul Rosenbaum

A COMPARISON OF IMPUTATION METHODS FOR MISSING DATA IN A MULTI-CENTER RANDOMIZED CLINICAL TRIAL: THE IMPACT STUDY

BIOSTATISTICAL METHODS

Appendix 1. Sensitivity analysis for ACQ: missing value analysis by multiple imputation

Confounding by indication developments in matching, and instrumental variable methods. Richard Grieve London School of Hygiene and Tropical Medicine

Propensity Score Methods for Causal Inference with the PSMATCH Procedure

Recent developments for combining evidence within evidence streams: bias-adjusted meta-analysis

Comparison of imputation and modelling methods in the analysis of a physical activity trial with missing outcomes

Regression Discontinuity Designs: An Approach to Causal Inference Using Observational Data

Propensity Score Analysis Shenyang Guo, Ph.D.

The analysis of tuberculosis prevalence surveys. Babis Sismanidis with acknowledgements to Sian Floyd Harare, 30 November 2010

Methods for treating bias in ISTAT mixed mode social surveys

Practice of Epidemiology. Strategies for Multiple Imputation in Longitudinal Studies

Comparing treatments evaluated in studies forming disconnected networks of evidence: A review of methods

Manitoba Centre for Health Policy. Inverse Propensity Score Weights or IPTWs

Declaration of interests. Register-based research on safety and effectiveness opportunities and challenges 08/04/2018

PSI Missing Data Expert Group

Methods for Addressing Selection Bias in Observational Studies

Comparisons of Dynamic Treatment Regimes using Observational Data

16:35 17:20 Alexander Luedtke (Fred Hutchinson Cancer Research Center)

Selected Topics in Biostatistics Seminar Series. Missing Data. Sponsored by: Center For Clinical Investigation and Cleveland CTSC

Moving beyond regression toward causality:

Inclusive Strategy with Confirmatory Factor Analysis, Multiple Imputation, and. All Incomplete Variables. Jin Eun Yoo, Brian French, Susan Maller

Propensity scores: what, why and why not?

Published online: 27 Jan 2015.

An Introduction to Multiple Imputation for Missing Items in Complex Surveys

BLISTER STATISTICAL ANALYSIS PLAN Version 1.0

Recent advances in non-experimental comparison group designs

Analysis methods for improved external validity

An Introduction to Propensity Score Methods for Reducing the Effects of Confounding in Observational Studies

By: Mei-Jie Zhang, Ph.D.

Proper analysis in clinical trials: how to report and adjust for missing outcome data

Accuracy of Range Restriction Correction with Multiple Imputation in Small and Moderate Samples: A Simulation Study

Approaches to Improving Causal Inference from Mediation Analysis

Instrumental Variables Estimation: An Introduction

Missing by Design: Planned Missing-Data Designs in Social Science

Complier Average Causal Effect (CACE)

Maintenance of weight loss and behaviour. dietary intervention: 1 year follow up

Challenges of Observational and Retrospective Studies

Lecture 1: Introduction to Personalized Medicine. Donglin Zeng, Department of Biostatistics, University of North Carolina

Quantitative Methods. Lonnie Berger. Research Training Policy Practice

In this module I provide a few illustrations of options within lavaan for handling various situations.

Time-varying confounding and marginal structural model

Introduction to Observational Studies. Jane Pinelis

EPI 200C Final, June 4 th, 2009 This exam includes 24 questions.

Analyzing diastolic and systolic blood pressure individually or jointly?

Graphical Representation of Missing Data Problems

Matching methods for causal inference: A review and a look forward

THE USE OF NONPARAMETRIC PROPENSITY SCORE ESTIMATION WITH DATA OBTAINED USING A COMPLEX SAMPLING DESIGN

Measurement Error in Nonlinear Models

Propensity score methods : a simulation and case study involving breast cancer patients.

Identifying Peer Influence Effects in Observational Social Network Data: An Evaluation of Propensity Score Methods

Section on Survey Research Methods JSM 2009

Propensity Score Analysis to compare effects of radiation and surgery on survival time of lung cancer patients from National Cancer Registry (SEER)

SUNGUR GUREL UNIVERSITY OF FLORIDA

A Potential Outcomes View of Value-Added Assessment in Education

Practical Statistical Reasoning in Clinical Trials

Current Directions in Mediation Analysis David P. MacKinnon 1 and Amanda J. Fairchild 2

TRIPLL Webinar: Propensity score methods in chronic pain research

Bayesian methods for combining multiple Individual and Aggregate data Sources in observational studies

Sequential nonparametric regression multiple imputations. Irina Bondarenko and Trivellore Raghunathan

SESUG Paper SD

CAUSAL INFERENCE IN HIV/AIDS RESEARCH: GENERALIZABILITY AND APPLICATIONS

Applied Medical. Statistics Using SAS. Geoff Der. Brian S. Everitt. CRC Press. Taylor Si Francis Croup. Taylor & Francis Croup, an informa business

Mediation Analysis With Principal Stratification

Running head: SELECTION OF AUXILIARY VARIABLES 1. Selection of auxiliary variables in missing data problems: Not all auxiliary variables are

On Missing Data and Genotyping Errors in Association Studies

Regression Methods in Biostatistics: Linear, Logistic, Survival, and Repeated Measures Models, 2nd Ed.

Michael L. Johnson, PhD, 1 William Crown, PhD, 2 Bradley C. Martin, PharmD, PhD, 3 Colin R. Dormuth, MA, ScD, 4 Uwe Siebert, MD, MPH, MSc, ScD 5

Transcription:

How should the propensity score be estimated when some confounders are partially observed? Clémence Leyrat 1, James Carpenter 1,2, Elizabeth Williamson 1,3, Helen Blake 1 1 Department of Medical statistics, LSHTM, London 2 MRC Clinical Trials Unit at UCL, London 3 Farr Institute of Health Informatics, London MRC project grant MR/M013278/1 Missing in PS analysis CSM Seminar - September 30th, 2016 1/44

Content Propensity score Missing 3 ways to handle missing for PS analysis Complete case analysis The missingness pattern approach Multiple imputation Real life example Recommendations/conclusion Missing in PS analysis CSM Seminar - September 30th, 2016 2/44

The problem of confounding Observational studies are a useful source of information to establish causal effects of a treatment/exposure on a health-related outcome Because of the lack of randomisation, groups may be unbalanced = Risk of confounding bias T: treatment Y: outcome X: confounder Propensity score PS and missing Missingness mechanisms Propensity scores (PS) proposed in 1983 to balance groups in observational studies Missing in PS analysis CSM Seminar - September 30th, 2016 3/44

The propensity score The PS is the individual s probability of receiving the treatment rather than the control conditionally to their baseline characteristics e(x) = P(T = 1 X = x) The true value of the PS is unknown but can be estimated: = individual predictions from a logistic model Covariates to be included: true confounders Propensity score PS and missing Missingness mechanisms risk factors Missing in PS analysis CSM Seminar - September 30th, 2016 4/44

Assumptions required 3 assumptions required to estimate unbiased causal effects using the PS: Positivity: each patient has a non null probability of receiving the treatment or the control SITA (conditional exchangeability): no unmeasured confounders SUTVA (consistency): the potential outcome for a patient is not affected by the treatment received by the other patients the treatment has always the same effect on a given patient Propensity score PS and missing Missingness mechanisms Under these assumptions, the PS is a balancing score Missing in PS analysis CSM Seminar - September 30th, 2016 5/44

PS-based approaches Propensity score PS and missing Missingness mechanisms Missing in PS analysis CSM Seminar - September 30th, 2016 6/44

The issue of missing If some confounders are partially observed, the PS cannot be estimated for individuals without a complete record Propensity score PS and missing Missingness mechanisms Missing in PS analysis CSM Seminar - September 30th, 2016 7/44

Missingness mechanisms The and analysis strategy depend on the association between the missing value and observed and unobserved variables, the missingness mechanism Following Rubin s taxonomy, missing confounders can be: MCAR (missing completely at random) MAR (missing at random) MNAR (missing not at random) Propensity score PS and missing Missingness mechanisms What can be done? Missing in PS analysis CSM Seminar - September 30th, 2016 8/44

Strategies investigated Focus on 3 approaches (with a binary outcome) for IPTW: Complete case analysis The missingness pattern approach Multiple imputation For each of them: What are the assumptions required? What is the best way to implement the method? Complete cases The missingness pattern approach Multiple imputation Missing in PS analysis CSM Seminar - September 30th, 2016 9/44

In practice A quick check of the literature showed that, among 132 identified papers: 46% used complete case analysis 5% used the missingness pattern approach 36% used multiple imputation A systematic review would be needed for a better overview of the different methods implemented in practice. Complete cases The missingness pattern approach Multiple imputation Missing in PS analysis CSM Seminar - September 30th, 2016 10/44

Complete case analysis Complete cases Logistic regression Methods Results Summary The missingness pattern approach Multiple imputation Missing in PS analysis CSM Seminar - September 30th, 2016 11/44

CC for multivariable regression Complete case (CC) analysis: analysis on the subgroup of patients with complete records: Loss of efficiency because of a loss in sample size Risk of bias of the treatment effect estimate CC analysis leads to an unbiased estimate: when are MCAR when missingness does not depend on Y and T in the context of multivariable logistic regression Complete cases Logistic regression Methods Results Summary The missingness pattern approach Multiple imputation Missing in PS analysis CSM Seminar - September 30th, 2016 12/44

CC for multivariable regression (2) J.W. Bartlett, O. Harel, and J.R. Carpenter. Asymptotically Unbiased Estimation of Exposure Odds Ratios in Complete Records Logistic Regression. American Journal of Epidemiology. 2015 Oct 15;182(8):730-6. Are these results generalizable to PS analysis? Complete cases Logistic regression Methods Results Summary The missingness pattern approach Multiple imputation Missing in PS analysis CSM Seminar - September 30th, 2016 13/44

Setting: n=10000, binary outcome Y, binary treatment T, and two binary confounders C 1 and C 2 R is the complete case indicator (R=1 if complete case, 0 otherwise) Comparison of 3 approaches: Multivariable logistic regression to estimate the conditional OR Multivariable logistic regression to estimate the marginal OR IPTW to estimate the marginal OR Complete cases Logistic regression Methods Results Summary The missingness pattern approach Multiple imputation Missing in PS analysis CSM Seminar - September 30th, 2016 14/44

Results Bias of log(or). ORcond=2 Complete cases Logistic regression Methods Results Summary The missingness pattern approach Multiple imputation Missing in PS analysis CSM Seminar - September 30th, 2016 15/44

Results (2) Complete cases Logistic regression Methods Results Summary The missingness pattern approach Multiple imputation Missing in PS analysis CSM Seminar - September 30th, 2016 16/44

CC: a bad idea CC not suitable for the estimation of marginal effects (both with PS and logistic regression) unless: MCAR mechanism missingness not associated with both Y and Z AND under H0!! In the literature CC seems to be the most common approach for PS analysis... What else can be done? Complete cases Logistic regression Methods Results Summary The missingness pattern approach Multiple imputation Missing in PS analysis CSM Seminar - September 30th, 2016 17/44

The missingness pattern approach Helen Blake s PhD research Complete cases The missingness pattern approach The method msita Summary Multiple imputation Missing in PS analysis CSM Seminar - September 30th, 2016 18/44

The missingness pattern approach Proposed by Rosenbaum and Rubin (1984) and D Agostino and Rubin (2000) Definition of a generalized PS estimated within each pattern of missingness Complete cases The missingness pattern approach The method msita Summary Multiple imputation Relies on an additional assumption: an extension of SITA Missing in PS analysis CSM Seminar - September 30th, 2016 19/44

The SITA extension assumption Let X the vector of baseline confounders be split in X = {X obs, X mis } and R the vector of the missingness indicators for the confounders. Classical SITA assumption: the potential outcomes and the treatment assignment are independent given the measured characteristics (no unmeasured confounders): SITA extension (Mattei): (Y 0, Y 1 ) T X (Y 0, Y 1 ) T X, R and either X mis T X obs, R or X mis (Y 0, Y 1 ) X obs, R Complete cases The missingness pattern approach The method msita Summary Multiple imputation Missing in PS analysis CSM Seminar - September 30th, 2016 20/44

The assumption in practice For the assumption to hold, X can be a confounder when observed but not when missing Complete cases The missingness pattern approach The method msita Summary Multiple imputation Assumption required because the generalised PS balances the observed part of the covariates only (but not the missing part) Missing in PS analysis CSM Seminar - September 30th, 2016 21/44

Link with Rubin s taxonomy It s been shown that: if the SITA assumption extension does not hold: invalid inferences even under MCAR if the SITA assumption extension holds: valid inferences even under some MNAR mechanisms Promising approach that requires further investigation to be applicable in a variety of situations combining MAR and MNAR Complete cases The missingness pattern approach The method msita Summary Multiple imputation Missing in PS analysis CSM Seminar - September 30th, 2016 22/44

Summary The missingness pattern approach: can lead to valid inferences if the SITA assumption extension holds could be of interest for some MNAR mechanisms is quite straightforward However: requires a large sample size difficulties arise with a lot of patterns = Pooling? has a specific applicability in its present form = Combining MPA with other methods? Complete cases The missingness pattern approach The method msita Summary Multiple imputation Missing in PS analysis CSM Seminar - September 30th, 2016 23/44

Multiple imputation Complete cases The missingness pattern approach Multiple imputation Principle Strategies Literature Balancing properties Consistency Summary Missing in PS analysis CSM Seminar - September 30th, 2016 24/44

Multiple imputation for PS Aim: create M complete sets to estimate the PS for each participant and apply Rubin s rules to obtain a treatment effect estimate Two key questions: Should the outcome be included in the imputation model? = PS paradigm Missing paradigm How to apply Rubin s rules? = pooled treatment effect or pooled PS? Complete cases The missingness pattern approach Multiple imputation Principle Strategies Literature Balancing properties Consistency Summary Missing in PS analysis CSM Seminar - September 30th, 2016 25/44

What should we combine? Complete cases The missingness pattern approach Multiple imputation Principle Strategies Literature Balancing properties Consistency Summary ˆθ: treatment effect estimate Missing in PS analysis CSM Seminar - September 30th, 2016 26/44

In the literature... Existing studies: Mitra & Reiter 1 : for PS matching, MIps>MIte but opposite conclusion for IPTW = Outcome not included in the imputation model Hill 2 : MIte>MIps and outcome in the imputation model = PS matching only but no theoretical arguments about the validity of these estimators when are MAR 1 Mitra R, Reiter JP. A comparison of two methods of estimating propensity scores after multiple imputation. SMMR. 2016 Feb;25(1):188-204. 2 Hill J. Reducing Bias in Treatment Effect Estimation in Observational Studies Suffering from Missing Data; 2004. ISERP working paper 04-01. Complete cases The missingness pattern approach Multiple imputation Principle Strategies Literature Balancing properties Consistency Summary Missing in PS analysis CSM Seminar - September 30th, 2016 27/44

Balancing properties Are the 3 estimated PS balancing scores? = requirement for valid inferences For MIte, we showed that within each imputed set: For MIps and MIpar: X obs Z e(x obs, X (k) m ) X (k) m Z e(x obs, X (k) m ). the pooled PS is not a function of the covariates the true PS is not a function of the estimated PS = the pooled PS is not a balancing score Complete cases The missingness pattern approach Multiple imputation Principle Strategies Literature Balancing properties Consistency Summary Missing in PS analysis CSM Seminar - September 30th, 2016 28/44

Consistency Consistency comes from the ability of the PS to balance groups: MIps and MIpar are not consistent estimators MIte: Seaman and White: the consistent estimator for an infinite number of imputations In practice: how well these 3 estimators perform? Complete cases The missingness pattern approach Multiple imputation Principle Strategies Literature Balancing properties Consistency Summary Missing in PS analysis CSM Seminar - September 30th, 2016 29/44

Summary Different ways to apply Rubin s rules after MI of the partially observed covariates for IPTW = MIte only is a consistent estimator of the treatment effect (MAR mechanism) results found in the literature are not clear so need to empirically assess these methods: variance estimation? outcome in the imputation model? strength of the bias for MIps and MIpar Complete cases The missingness pattern approach Multiple imputation Principle Strategies Literature Balancing properties Consistency Summary Missing in PS analysis CSM Seminar - September 30th, 2016 30/44

Data generation Analysis Results Missing in PS analysis CSM Seminar - September 30th, 2016 31/44

plan Observational : estimation of the effect of a binary treatment T on a binary outcome Y (RR), n=5000 3 confounders (2 with 30% of missing) Multiple imputation: Chained equations (FCS) M=10 Imputation model: X 1, X 2, X 3, T, Y Y: binary outcome T: treatment R: missingness indicator Xobs: observed confounders Xmiss: missing confounders Data generation Analysis Results Missing in PS analysis CSM Seminar - September 30th, 2016 32/44

Analysis strategies IPTW estimator: Estimation of the weighted marginal proportions ˆP 0 and ˆP 1 and RR = ˆP 1 ˆP 0 Use of Williamson et al. 1 variance estimator for IPTW (two-step estimator) Compared approaches: Complete case: exclusion of participants with partial Missingness pattern: 4 different PS models MIte: the M IPTW estimates of the treatment effect are pooled according Rubin s rules MIps: 1 IPTW estimate obtained from the average PS Data generation Analysis Results MIpar: 1 IPTW estimate obtained from the PS of the average covariates 1 Williamson EJ, Forbes A, White IR. Variance reduction in randomised trials by inverse probability weighting using the propensity score. Stat Med. 2014 Feb;33(5):721-737. Missing in PS analysis CSM Seminar - September 30th, 2016 33/44

Results: bias RR=1, outcome predictor of missingness Similar results with: - RR=2 - missingness not associated with Y Data generation Analysis Results Bias Covariate balance Coverage rate The outcome must be included in the imputation model Pooling the treatment effects (MIte) performs best Missing in PS analysis CSM Seminar - September 30th, 2016 34/44

Balancing properties Standardized differences (in%) between groups: SD = 100 X 1 X 0 s 0 2+s2 1 2 Data generation Analysis Results Bias Covariate balance Coverage rate PS obtained from MP, MIps and MIpar do not balance the missing part of the covariates Missing in PS analysis CSM Seminar - September 30th, 2016 35/44

Coverage rate Data generation Analysis Results Bias Covariate balance Coverage rate Missing in PS analysis CSM Seminar - September 30th, 2016 36/44

Real life example Context Balance Results Missing in PS analysis CSM Seminar - September 30th, 2016 37/44

Data: THIN base (records from GP in the UK) Population: focus on patients with a pneumonia episode, n=9073 (Douglas et al.) Intervention: statins vs no statins Outcome: death within 6 months Confounders: 21 variables (demographic, medical history, treatments) Context Balance Results Missing : body mass index (19.2%), smoking status (6.2%) and alcohol consumption (18.5%) Missing in PS analysis CSM Seminar - September 30th, 2016 38/44

: PS distribution (CC) Context Balance Results Missing in PS analysis CSM Seminar - September 30th, 2016 39/44

: balance Context Balance Results CC: complete case; MP: missingness pattern; MIte: treatment effects combined after multiple imputation; MIps: propensity scores combined after multiple imputation; MIpar: propensity score parameters combined after multiple imputation Missing in PS analysis CSM Seminar - September 30th, 2016 40/44

: results CC: complete case; MP: missingness pattern; MIte: treatment effects combined after multiple imputation; MIps: propensity scores combined after multiple imputation; MIpar: propensity score parameters combined after multiple imputation; RR: relative risk The 3 partially observed covariates are not strong confounders Context Balance Results MP: Need to pool some patterns because of small sample and SITA assumption extension unlikely to be valid Similar results for MI when increasing artificially the missingness rate Missing in PS analysis CSM Seminar - September 30th, 2016 41/44

Recommendations Complete case analysis: bad idea, unless MCAR mechanism Multiple imputation: good statistical properties under a MAR mechanism the treatment effects should be pooled rather than the PSs the outcome must be included in the imputation model The missingness pattern approach: good statistical properties if missing values are not confounders promising technique for MNAR mechanisms Missing in PS analysis CSM Seminar - September 30th, 2016 42/44

Future work Multiple imputation: to the issue of compatibility between the substantive, PS and imputation models to how to assess covariate balance after MI The missingness pattern approach: to combine MPA with MI when both MAR and MNAR mechanisms to how to pool patterns when small sample size to develop a variance estimator Missing in PS analysis CSM Seminar - September 30th, 2016 43/44

Thank you! Missing in PS analysis CSM Seminar - September 30th, 2016 44/44