Getting ready for propensity score methods: Designing non-experimental studies and selecting comparison groups

Elizabeth A. Stuart
Johns Hopkins Bloomberg School of Public Health, Departments of Mental Health and Biostatistics
estuart@jhsph.edu | www.biostat.jhsph.edu/~estuart
Funding thanks to NIMH 1K25MH83646-01
September 6, 2012

Outline
1. Introduction
2. Considerations
   - The treatment assignment mechanism
   - Comparability
   - Measurement
3. Possible solutions
   - Multiple control groups
   - Internal and external controls
   - Historical data
   - Instrumental variables/natural experiments
4. Conclusions

Where does the data come from?
- When a study is written up, the source of the data is often taken as a given: "We compared treated and comparison individuals..."
- But where the data comes from, and what it contains, is often the most important step
- There is increasing availability of public-use, large-scale datasets; can they be useful?
- This talk outlines some of the things to think about at the initial design stage of a study

The context for my talk
- We have data on a treated/exposed group
- We want to estimate the effect of that treatment, relative to some comparison condition (which may not yet be fully defined)
- The causal effect is a comparison of potential outcomes between the two treatment conditions
- We can't just look, e.g., at change over time in the treatment group
- We don't have a clear comparison group identified, and we also don't have an obvious instrumental variable
- So we hope to use a propensity score method to estimate the treatment effect
- In that case, what do you need to think about up front? (Lots of connections with ideas discussed by Dr. Cook and Dr. Thoemmes)

Careful design of observational studies
- Broader discussions of threats to validity and careful design of observational studies: Shadish, Cook, and Campbell (2002); Rosenbaum (2009, 2011; "design sensitivity")
- Think creatively about comparisons; make crisp comparisons
  - e.g., Kuramoto et al. (2011): in estimating the effect of having a parent die by suicide, compare to people whose parents died in an accident
  - e.g., Rosenbaum (2009): when looking at effects of road features on accidents, compare the conditions where the accident happened with the conditions for the same car 1 mile before the accident

Understanding the treatment assignment process
- To have confidence in causal statements, we need to understand the assignment mechanism
- What led some people to get the treatment and others not? Who made the decision, and what was it based on?
- Do we observe those characteristics? Can we adjust for them?
- Propensity score approaches will be more appropriate if we observe the factors that went into this decision
- Do qualitative interviews with the decision makers: a great setting for a mixed methods study?

Comparability of treated and comparison subjects
- We want to identify comparison subjects likely to be similar to the treated subjects, ideally on both observed and unobserved characteristics
- Careful selection can reduce worries about unobserved confounding
  - e.g., draw comparison subjects from the same geographic area
  - e.g., require that they satisfy the treatment's eligibility criteria
- (Think back to the car accident example)

Availability of appropriate measures
- Most propensity score methods rely on the assumption of no unmeasured confounders: given the observed characteristics, there are no unobserved variables related to both treatment and outcome
- The methods won't work very well if all you have are basic demographics
- Most important: pre-treatment measures of the outcomes
- And of course the measures need to be the same across the treatment and comparison groups; having lots of intensive data on just the treatment group won't help, you need it on the comparison subjects too
- Steiner et al. (2010)

Multiple control groups and/or biases of a known direction
- Consider a setting where treated individuals must meet some eligibility criteria
- Worry: those not eligible are likely quite different from those who are (e.g., less disadvantaged)
- But those who are eligible and don't participate may be quite different from those who do participate (e.g., less motivated)
- So what to do? Potential solution: multiple comparison groups
- This works when you can formulate hypotheses about the ordering of outcomes (i.e., hypotheses about the directions of any biases), as in the sketch below
  - e.g., the treatment group should perform better than those who were eligible but didn't participate, but worse than (or the same as) those who weren't eligible
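A minimal sketch of what checking such an ordering hypothesis might look like, using made-up data and group labels (not from the talk):

```python
import pandas as pd

# Toy data, one row per person; group labels and outcome values are invented for illustration.
df = pd.DataFrame({
    "group": ["treated"] * 3 + ["eligible_nonparticipant"] * 3 + ["ineligible"] * 3,
    "outcome": [10, 12, 11, 7, 8, 9, 13, 14, 12],
})

means = df.groupby("group")["outcome"].mean()

# Bracketing hypothesis from the slide: the treated group should do better than
# eligible non-participants, but no better than the (less disadvantaged) ineligible group.
consistent = means["eligible_nonparticipant"] <= means["treated"] <= means["ineligible"]
print(means)
print("Ordering consistent with hypothesized bias directions:", bool(consistent))
```

If the observed ordering violates the hypothesized bias directions, that is a warning sign about hidden bias rather than evidence for a particular effect size.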

Rosenbaum (2009), quoting Bernanke (1995): Did adherence to the gold standard lengthen the Great Depression?
"... countries that left gold relatively early had largely recovered from the Depression, while Gold Bloc countries remained at low levels of output and employment... Of course, in practice the decision about whether to leave the gold standard was endogenous to a degree... [Any] bias created by endogeneity of the decision to leave gold would appear to go the wrong way... The presumption is that economically weaker countries... would be the first to devalue or abandon gold. Yet the evidence is that countries leaving gold recovered substantially more rapidly and vigorously than those who did not. Hence, any correction for endogeneity... should tend to strengthen the association of economic expansion and the abandonment of gold."

SDDAP: Using internal and external controls
- Estimating the effect of a school-wide dropout prevention program (SDDAP)
- We have kids in 5 treated schools, as well as data on kids from paired comparison schools in the same communities
- The catch: the kids in the treated schools and their paired comparison schools don't always look similar on student characteristics
- Trade-off between good matches on individual-level characteristics vs. on local area-level characteristics

Stuart and Rubin (2008) use a combination of local and non-local matches
- Get a local match when one looks good, but get a non-local match if no good local match is available (see the sketch below)
- Can also adjust for the local/non-local difference by comparing local and non-local comparison students
- Can also estimate the optimal number of local matches
- The solution depends on how close the local groups are and on the relative importance of the individual-level covariates and the area indicator in terms of the outcome
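A rough sketch of the "use a local match when it is good enough, otherwise fall back to a non-local match" idea; the caliper rule, function name, and numbers here are illustrative assumptions, not the actual procedure in Stuart and Rubin (2008):

```python
import numpy as np

def choose_match(treated_score, local_scores, nonlocal_scores, caliper=0.1):
    """Prefer the closest local comparison unit if it falls within the caliper
    on the propensity score; otherwise fall back to the closest non-local unit.
    Returns a tuple ("local" or "nonlocal", index of the chosen unit)."""
    local_dist = np.abs(local_scores - treated_score)
    best_local = int(np.argmin(local_dist))
    if local_dist[best_local] <= caliper:
        return "local", best_local
    nonlocal_dist = np.abs(nonlocal_scores - treated_score)
    return "nonlocal", int(np.argmin(nonlocal_dist))

# Example: a treated student with propensity score 0.42
kind, idx = choose_match(0.42, np.array([0.10, 0.55, 0.46]), np.array([0.41, 0.43]))
print(kind, idx)  # -> "local" 2: the local candidate at 0.46 is within the caliper
```

The caliper encodes the trade-off on the previous slide: how much individual-level closeness you are willing to give up in order to keep the match in the same local area.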

Genzyme: Using historical data
- FDA submission for approval of Fabrazyme; Fabry disease is very rare
- The goal: extend a double-blind randomized trial to look at longer-term effects
- The challenge: no existing treatment, so for ethical reasons the drug became open-label after showing initial promise
- The solution: use historical patients as comparison subjects (drawing on a historical patient registry)

Propensity score matching with historical data
- Utilize two data sources: (1) the group randomized to receive Fabrazyme in the double-blind trial, and (2) a historical control data set with information on historical patients with Fabry disease (primarily clinical data)
- Use propensity score matching to select historical control subjects who look like the treatment group at the time they started Fabrazyme
- Restrict historical subjects to those who would have met the inclusion/exclusion criteria for the trial
- Matching is done on variables selected by clinicians as being important for prognosis
- Compare long-term outcomes of the treatment group with their matched comparison subjects
- This worked well here in part because treatment for Fabry disease (and prognosis) had not otherwise changed much over time, and the measures were the same in the two groups
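A minimal sketch of that general workflow (estimate a propensity score, then match historical controls to trial-treated patients), using synthetic data and made-up covariate names; greedy 1:1 nearest-neighbor matching is assumed here for simplicity and is not necessarily the algorithm used in the Genzyme analysis:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-in for the combined data: 50 trial-treated patients (treated=1) and
# 150 historical registry patients (treated=0), already restricted to those who would
# have met the trial's inclusion/exclusion criteria. In the real analysis the
# covariates were chosen by clinicians as important for prognosis.
n = 200
data = pd.DataFrame({
    "treated": np.repeat([1, 0], [50, 150]),
    "age": rng.normal(40, 10, n),
    "baseline_severity": rng.normal(0, 1, n),
})
covariates = ["age", "baseline_severity"]

# 1. Estimate the propensity score (probability of being in the trial arm).
ps_model = LogisticRegression(max_iter=1000).fit(data[covariates], data["treated"])
data["pscore"] = ps_model.predict_proba(data[covariates])[:, 1]

# 2. Greedy 1:1 nearest-neighbor matching on the propensity score, without replacement.
treated = data[data["treated"] == 1]
controls = data[data["treated"] == 0].copy()
matches = []
for i, row in treated.iterrows():
    j = (controls["pscore"] - row["pscore"]).abs().idxmin()
    matches.append((i, j))
    controls = controls.drop(index=j)

print(f"Matched {len(matches)} treated patients to historical controls")
```

In practice one would also check covariate balance in the matched sample before comparing long-term outcomes.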

A related complication: defining baseline
- When longitudinal data are available on comparison subjects, it may not be clear what to use as their baseline
- Need some way of determining which time points correspond to baseline and which to outcomes
- Would like to identify the point in time at which they could have started treatment but didn't
- Sometimes calendar time can be used as an anchor
- In the Genzyme example, we selected the point in time at which each comparison subject looked the most like someone in the trial
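One way to operationalize "the point in time at which each comparison subject looked the most like someone in the trial" is sketched below, assuming a long-format data set with a fitted propensity score evaluated at each visit; the column names and values are hypothetical, not the actual Genzyme algorithm:

```python
import pandas as pd

# Toy long-format data: one row per comparison subject per visit, with a propensity
# score computed from that visit's covariate values.
visits = pd.DataFrame({
    "subject_id": [1, 1, 1, 2, 2, 2],
    "visit":      [0, 1, 2, 0, 1, 2],
    "pscore":     [0.20, 0.45, 0.30, 0.15, 0.25, 0.60],
})

# Pseudo-baseline: for each comparison subject, the visit at which they most
# resemble a trial participant (highest propensity score). Later visits then
# serve as the outcome period.
baseline = visits.loc[visits.groupby("subject_id")["pscore"].idxmax()]
print(baseline)
```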

Utilizing natural experiments
- Of course, sometimes we may be able to use a natural experiment: some randomization occurring by chance in nature
- e.g., use the Dutch famine of 1944-1945 to examine effects of malnutrition in utero on later mental performance (Rosenbaum, 2009)
- There is a big literature on this; I won't focus on it here

Rosenbaum (2005): some overall lessons in design. Try to have...
- Covariates and outcomes available for treated and control groups, ideally with temporal ordering
- Haphazard treatment assignment rather than self-selection (e.g., Maimonides' rule; car crashes where the driver was not at fault)
- Special populations offering reduced self-selection (e.g., violence rates among felons)
- Biases of known direction (e.g., a group known to be less disabled than the treated group)
- An abrupt start to intense treatments (e.g., car crashes)
- Additional structural features in quasi-experiments intended to provide information about hidden biases

Lessons
- Good selection of comparison groups requires creativity; don't just take the easy data
- Make sure there are comparable measures of the important confounders
- Think about finding comparison group data that helps control for unobserved variables
- Think about whether factors like geography and calendar time matter
- Make a story:
  - Compare multiple groups
  - Examine multiple outcomes, some that should and some that shouldn't be affected by the treatment ("zero checks"; see the sketch below)
  - Elaborate theories
  - Value transparency; think of a hostile critic
- Replication: in multiple settings, with slightly different ways of looking at the question
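As one concrete illustration of a "zero check", one might test the matched groups on a negative-control outcome the treatment should not plausibly affect; the sketch below uses simulated data and is not from the talk:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Simulated negative-control ("zero check") outcome measured in the matched
# treated and comparison groups: something the treatment cannot have caused.
treated_nc = rng.normal(0, 1, 100)
comparison_nc = rng.normal(0, 1, 100)

# A clear difference here would signal residual confounding or measurement
# differences rather than a treatment effect.
t_stat, p_value = stats.ttest_ind(treated_nc, comparison_nc)
print(f"Zero check: t = {t_stat:.2f}, p = {p_value:.3f}")
```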

References
- Imai, K., King, G., and Stuart, E.A. (2008). Misunderstandings between experimentalists and observationalists about causal inference. Journal of the Royal Statistical Society, Series A 171: 481-502.
- Rosenbaum, P.R. (2009). Design of Observational Studies. Springer Verlag, New York, NY.
- Shadish, W.R., Cook, T.D., and Campbell, D.T. (2002). Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Boston: Houghton Mifflin Company.
- Steiner, P.M., Cook, T.D., Shadish, W.R., and Clark, M.H. (2010). The importance of covariate selection in controlling for selection bias in observational studies. Psychological Methods 15(3): 250-267.
- Stuart, E.A. (2010). Matching methods for causal inference: A review and a look forward. Statistical Science 25(1): 1-21.
- Stuart, E.A. and Rubin, D.B. (2008). Matching with multiple control groups and adjusting for group differences. Journal of Educational and Behavioral Statistics 33(3): 279-306.
- Fabrazyme example: http://www.fda.gov/ohrms/dockets/ac/03/briefing/3917B1_01_Genzyme.pdf