Propensity Scores; Generalising Results from RCTs


Robbie Peck, University of Bath, June 5, 2016

The Idea
Randomised Controlled Trials (RCTs) are the gold standard for estimating the effect of treatments on healthy patients under controlled conditions. They are small scale and measure efficacy.
But will the treatment be effective in a particular target population? That is a question of effectiveness.
Clinical trial participants might not be representative of the target population.
This is the question of the generalisability of RCT results.

Definition: The propensity score of a patient is the probability of taking part in the RCT given the covariates of the patient.
It can be used to reweight the results of an RCT in order to estimate the effectiveness of a drug in a target population.

Problem Formulation
$\Omega$ = the target population of interest.
$X_i$ = the covariates of patient $i$, $i \in \Omega$.
$\Phi \subset \Omega$. An RCT is performed on $\Phi$.
$S_i$ = indicator variable for patient $i$ being in $\Phi$.
$T_i$ = indicator variable that patient $i$ is in the treatment group (as opposed to the control group) of the RCT.
$Y_i(1)$ and $Y_i(0)$ = the responses of patient $i$ under treatment and under control.

RCT average treatment effect:
$$\frac{1}{|\Phi|} \sum_{i : S_i = 1} \bigl( Y_i(1) - Y_i(0) \bigr).$$
Target population average treatment effect:
$$\frac{1}{|\Omega|} \sum_{i=1}^{|\Omega|} \bigl( Y_i(1) - Y_i(0) \bigr).$$
Assuming $\Phi$ is representative of the target population $\Omega$, these should be equal.
They rarely are, because trials have tightly defined entry criteria.

Three Assumptions
$P(S_i = 1 \mid X_i) > 0$ for any covariates $X_i$: all patients in the target population have some probability of being included in the RCT.
$S \perp [Y(0), Y(1)] \mid X$: there are no unmeasured confounders (related to both trial sample selection and treatment effect).
$T \perp [S, Y(0), Y(1)] \mid X$: treatment in the RCT is randomly assigned and independent of sample selection and responses.

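Estimating the propensity scores
Definition: The propensity score of patient $i$ is $p_i = P(S_i = 1 \mid X_i)$.
Given observed data $\{S_i\}_{i \in \Omega}$ and covariates $\{X_i\}_{i \in \Omega}$ (trial and non-trial patients are both needed to fit the model), estimate the propensity scores by logistic regression:
$$S_i \sim \mathrm{Bern}(p_i), \qquad \log\left(\frac{p_i}{1 - p_i}\right) = \beta_0 + \beta_1 X_{1,i} + \dots + \beta_k X_{k,i}.$$
Use these $\hat{p}_i$ to see whether $\Phi$ is representative of $\Omega$.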
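The logistic regression step can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names (`sigmoid`, `fit_logistic`, `propensity_scores`), the gradient-ascent fitting routine, its step size, and the toy data are all hypothetical and are not taken from the talk. In practice a standard statistics package would be used to fit the model.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, S, lr=0.5, n_iter=5000):
    """Fit log(p/(1-p)) = b0 + b1*x1 + ... + bk*xk by gradient ascent.

    X is a list of covariate rows and S the matching list of 0/1 trial
    membership indicators. Returns [b0, b1, ..., bk].
    ASSUMPTION: plain gradient ascent with a fixed step size, which is
    adequate for small, non-separable toy data only.
    """
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for x, s in zip(X, S):
            p = sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))
            grad[0] += s - p
            for j, v in enumerate(x):
                grad[j + 1] += (s - p) * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def propensity_scores(X, beta):
    """Estimated P(S_i = 1 | X_i) for every row of X."""
    return [sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], x))) for x in X]


# Hypothetical toy population: one covariate, 8 patients, 4 of them in the trial.
X_demo = [[0.0], [0.5], [1.0], [1.5], [2.0], [2.5], [3.0], [3.5]]
S_demo = [0, 0, 0, 1, 0, 1, 1, 1]
beta_demo = fit_logistic(X_demo, S_demo)
p_demo = propensity_scores(X_demo, beta_demo)
```

Note that the model is fitted on the whole population, trial members and non-members alike, since the outcome of the regression is trial membership itself.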

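Assessing Generalisability
Propensity score difference := the difference between the mean propensity scores of patients in the trial sample $\Phi$ and of the patients in the target population $\Omega$ who are not in the trial:
$$\Delta_p = \frac{1}{|\Phi|} \sum_{i : S_i = 1} \hat{p}_i - \frac{1}{|\Omega \setminus \Phi|} \sum_{i : S_i = 0} \hat{p}_i.$$
If the propensity score means differ by more than 0.25 standard deviations, then the clinical trial population $\Phi$ may not be representative of $\Omega$.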
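A minimal sketch of this diagnostic follows, assuming the estimated scores and membership indicators are held in two parallel lists. The function names are hypothetical. The slide does not say which standard deviation the 0.25 threshold refers to, so dividing by the standard deviation of all estimated scores is an assumption made here purely for illustration.

```python
import statistics


def propensity_score_difference(p_hat, S):
    """Mean estimated score of trial members minus that of non-members."""
    trial = [p for p, s in zip(p_hat, S) if s == 1]
    non_trial = [p for p, s in zip(p_hat, S) if s == 0]
    return statistics.mean(trial) - statistics.mean(non_trial)


def standardised_difference(p_hat, S):
    """Difference in standard-deviation units.

    ASSUMPTION: the scale is the standard deviation of all estimated
    scores; the talk does not specify which one is meant.
    """
    return propensity_score_difference(p_hat, S) / statistics.pstdev(p_hat)


# Hypothetical scores for two trial members and two non-members.
p_hat_demo = [0.8, 0.6, 0.2, 0.4]
S_demo = [1, 1, 0, 0]
diff_demo = propensity_score_difference(p_hat_demo, S_demo)
flag_demo = abs(standardised_difference(p_hat_demo, S_demo)) > 0.25
```

Here `flag_demo` being true would suggest that the trial sample may not be representative of the target population.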

To see if propensity score weighting works: weight the RCT control group so that its characteristics are like those of the target population, and then compare the responses.
Use propensity score "matching":
Inverse Probability of Treatment Weighting (IPTW): each individual is given a weight $1 / \hat{p}_i(X_i)$.
Under our three assumptions, the responses of the RCT control group under this weighting are an unbiased estimate of those in the target population.
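The weighting of the control group might be sketched as below. The names `iptw_weights` and `weighted_mean` and the numbers are hypothetical. Normalising by the sum of the weights is a design choice made for this sketch, not something stated in the talk.

```python
def iptw_weights(p_hat):
    """Weight 1 / p_hat for each trial member; scores must lie in (0, 1]."""
    if any(p <= 0.0 or p > 1.0 for p in p_hat):
        raise ValueError("propensity scores must lie in (0, 1]")
    return [1.0 / p for p in p_hat]


def weighted_mean(y, w):
    """Weighted mean response, normalised by the total weight (an assumption)."""
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)


# Hypothetical control-group responses and estimated propensity scores.
y_control = [10.0, 20.0]
p_control = [0.5, 0.25]
reweighted_control_mean = weighted_mean(y_control, iptw_weights(p_control))
```

The reweighted control mean can then be compared with the mean response observed in the target population, which receives no treatment in either case.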

IPTW can be unstable: propensity scores close to zero give very large weights.
Subclassification and Full Matching use coarser weights by grouping individuals with a similar propensity score.
R package: Matching.
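One way subclassification could look is sketched below. It is an illustration only, not the behaviour of the R package named above: the function name, the use of equal-sized subclasses ordered by estimated score, and the toy data are all assumptions. Each subclass contributes the mean response of its trial members, weighted by the share of the whole population that falls in that subclass.

```python
def subclass_weighted_mean(p_hat, S, y, n_classes=5):
    """Estimate the population mean response from trial members only.

    p_hat, S and y are indexed over the whole target population; y[i] is
    only read when S[i] == 1, so it may be None for non-trial members.
    ASSUMPTION: subclasses are equal-sized groups ordered by p_hat.
    """
    n = len(p_hat)
    order = sorted(range(n), key=lambda i: p_hat[i])
    total = 0.0
    for c in range(n_classes):
        members = order[c * n // n_classes:(c + 1) * n // n_classes]
        trial_y = [y[i] for i in members if S[i] == 1]
        if members and not trial_y:
            # No trial member can stand in for this part of the population.
            raise ValueError("subclass %d has no trial members" % c)
        if members:
            total += len(members) * sum(trial_y) / len(trial_y)
    return total / n


# Hypothetical population of six, three of whom are in the trial.
p_hat_demo = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
S_demo = [0, 1, 0, 1, 0, 1]
y_demo = [None, 1.0, None, 2.0, None, 3.0]
estimate_demo = subclass_weighted_mean(p_hat_demo, S_demo, y_demo, n_classes=3)
```

Because every individual in a subclass shares one weight, a single very small propensity score cannot dominate the estimate in the way it can under IPTW.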

Estimating the Treatment Effect in the Target Population
Assuming we are happy with the check on the control group, use these propensity score "matching" methods on the treatment group instead.
As before, weight the patients in the treatment group of the RCT.
Compare the treatment effects in this weighted population and the target population.
Under the three assumptions, this gives an unbiased estimate of the treatment effect in the target population.
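Putting the pieces together, a weighted treatment effect estimate could be sketched as follows. This assumes inverse propensity weights are applied to both arms of the RCT and that each arm's mean is normalised by its total weight. The function name and the data are hypothetical and the talk does not give an explicit estimator.

```python
def generalised_treatment_effect(y, T, p_hat):
    """Weighted treated mean minus weighted control mean over trial members.

    y: observed responses, T: 0/1 treatment indicators, p_hat: estimated
    probabilities of trial participation, all for trial members only.
    ASSUMPTION: weights 1 / p_hat, normalised within each arm.
    """
    def arm_mean(arm):
        pairs = [(yi, 1.0 / pi) for yi, ti, pi in zip(y, T, p_hat) if ti == arm]
        if not pairs:
            raise ValueError("no trial members in arm %d" % arm)
        return sum(yi * wi for yi, wi in pairs) / sum(wi for _, wi in pairs)

    return arm_mean(1) - arm_mean(0)


# Hypothetical trial of four patients, two per arm.
y_demo = [5.0, 7.0, 1.0, 3.0]
T_demo = [1, 1, 0, 0]
p_hat_demo = [0.5, 0.25, 0.5, 0.25]
effect_demo = generalised_treatment_effect(y_demo, T_demo, p_hat_demo)
```

Patients with a low probability of being in the trial count for more, since each of them stands in for a larger part of the target population.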

Application: PBIS Study in Maryland USA
PBIS is a school prevention program which aims to "improve the school climate through better systems and procedures".
37 Maryland schools took part in an RCT to investigate the effect of PBIS in 2002-2003. The 37 schools were randomised into control and treatment groups.
Define the target population as Maryland schools.
We take characteristics of the students and schools, measured in 2002, as covariates.

Main point: the characteristics/covariates of the schools in the RCT are different from those in the target population.
Schools in the trial had lower test scores and higher rates of free school meals than those in the target population.
So propensity scores were estimated through logistic regression.

Figure: Propensity score density of schools in the target population, and propensity scores of the 37 schools in the RCT (vertical lines).
Mean propensity score difference $\Delta_p = 0.055$. So $\Phi$ is not representative of $\Omega$.

Figure: Observed and predicted Maths outcomes for schools in Maryland. Thick line: target population scores. Dashed line: control group RCT scores. Thin line: propensity score weighted control group RCT scores.

"Despite the differences seen between the trial and non-trial schools, it appears that the control schools in the trial reflect what was happening across the state as a whole, when weighted up to represent the population" - Stuart et al. (2010) [2]
Therefore, if we weight the RCT treatment group schools in the same way, we can obtain an unbiased assessment of the effectiveness of PBIS in the target population of Maryland schools.

Link to ITT4: Trials Within Cohorts (TwiCs) [1]
Idea: Recruit a large observational cohort with a condition. Regularly measure responses. Perform repeated RCTs over time on subsets of the cohort.
In each RCT: randomly select some patients, who are offered the treatment. Compare their responses with those of the remaining patients.
Tackles problems with pragmatic trial designs (e.g. recruitment).
May require two levels of propensity scoring methodology.

Bibliography
[1] Relton C, Torgerson D, O'Cathain A, Nicholl J. Rethinking pragmatic randomised controlled trials: introducing the cohort multiple randomised controlled trial design. BMJ, 340:c1066, 2010.
[2] Stuart E, Cole S, Bradshaw C, Leaf P. The use of propensity scores to assess the generalizability of results from randomized trials. J R Stat Soc Ser A Stat Soc, 2010.