Matched Cohort designs.

Similar documents
TRIPLL Webinar: Propensity score methods in chronic pain research

Introduction to Observational Studies. Jane Pinelis

Propensity Score Methods to Adjust for Bias in Observational Data SAS HEALTH USERS GROUP APRIL 6, 2018

Improved control for confounding using propensity scores and instrumental variables?

Propensity Score Methods for Causal Inference with the PSMATCH Procedure

State-of-the-art Strategies for Addressing Selection Bias When Comparing Two or More Treatment Groups. Beth Ann Griffin Daniel McCaffrey

Finland and Sweden and UK GP-HOSP datasets

BIOSTATISTICAL METHODS

Assessing the impact of unmeasured confounding: confounding functions for causal inference

Statin therapy in patients with Mild to Moderate Coronary Stenosis by 64-slice Multidetector Coronary Computed Tomography

Instrumental Variables I (cont.)

PubH 7405: REGRESSION ANALYSIS. Propensity Score

Supplementary Appendix

Rise of the Machines

Sensitivity Analysis in Observational Research: Introducing the E-value

Biases in clinical research. Seungho Ryu, MD, PhD Kanguk Samsung Hospital, Sungkyunkwan University

Selected Topics in Biostatistics Seminar Series. Missing Data. Sponsored by: Center For Clinical Investigation and Cleveland CTSC

ESM1 for Glucose, blood pressure and cholesterol levels and their relationships to clinical outcomes in type 2 diabetes: a retrospective cohort study

Race Original cohort Clean cohort HR 95%CI P HR 95%CI P. <8.5 White Black

Propensity Score Methods for Estimating Causality in the Absence of Random Assignment: Applications for Child Care Policy Research

Supplement materials:

OHDSI Tutorial: Design and implementation of a comparative cohort study in observational healthcare data

Supplementary Appendix

A Practical Guide to Getting Started with Propensity Scores

CHL 5225 H Advanced Statistical Methods for Clinical Trials. CHL 5225 H The Language of Clinical Trials

Optimal full matching for survival outcomes: a method that merits more widespread use

Supplementary Online Content

CAN EFFECTIVENESS BE MEASURED OUTSIDE A CLINICAL TRIAL?

Matching methods for causal inference: A review and a look forward

Biases in clinical research. Seungho Ryu, MD, PhD Kanguk Samsung Hospital, Sungkyunkwan University

Online Supplementary Material

Identifying Peer Influence Effects in Observational Social Network Data: An Evaluation of Propensity Score Methods

MEDCODE READCODE READTERM

CENTRE FOR STATISTICAL METHODOLOGY Celebrating the first International Statistics Prize Winner, Prof Sir David Cox

RWD after RCT. evaluating drugs in daily clinical practice. Stefan Franzén PhD. Helsinki

Analysis of TB prevalence surveys

INTERNAL VALIDITY, BIAS AND CONFOUNDING

Preventive Cardiology Scientific evidence

Rates and patterns of participation in cardiac rehabilitation in Victoria

LEADER Liraglutide and cardiovascular outcomes in type 2 diabetes

Implications of The LookAHEAD Trial: Is Weight Loss Beneficial for Patients with Diabetes?

Recent advances in non-experimental comparison group designs

We define a simple difference-in-differences (DD) estimator for. the treatment effect of Hospital Compare (HC) from the

Observational & Quasi-experimental Research Methods

115 remained abstinent. 140 remained abstinent. Relapsed Remained abstinent Total

Matt Laidler, MPH, MA Acute and Communicable Disease Program Oregon Health Authority. SOSUG, April 17, 2014

Should we prescribe aspirin and statins to all subjects over 65? (Or even all over 55?) Terje R.Pedersen Oslo University Hospital Oslo, Norway

Observational comparative effectiveness research using large databases

Estimands, Missing Data and Sensitivity Analysis: some overview remarks. Roderick Little

Is it ever too late for cardiovascular prevention and rehabilitation? Prof. Dr. Helmut Gohlke Herz-Zentrum Bad Krozingen, Germany

Dylan Small Department of Statistics, Wharton School, University of Pennsylvania. Based on joint work with Paul Rosenbaum

Manitoba Centre for Health Policy. Inverse Propensity Score Weights or IPTWs

Using Propensity Score Matching in Clinical Investigations: A Discussion and Illustration

Joseph W Hogan Brown University & AMPATH February 16, 2010

Supplementary Online Content

INTRODUCTION TO SURVIVAL CURVES

Supplementary Table S1: Proportion of missing values presents in the original dataset

An Introduction to Propensity Score Methods for Reducing the Effects of Confounding in Observational Studies

What have We Learned in Dyslipidemia Management Since the Publication of the 2013 ACC/AHA Guideline?

Introducing a SAS macro for doubly robust estimation

Propensity score methods to adjust for confounding in assessing treatment effects: bias and precision

A COMPARISON OF IMPUTATION METHODS FOR MISSING DATA IN A MULTI-CENTER RANDOMIZED CLINICAL TRIAL: THE IMPACT STUDY

Outline. The why, when, and how of propensity score methods for estimating causal effects. Course description. Outline.

Epidemiologic study designs

Long-Term Use of Long-Acting Insulin Analogs and Breast Cancer Incidence in Women with Type 2 Diabetes.

Proton Pump Inhibitors increase Cardiovascular risk in patient taking Clopidogrel

1. How does the response to therapy compare in elderly versus middle age adults with diabetes in a randomized trial?

John J.P. Kastelein MD PhD Professor of Medicine Dept. of Vascular Medicine Academic Medial Center / University of Amsterdam

EPI 200C Final, June 4 th, 2009 This exam includes 24 questions.

Time-varying confounding and marginal structural model

Confounding by indication developments in matching, and instrumental variable methods. Richard Grieve London School of Hygiene and Tropical Medicine

Propensity score analysis with the latest SAS/STAT procedures PSMATCH and CAUSALTRT

By: Mei-Jie Zhang, Ph.D.

Beyond Controlling for Confounding: Design Strategies to Avoid Selection Bias and Improve Efficiency in Observational Studies

Confounding and Bias

Is it worth offering cardiovascular disease prevention to the elderly? Prof. Dr. Helmut Gohlke Herz-Zentrum Bad Krozingen, Germany

Reflection Questions for Math 58B

PharmaSUG Paper HA-04 Two Roads Diverged in a Narrow Dataset...When Coarsened Exact Matching is More Appropriate than Propensity Score Matching

ECON Microeconomics III

Immortal Time Bias and Issues in Critical Care. Dave Glidden May 24, 2007

Coronary Artery Bypass Grafting in Diabetics: All Arterial or Hybrid?

Contents. Part 1 Introduction. Part 2 Cross-Sectional Selection Bias Adjustment

The role of Randomized Controlled Trials

SUPPLEMENTAL MATERIAL. Supplemental Methods. Duke CAD Index

Peter C. Austin Institute for Clinical Evaluative Sciences and University of Toronto

Methodology for Non-Randomized Clinical Trials: Propensity Score Analysis Dan Conroy, Ph.D., inventiv Health, Burlington, MA

Analysis methods for improved external validity

Experimental Design. Terminology. Chusak Okascharoen, MD, PhD September 19 th, Experimental study Clinical trial Randomized controlled trial

8/10/2012. Education level and diabetes risk: The EPIC-InterAct study AIM. Background. Case-cohort design. Int J Epidemiol 2012 (in press)

CVD risk assessment using risk scores in primary and secondary prevention

Regulatory Experience in Reviewing CV Safety for Diabetes

Methods to control for confounding - Introduction & Overview - Nicolle M Gatto 18 February 2015

1. Albuminuria an early sign of glomerular damage and renal disease. albuminuria

Osler Journal Club Outcomes Research

Supplemental Table S2: Subgroup analysis for IL-6 with BMI in 3 groups

Supplementary material

Patient characteristics Intervention Comparison Length of followup

Estimating Direct Effects of New HIV Prevention Methods. Focus: the MIRA Trial

The Latest Generation of Clinical

Supplement 2. Use of Directed Acyclic Graphs (DAGs)

Transcription:

Matched Cohort designs. Stefan Franzén PhD Lund 2016 10 13

Registercentrum Västra Götaland Purpose: improved health care 25+ 30+ 70+ Registries Employed Papers Statistics IT Project management Registry development Communication

Agenda Why match? The estimand Selecting controls Matching options Checking balance Post matching analysis

Why match 1. Reduce confounding 2. Set an index date for the controls in longitudinal data 3. Reduce variability and model dependence

The estimand ATE The average treatment effect for everyone ATT The average treatment effect for the treated

The estimand ATT is weighted by the treated population (red) ATE is weighted by the total population (red and green combined ) An RCT estimated ATE. Why?

The estimand, example Compare mortality after PCI and CABG post MI ATE: ATT PCI What is the effect of treating everyone with PCI vs everyone with CABG? What is the effect of treating the PCI patients with CABG? ATT CABG What is the effect of treating the CABG patients with PCI?

Selecting controls What is the exposure? What is the alternative to exposure? One specific alternative exposure e.g. PCI or CABG Unexposed standard care e.g. not yet exposed to a particular drug What would have happend if the patient was not exposed? Potential outcomes:

Selecting controls counterfactuals What would have happend if the patient was not exposed? :observed :missing Wecanonlyobserveoneofthe potential outcomes The potential outcome under control remains missing for a treated patient Matching is away to find that missing outcome

Selecting controls, example 1 PCI or CABG 1: Determine the estimand: ATT CABG 2: For each CABG patient, find a matching PCI patient CABG is only an option for 3 or 4 vessel disease and these patients may have been treaed with PCI before for 1 or 2 vessel disease 3: A potential control is a patien treated with PCI for 3 or 4 vessel disease

Selecting controls, example 2 Insulin pump or injections For each exposed, select without replacement the closest unexposed patient from the risk set, who are alive at and who have the latest registration no earlier than days Calendar time Registration in the NDR Index date for pump Death Selected control The selected control is subsequently followed from with respect to outcomes and may be censored if exposed to pump at a later date

Matching options Attribute matching Mahalanobis distance Propensity score Disease risk score Combinations

Attribute matching Create matched (pairs) such that i.e. both individuals have the same values on all matched variables. Creates exact balance Curse of dimentionality Continuous variables has to be discretized

Mahalanobis distance matching Idea: Create a one dimensional summary measure Σ Find patients such that implying Works best for small number of continuous variables Leon & Carriére 2005 Considers all interaction terms equallyimportant Claimed to be less sensitive to model dependence (in the analysis) than propensity score. (King & Nielsen 2016)

Propensity score matching Definition: 1 where 0,1 indicates treatment Point: 0, 1 0, 1 If the treatment isn t confounded by then it isn t confounded by. But there may be variables in that are weekly related to and these will not be so similar within a matched pair i.e. given.

Disease risk score matching For a new treatment we may not have enough treated patients to reliably estimate a propensity score Idea: use a disease risk score as a balancing score If Ψ then Ψ is a prognostic score 1. Estimate a model for the outcome based on the non exposed 2. Create predictions (scores) for all patients 3. Create a matched analysis data using the scores Hansen 2007

Combinations of matching measures Attribute matching on a few important variables Matching on summary measure on the rest Handle remaining confounding by regression modelling Mathing on propensity and disease risk score combined 1. Estimate propensity score 2. Estimate disease risk score 3. Match on Mahalanobis distance of both

Ways to match (and alternatives) Method Effect Match 1 1 ATT Reduced sample size. Potential loss of treated patients Match 1 n ATT Reduced sample size. Potential loss of treated patients Match m n ATT Reduced sample size. Potential loss of treated patients Full matching ATE Complicated matching process IPT Weighting ATE or ATT Some observations may carry a very large weight Stratify ATE (ATT) Simple watch out for very unbalanced strata Regression ATE May hide an extrapolation Assumes a specific functional from Uses the outcome in the model May not be possible to include all potential confounders

Greedy 1-k matching 1. Select a treated patient at random 2. Select the closest k ( often k=1) untreated patient Aim: to estimate ATT Number of controls: 2 is better than 1 (law of diminishing return) Calipers: Reject matches where Helps to improve balance Rubin & Thomas: δ 0.25 or 0.5 Prunes the data by dropping treated and no longer estimates ATT Replacements: Match with or without replacement Replacements may improve the balance but the observations are no longer independent which mess up the analysis

Why greedy 1-k matching estimates ATT Starting with the treated preserves the distribution for the treated but only if we don t loose and treated

Alternatives to greedy 1-k matching Greedy m k matching Simlilar to 1 k but takes m treated at time. Estimates ATT Full matching Find small sets of exposed and controls minimizing some global distance measure The number of exposed and controls in each set varies Computer intensive iterative process May mess up the analysis Estimates ATE

Creating balance, an iterative process Estimate the propensity score Match or apply IPT weights Not OK Check the balance OK Run the analysis Since the propensity score model doesn t contain the outcome you can fiddle around with the model until you are satisfies with the balance you get

Evaluating the balance Unfortunately it is (still) common practice to compare groups at index using a hypothesis test

Evaluating the balance Balance between groups is a property of the data and not of the underlying true population averages

The p value and the sample size The p value is heavily affected by the number of observations Using the p value as balance metric we can create a balance between groups by randomly deleting observations

The standardized difference The standardized difference is not influenced by the number of observations and is a much better balance metric Some authors have suggested that 10% or 20% indicates sufficient balance, but this is as arbitrary as the famous 5%

Other balance metrix The prognostic risk score Hansen 2007: If Ψ then Ψ is a prognostic score Stuart, Lee & Leacy 2013: Model E based on unexposed, eg glm Predict outcomes for all patients Evaluate the balance based on the predictions

What if I can t match? Failing to find matching controls for all treated is a property of the data, not of the method

Post matching analysis Convensional wisdom: Analyze as designed Attribute matching: Include matching variables in the analysis Propensity scores: Greedy 1 k: doesn t influence the analysis Full matching n m: Clusters has to be accounted for in the analysis

Post matching analysis Fit the analysis model just as if matching hadn t happaned ANOVA: keep variabiliy from the error term Glm inc Cox: Avoid hidden covaraite bias Combinations of proc scores and regression is not doubly robust Doubly robust estimator i i i i i i i i i i i i i i i i dr e X m e T T Y n e X m e T YT n ˆ 1 ˆ, ˆ 1 1 ˆ ˆ, ˆ 1 ˆ 0 1

Example 1: Gastric Bypass Analysis strategy: match as closely as the data permits and fit a Cox regression to the matched data accounting for remaining confounding 6177 GBP patients are matched 1 1 to not yet treated patients from NDR selected from 440824 patients with 4884442 records Matched on: Year: 2008 2009, 2010 2011, 2011 2012, 2013 2014 BMI: [28 35),[35, 39), [39, 44), [44, ) Age: [18,42), [42, 49)[49,56), [56,, ) Sex: Male, Female 128 strata Matched controls are selected from the patients at risk in the time interval Matched controls are removed from the risk sets in following time intervals (no replacement) Matched controls are censored when (if) treated (n=733 control patients are censored this way) If mutiple registrations are availabe in for a selected control, one is selected randomly as index

Example 1: Gastric Bypass Table 1. Descriptive statistics (mean(sd) or count(%)) of soreg and matched NDR control patients after matching Variable Gastric bypass (n=6132) Matched control (n=6132) p-value standardized difference Sex 2364 (38.6%) 2364 (38.6%) 1 0 Age (years) 48.5 (9.8) 50.5 (12.7) 8.212E-24 0.1835435 BMI 42.0 (5.7) 41.4 (5.7) 9.039E-10 0.1107137 HbA1c(%) 7.6 (1.6) 7.6 (1.5) 0.5767475 0.0100873 LDL 2.8 (1.0) 2.8 (0.9) 0.0706582 0.0326673 HDL 1.1 (0.5) 1.2 (0.3) 2.7347E-8 0.1019714 SBP 140.0 (17.2) 133.8 (15.6) 5.48E-93 0.3730043 DBP 83.3 (10.2) 79.8 (10.2) 3.571E-80 0.3449707 Previous MI 231 (3.8%) 261 (4.3%) 0.1674394 0.1270678 Previous CHF 172 (2.8%) 254 (4.1%) 0.0000526 0.3993627 Previous stroke 131 (2.1%) 179 (2.9%) 0.0057565 0.3179059 Smoking 540 (8.8%) 1049 (17.1%) 1.784E-11 0.7474336 Type 2 diabetes 4968 (94.8%) 5524 (91.5%) 5.193E-11 0.5208766 BP medication 3644 (59.4%) 4083 (66.6%) 2.195E-16 0.3088453 Lipid lowering 2142 (34.9%) 3076 (50.2%) 3.025E-65 0.6382504 medication Diabetes medication 4975 (81.1%) 5110 (83.3%) 0.0014266 0.1508076 Duration of diabetes 7.6 (7.1) 7.7 (7.8) 0.4546706 0.0149411 Married 2914 (47.5%) 2548 (41.6%) 2.937E-11 0.2425066 Yearly income (Ksek) 202.5 (126.7) 183.5 (124.4) 5.359E-17 0.1515408 Education (Low) 1253 (20.4%) 1776 (29.0%) 6.579E-28 0.4631218 Education (Mid) 3663 (59.7%) 3256 (53.1%) 1.247E-13 0.2711535 Education (High) 1216 (19.8%) 1100 (17.9%) 0.0074437 0.1235661

Example 1: Gastric Bypass Cox proportional hazards regression for mortality, cardiovascular death and myocardial infarction for patients treated with gastric bypass Endpoint Hazard ratio 95% Confidence interval p value Mortality 0.46 [0.35, 0.62] <.0001 CV mortality 0.38 [0.18, 0.79] 0.0097 Myocardial infarction 0.49 [0.29, 0.84] 0.0094 The analysis is based on a Cox proportional hazards regression including the treatment and age, sex, BMI, hba1c, ldl, hdl, SBP, DBP, smoking, previous MI, previous CHF, previous stroke, bloodpressure med, lipid med, diabetes med, income, education and marrietal status. The analysis data was matched on age, sex, BMI and calendar time period

Example Statins and type 1 diabetics 0.00 0.25 0.50 0.75 1.00 Statin treatment Propensity score Treatment No Yes DISTRIBUTION OF PROPENSITY SCORES (A) Overall Cohort 0 2 4 6 0.00 0.25 0.50 0.75 1.00 Propensity score Density Treatment No statin Statin DENSITY PLOT OF PROPENSITY SCORES (C) Overall Cohort 0.00 0.25 0.50 0.75 Statin treatment Propensity score Treatment No Yes (B) Matched cohort 0.0 0.5 1.0 1.5 0.00 0.25 0.50 0.75 1.00 Propensity score Density Treatment No statin Statin (D) Matched cohort Propensity scores Greedy 1 1 matching 4025 matched treated 1362 excluded treated What are we estimating? Superb balance! endoint HR 95% CI P value Total death 0.74 [0.62, 0.88] 0.001 CV death 0.83 [0.66, 1.03] 0.086 Hero et.al Diabetes Care 2016

Indeed what are we estimating Aimed to estimate ATT Density 3 2 Excluded from matched cohort Included in matched cohort Overall (n=24,230) Not all statin patients included Doesn t quite estimate ATT 1 0 0.00 0.25 0.50 0.75 1.00 Propensity score The use of statins has become so regulated that we can no longer properly evaluate it. OK if regulations are based on evidence

Softwares R packages: MatchIt, twang SAS: code and macros arround but no official proc Stata: psmatch2, pscore, and more Propensity score: Easy to do with standard components Matching: greedy 1 1 easy but beyond that it gets harder Software http://www.biostat.jhsph.edu/~estuart/propensityscoresoftware.html

Summary 1. Use the right estimand: ATT or ATE 2. Balance is everything 3. Try to avoid p values 4. Usually ok to analyze as usual after matching

Backups

Alternatives to matching Regression adjustment Modell: Y ij x i ij ij Y=μ B +βx μ B Y=μ A +βx μ A Stratification x=covariate 1 2 3 4 5 No data on A d=0.58 d=0.52 d=0.19 No data on B x=covariate

Causation We are interested in investigating the extent β of the causal effect of exposure E on the outcome Y β E Causal effect Y

Confounding A confounder is a factor C distorts the relationships between exposure E and outcome Y. If C is not observed we are in trouble. E β Causal effect Y C

Confounding In reality the relation between the exposure and the outcome is often very complex and partially unobservable U Unobserved confounder exposure E Reverse causality Causal effect outcome Y M mediator V instrument confounder C U risk factor

What to adjust for Confounders Directed acyclic graph Mediators* Risk factors Instruments** Not ok in propensity score models *) Consider Causal Mediation analysis **) Leads to instrumental variable methods that can handle unobserved confounders, but good instruments are VERY hard to find

Propensity scores with time dependent variables Fit a Cox regression model with time updated covariates: exp For each exposed, select without replacement the closest unexposed patient from the risk set, who are alive at and who have the latest registration no earlier than days before using as a distance measure Calendar time Registration in the NDR Index date for pump Death Selected control The data from the index date for a pump user is regarded as being prior to treatment and the actual index date for the exposure is set to the date after this registration The selected control is subsequently followed from with respect to outcomes and may be censored if exposed to pump at a later date

Propensity scores with time dependent variables Propensity score: constant over time often modelled using logistic regression In the longitudinal world, consider time to exposure Consider the probability of being exposed at time given non exposed up to time,, or even λ, lim Why not use Coxregression: λ, λ? Match on

Matching on attributes vs matching on PS Exact attribute matching: Virtually eliminates model dependence Curse of dimentionality Continuous variables has to be discretized Mahanobis distance matching: Reduces model dependence Works best for continuous variables Propensity score matching: 0, 1 Only gives balance on average, not between matched patients Model dependence

Headline here