Joseph W Hogan Brown University & AMPATH February 16, 2010


Examples
  Drinking and lung cancer
  Gender bias and graduate admissions
  AMPATH nutrition study

Methods
  Stratification and regression
    Drinking and lung cancer
    Graduate admissions
  Matching
    Related to stratification
    Nutrition study
  Weighting

Lung cancer?
                   Yes     No
  Heavy drinker     33   1667   (1.9%)
  Non-drinker       27   2273   (1.2%)

  Odds ratio = 1.67

Stratified by smoking status (CA = lung cancer, HD = heavy drinker, ND = non-drinker)

Smokers
         CA   No CA
  HD     24     776   (3%)
  ND      6     194   (3%)
  OR = 1.0

Non-smokers
         CA   No CA
  HD      9     891   (1%)
  ND     21    2079   (1%)
  OR = 1.0

Of the 1000 smokers:
  800 are heavy drinkers (80%)
  30 develop lung cancer (3%)

Of the 3000 non-smokers:
  900 are heavy drinkers (30%)
  30 develop lung cancer (1%)

Source: Rosner, Fundamentals of Biostatistics, Duxbury Press, 1995

Method 1: Mantel-Haenszel odds ratio
  Stratify the analysis on the confounding variable
  Take a weighted average of the stratum-specific odds ratios
  Here, the weighted average is 1.0
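The Mantel-Haenszel calculation can be checked by hand from the two stratified tables. The sketch below is only an illustration: the function and variable names are made up for this example, and it uses the standard Mantel-Haenszel estimator (sum of a*d/n over strata, divided by the sum of b*c/n), which the slides do not spell out.

```python
# Each 2x2 table is (a, b, c, d), taken from the slides:
#   a = heavy drinkers with cancer,  b = heavy drinkers without cancer
#   c = non-drinkers with cancer,    d = non-drinkers without cancer
strata = {
    "smokers": (24, 776, 6, 194),
    "non-smokers": (9, 891, 21, 2079),
}


def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for a single 2x2 table."""
    return (a * d) / (b * c)


def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled odds ratio across strata."""
    tables = list(tables)
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


# Collapsing over smoking status reproduces the crude table.
pooled = tuple(sum(cells) for cells in zip(*strata.values()))
crude_or = odds_ratio(*pooled)
mh_or = mantel_haenszel_or(strata.values())

print(f"crude table: {pooled}")
print(f"crude OR = {crude_or:.2f}")
print(f"Mantel-Haenszel OR = {mh_or:.2f}")
```

The crude odds ratio is about 1.67, while the stratified estimate is 1.0, which is the pattern expected when smoking confounds the drinking and lung cancer association.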

Method 2: Logistic regression
  Dependent variable = lung CA
  Independent variable = drinking (yes/no)
  Confounder variable = smoking (yes/no)
  Having the confounder in the model performs the adjustment
  In large samples, equivalent to the M-H odds ratio

(1)  Variable   Coef.   s.e.   O.R.
     Drinker    0.51    0.26   1.67

(2)  Variable   Coef.   s.e.   O.R.
     Drinker    0.00    0.30   1.00
     Smoker     1.12    0.30   3.06

Coefficients in logistic regression are log odds ratios.
Adding a yes/no variable as a predictor stratifies the analysis.
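The link between coefficients and odds ratios can be verified directly from the crude 2x2 table. This is a sketch under one assumption not stated in the slides: the standard error is computed with the usual large-sample formula for a log odds ratio, sqrt(1/a + 1/b + 1/c + 1/d), which is what a logistic model with a single yes/no predictor gives. Variable names are illustrative.

```python
import math

# Crude table from the slides: drinkers (33 cancer, 1667 no cancer),
# non-drinkers (27 cancer, 2273 no cancer).
a, b, c, d = 33, 1667, 27, 2273

# Log odds ratio: the "Drinker" coefficient in model (1).
log_or = math.log((a * d) / (b * c))

# Large-sample standard error of the log odds ratio (an assumption of this
# sketch; the slides report only the fitted value).
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

print(f"coefficient = {log_or:.2f}, s.e. = {se_log_or:.2f}")

# Exponentiating a coefficient returns the odds ratio.
for name, coef in [("Drinker, model (1)", 0.51),
                   ("Drinker, model (2)", 0.00),
                   ("Smoker, model (2)", 1.12)]:
    print(f"{name}: exp({coef:.2f}) = {math.exp(coef):.2f}")
```

Because the reported coefficients are rounded to two decimals, exponentiating them reproduces the tabulated odds ratios only to about two decimals.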

Observational study of sex bias in graduate admissions (1973)
  Admission rates:
    Women: 4,321 applied, 35% admitted
    Men: 8,442 applied, 44% admitted
  A clear case of discrimination?

                Men                    Women
  Major   Applied  % Admitted    Applied  % Admitted
  A         825       62           108       82
  B         560       63            25       68
  C         325       37           593       34
  D         417       33           375       35
  E         191       28           393       24
  F         373        6           341        7

The first two majors are the easiest to get into (over 50% of men applied to these).
The rest are harder (over 90% of women applied to these).
This time, the major (department) is the confounder.

As with odds ratios, stratify and average
  Take a weighted average of sex-specific admission rates across majors
  The weight is the total number of applicants to the department

Unweighted (aggregated) rates
  Men = 44%
  Women = 35%

Weighted (within-department) rates
  Men = 39%
  Women = 43%

Within department, women have better admission rates.
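The weighted rates can be reproduced from the six-major table. The sketch below follows the rule stated in the slides (weight = total applicants to the department); the data layout and function name are illustrative choices for this example.

```python
# (major, men applied, men % admitted, women applied, women % admitted),
# copied from the admissions table.
admissions = [
    ("A", 825, 62, 108, 82),
    ("B", 560, 63, 25, 68),
    ("C", 325, 37, 593, 34),
    ("D", 417, 33, 375, 35),
    ("E", 191, 28, 393, 24),
    ("F", 373, 6, 341, 7),
]


def weighted_rate(rows, rate_column):
    """Average a sex-specific admission rate across majors,
    weighting each major by its total number of applicants."""
    total_weight = sum(row[1] + row[3] for row in rows)
    weighted_sum = sum((row[1] + row[3]) * row[rate_column] for row in rows)
    return weighted_sum / total_weight


men_rate = weighted_rate(admissions, rate_column=2)
women_rate = weighted_rate(admissions, rate_column=4)

print(f"men:   {men_rate:.1f}%")
print(f"women: {women_rate:.1f}%")
```

Using the same weights for both sexes is what removes the effect of men and women applying to different mixes of departments.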

Study objectives
  Assess the effect of food assistance for those initiating cART
    Outcomes: weight, clinic adherence, mortality
  Have data on those in the food program
  Need a comparable control group to assess program effectiveness

Food program
  1864 identified on food assistance
    74% female
    Mean age 37 yrs
    Mean wt 52 kg
  How to identify a control group?
  Cannot just get randomly selected controls

Idea: match each treated person to one or more untreated controls
  Want to match on one or more characteristics
  Result: those characteristics are controlled
  Analysis is similar to stratification methods (though technically more complicated)

Simple example: match these lists on age
  Group 1 (Treated):   20 22 24 24 32 50 58 60
  Group 2 (Controls):  18 20 26 27 40 41 60 61

  Find the closest possible match for each treated person
  Match within a 2-year window

The simple example illustrates that algorithms are needed to get optimal matching
  Example: minimize the total difference in age over all possible matched sets
  Can specify thresholds for matching
  Can match on more than one variable
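To make the idea concrete, here is a minimal sketch using the two age lists from the simple example. It compares a greedy nearest-neighbour pass with an exhaustive search for the one-to-one matching that minimizes the total age difference. Both routines and all names are illustrative assumptions, not the algorithm used in the nutrition study; brute force is only feasible because the lists have eight entries each.

```python
from itertools import permutations

treated = [20, 22, 24, 24, 32, 50, 58, 60]
controls = [18, 20, 26, 27, 40, 41, 60, 61]


def total_gap(pairs):
    """Sum of absolute age differences over matched pairs."""
    return sum(abs(t - c) for t, c in pairs)


def greedy_match(treated, controls):
    """Give each treated person, in list order, the closest unused control."""
    remaining = list(controls)
    pairs = []
    for t in treated:
        closest = min(remaining, key=lambda c: abs(c - t))
        remaining.remove(closest)
        pairs.append((t, closest))
    return pairs


def optimal_match(treated, controls):
    """Try every one-to-one assignment and keep the smallest total gap."""
    best = min(permutations(controls),
               key=lambda order: total_gap(zip(treated, order)))
    return list(zip(treated, best))


def within_window(pairs, window):
    """Keep only pairs whose ages differ by at most `window` years."""
    return [(t, c) for t, c in pairs if abs(t - c) <= window]


greedy_pairs = greedy_match(treated, controls)
optimal_pairs = optimal_match(treated, controls)

print("greedy total gap: ", total_gap(greedy_pairs))
print("optimal total gap:", total_gap(optimal_pairs))
print("optimal pairs within 2 years:", within_window(optimal_pairs, 2))
```

The greedy result depends on the order in which treated people are processed and on how ties are broken, which is one reason dedicated matching algorithms are preferred in practice.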

Information comes from discordancy in the outcome within matched sets
  Matched sets where the outcome is the same for everyone contribute no information
  Some special analysis routines are needed
    Conditional logistic regression
    Stratified regression

When matching is done effectively, results can have more power than unmatched analyses, but this is not always the case.

  case  male  basewt  basecd4  adh3
  ---------------------------------
     0     0      56       37     0
     0     0      50       27     1
     0     0      54       47     1
     1     0      55       39     0

  case  male  basewt  basecd4  adh3
  ---------------------------------
     0     0      68      167     1
     0     0      66      126     1
     0     0      72      133     1
     0     0      72      162     0
     1     0      71      117     1

  case  male  basewt  basecd4  adh3
  ---------------------------------
     0     0      65      740     1
     0     0      70      545     1
     1     0      67      801     1
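The three matched sets listed above show what discordancy means in practice. The sketch below keeps only the case indicator and the adherence outcome (adh3) from each row and flags which sets would contribute to a conditional analysis; the helper name is hypothetical.

```python
# Each matched set is a list of (case, adh3) pairs copied from the listings.
matched_sets = [
    [(0, 0), (0, 1), (0, 1), (1, 0)],
    [(0, 1), (0, 1), (0, 1), (0, 0), (1, 1)],
    [(0, 1), (0, 1), (1, 1)],
]


def is_informative(rows):
    """A matched set is informative only if its outcomes are not all equal."""
    outcomes = {adh3 for _case, adh3 in rows}
    return len(outcomes) > 1


informative = [is_informative(rows) for rows in matched_sets]

for number, (rows, flag) in enumerate(zip(matched_sets, informative), start=1):
    status = "informative" if flag else "no information (all outcomes equal)"
    print(f"set {number}: outcomes {[adh3 for _c, adh3 in rows]} -> {status}")
```

In the third set everyone was adherent, so it says nothing about whether the food program changes adherence within that covariate profile.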

Method: conditional logistic regression
  Outcome = adherence (yes/no)
  Independent var = case status (food yes/no)
  Do not need to add matching covariates
  Cannot estimate the effect of variables used to match
  Odds ratio interpretation:
    Effect of the food program within matched sets,
    i.e. within sets having a similar covariate profile

note: 1052 groups (2338 obs) dropped because of all positive or all negative outcomes.

Number of obs = 3821

  ---------------------------------------------
   adh3 |     OR      SE   [95% Conf. Interval]
  ------+--------------------------------------
   case |   1.31    .097      1.13      1.51
  ---------------------------------------------
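The reported interval can be roughly reconstructed from the odds ratio and its standard error. This sketch assumes the SE shown is on the odds ratio scale (so the SE of the log odds ratio is SE / OR) and that the interval is a normal-approximation interval computed on the log scale and then exponentiated; both are assumptions of this example rather than facts stated in the slides.

```python
import math

odds_ratio = 1.31   # reported OR for case (food program yes/no)
se_or = 0.097       # reported SE, assumed to be on the OR scale
z = 1.96            # normal quantile for a 95% interval

# Delta method: SE(log OR) is approximately SE(OR) / OR.
se_log_or = se_or / odds_ratio

log_or = math.log(odds_ratio)
lower = math.exp(log_or - z * se_log_or)
upper = math.exp(log_or + z * se_log_or)

print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

Since the interval excludes 1, the output suggests higher odds of adherence among food program participants within matched sets, subject to the limits on measured confounders.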

Ideal solution: randomized trial
Another solution: covariate balance
  Stratified analyses
  Matching
Can only reduce confounding bias to the extent that you can measure and adjust for important confounders

Stratification
  Stratify the population across levels of the confounder
  Works well when confounders are low dimensional
Matching
  Match individual treated to untreated

Confounders should be thought of in advance.
For observational studies, confounder adjustment is essential.
Matching or stratification?

Use stratification/regression adjustment for
  Small to moderate sized studies
  Situations with a small number of confounders
Use matching for
  Large studies, where lots of controls are available
  Situations where you are not interested in the effect of the confounders themselves

Weighting and propensity scores
  Close relation to missing data
  To be discussed at the next lecture