OHDSI Tutorial: Design and implementation of a comparative cohort study in observational healthcare data

Faculty: Martijn Schuemie (Janssen Research and Development), Marc Suchard (UCLA), Patrick Ryan (Janssen Research and Development), David Madigan (Columbia)

Today's Agenda

8:00am-8:30am: Welcome, get settled, get laptops ready
8:30am-9:30am: Presentation: Overview of the new-user cohort method design, large-scale propensity scores and outcome models
9:30am-10:30am: Exercise: Dissect a published cohort study (teams of 4: 2 statistical programmers + 2 study designers)
10:30am-10:45am: Break
10:45am-11:45am: Presentation: Walkthrough of implementing a cohort study using OHDSI tools
11:45am-12:30pm: Lunch (statistical programmer track stays here; study designer track switches rooms)
12:30pm-2:30pm: Statistical programmer track - Exercise: Deep dive into cohort study implementation using the CohortMethod R package (learn the functions contained in the R package; implement the study execution process for a published cohort study; review study output and interpret results). Study designer track - Exercise: Deep dive into cohort study design using ATLAS (learn to create cohort definitions for the treatment cohort, comparator group, and outcome; review study design decisions required within the CohortMethod R package; implement the study design process for a published cohort study)
2:30pm-2:45pm: Break (statistical programmer track stays here; study designer track returns to main room)
2:45pm-4:30pm: Exercise: Collaborate on the design and implementation of your own cohort study (statistical programmer + study designer pairs work together)
4:30pm-5:00pm: Team progress reports and wrap up

Overview of the new-user cohort method design, large scale propensity scores and outcome models

OHDSI's mission: To improve health by empowering a community to collaboratively generate the evidence that promotes better health decisions and better care.

What evidence does OHDSI seek to generate from observational data?
Clinical characterization - Natural history: Who are the patients who have diabetes? Among those patients, who takes metformin? Quality improvement: What proportion of patients with diabetes experience disease-related complications?
Population-level estimation - Safety surveillance: Does metformin cause lactic acidosis? Comparative effectiveness: Does metformin cause lactic acidosis more than glyburide?
Patient-level prediction - Precision medicine: Given everything you know about me and my medical history, if I start taking metformin, what is the chance that I am going to have lactic acidosis in the next year? Disease interception: Given everything you know about me, what is the chance I will develop diabetes?

What is OHDSI's strategy to deliver reliable evidence?
Methodological research: Develop new approaches to observational data analysis; evaluate the performance of new and existing methods; establish empirically-based scientific best practices.
Open-source analytics development: Design tools for data transformation and standardization; implement statistical methods for large-scale analytics; build interactive visualizations for evidence exploration.
Clinical evidence generation: Identify clinically relevant questions that require real-world evidence; execute research studies by applying scientific best practices through open-source tools across the OHDSI international data network; promote open-science strategies for transparent study design and evidence dissemination.

A standardized process for evidence generation and dissemination. How OHDSI is trying to help: the seven steps (1. Question, 2. Review, 3. Design, 4. Publish Protocol, 5. Execute, 6. Evaluate, 7. Synthesize) are supported by the OHDSI community, an open-source knowledgebase (LAERTES), open-source front-end web applications (ATLAS), open-source back-end statistical packages (the R Methods Library), and OHDSI network studies.

A pop culture mash-up to explain counterfactual reasoning

Counterfactual reasoning for one person: person time is divided at the decision point (time 0) into a baseline period, used to satisfy inclusion criteria, and a follow-up period, used to observe outcomes.

Counterfactual reasoning for a population (slide figure: cohort summary and outcome summary).

Alas, we don't have a DeLorean. What is our next best approximation? Instead of studying the same population under both decision options, let's define a larger population and randomly assign one treatment to each person, then compare outcomes between the two cohorts.

Randomized treatment assignment to approximate counterfactual outcomes.

Randomization allows the assumption that persons assigned to the target cohort are exchangeable at baseline with persons assigned to the comparator cohort.

Alas, we can't randomize. What is our next, next-best approximation? Define a larger population, observe the treatment choices that were made, then compare outcomes: between persons who made different choices (comparative cohort design), OR within persons during time periods with different exposure status (self-controlled designs).

How does Epidemiology define a comparative cohort study? It depends on what Epidemiology textbook you read:

"In a retrospective cohort study the investigator identified the cohort of individuals based on their characteristics in the past and then reconstructs their subsequent disease experience up to some defined point in the most recent past or up to the present time." --Kelsey et al, Methods in Observational Epidemiology, 1996

"Cohort studies are studies that identify subsets of a defined population and follow them over time, looking for differences in their outcome. Cohort studies generally compare exposed patients to unexposed patients, although they can also be used to compare one exposure to another." --Strom, Pharmacoepidemiology, 2005

"In a cohort study, a group of people (a cohort) is assembled, none of whom has experienced the outcome of interest, but all of whom could experience it. On entry to the study, people in the cohort are classified according to those characteristics (possible risk factors) that might be related to outcome. These people are then observed over time to see which of them experience the outcome." --Fletcher, Fletcher and Wagner, Clinical Epidemiology: The Essentials, 1996

"In the paradigmatic cohort study, the investigator defines two or more groups of people that are free of disease and that differ according to the extent of their exposure to a potential cause of disease. These groups are referred to as the cohorts. When two groups are studied, one is usually thought of as the exposed or index cohort, those individuals who have experienced the putative causal event or condition, and the other is thought of as the unexposed or reference cohort." --Rothman, Modern Epidemiology, 2008

"In the cohort study's most representative format, a defined population is identified. Its subjects are classified according to exposure status, and the incidence of the disease (or any other health outcome of interest) is then ascertained and compared across exposure categories." --Szklo and Nieto, Epidemiology: Beyond the Basics, 2007

An observational comparative cohort design to approximate counterfactual outcomes

The exchangeability assumption may be violated if there is a reason for the treatment choice... and there often is.

Propensity score introduction: e(x) = Pr(Z = 1 | x), where Z is the treatment assignment and x is the set of all covariates at the time of treatment assignment. The propensity score is the probability of belonging to the target cohort vs. the comparator cohort, given the baseline covariates. The propensity score can be used as a balancing score: if the two cohorts have similar propensity score distributions, then the distributions of the covariates should be similar (a diagnostic is needed to check this). Rosenbaum and Rubin, Biometrika 1983
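As a minimal illustration of this definition (not the OHDSI implementation, which fits large-scale regularized regressions over all available covariates), the following R sketch simulates a small two-cohort dataset and estimates e(x) = Pr(Z = 1 | x) with ordinary logistic regression; all variable names and coefficients are hypothetical.

```r
# Minimal propensity score sketch on simulated data (hypothetical covariates).
set.seed(123)
n        <- 5000
age      <- rnorm(n, mean = 60, sd = 10)      # baseline covariates x
diabetes <- rbinom(n, 1, 0.3)

# Treatment choice depends on the covariates (confounding by indication).
z   <- rbinom(n, 1, plogis(-2 + 0.03 * age + 0.8 * diabetes))
dat <- data.frame(z, age, diabetes)

# e(x) = Pr(Z = 1 | x): probability of belonging to the target cohort.
psModel <- glm(z ~ age + diabetes, family = binomial(), data = dat)
dat$ps  <- predict(psModel, type = "response")

# Diagnostic: the propensity score distributions of the two cohorts should overlap.
summary(dat$ps[dat$z == 1])
summary(dat$ps[dat$z == 0])
```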

Intuition around propensity score balance (Schneeweiss, PDS 2011).
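A common way to quantify this intuition is the standardized difference of means for each covariate before and after propensity score adjustment. A rough R sketch, reusing the simulated dat from the previous block (the 0.1 threshold is a common rule of thumb, not an OHDSI-specific rule):

```r
# Standardized mean difference (SMD): difference in covariate means between
# cohorts, in units of the pooled standard deviation; |SMD| < 0.1 is a common
# rule of thumb for acceptable balance.
smd <- function(x, z) {
  m1 <- mean(x[z == 1]); m0 <- mean(x[z == 0])
  s  <- sqrt((var(x[z == 1]) + var(x[z == 0])) / 2)
  (m1 - m0) / s
}

# Before any propensity score adjustment:
smd(dat$age, dat$z)
smd(dat$diabetes, dat$z)

# Within propensity score quintiles, the covariates should be closer to balanced:
dat$psQuintile <- cut(dat$ps, quantile(dat$ps, 0:5 / 5), include.lowest = TRUE)
sapply(split(dat, dat$psQuintile), function(s) smd(s$age, s$z))
```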

Five reasons to use propensity scores in pharmacoepidemiology (Glynn et al, BCPT 2006):
Theoretical advantages: confounding by indication is the primary threat to validity, and the PS focuses directly on indications for use and non-use of the drug under study.
Value of propensity scores for matching or trimming the population: eliminate non-comparable controls without assumptions of a linear relationship between PS and outcome.
Improved estimation with few outcomes: the PS allows matching on one scalar value rather than needing degrees of freedom for all covariates.
Propensity score by treatment interactions: the PS enables exploration of patient-level heterogeneity in response.
Propensity score calibration to correct for measurement error.

Methods for confounding adjustment using a propensity score (slide table indicating which methods are not generally recommended and which are fully implemented in the OHDSI CohortMethod R package). Garbe et al, Eur J Clin Pharmacol 2013, http://www.ncbi.nlm.nih.gov/pubmed/22763756
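As a sketch of how these adjustment methods surface in the CohortMethod R package: the function names below (createStudyPopulation, createPs, matchOnPs, stratifyByPs, fitOutcomeModel) are from the package, but the IDs, argument values, and upstream data extraction are illustrative placeholders, so treat this as an outline rather than a runnable study and consult the package documentation for exact signatures.

```r
# Outline of propensity score adjustment with the OHDSI CohortMethod R package.
# Placeholder IDs and arguments; cohortMethodData is assumed to have been
# extracted earlier with getDbCohortMethodData() against an OMOP CDM database.
library(CohortMethod)

studyPop <- createStudyPopulation(cohortMethodData = cohortMethodData,
                                  outcomeId       = 3,      # placeholder outcome ID
                                  riskWindowStart = 0,
                                  riskWindowEnd   = 30)

# Large-scale propensity score fit across all baseline covariates:
ps <- createPs(cohortMethodData = cohortMethodData, population = studyPop)

# Two of the adjustment strategies discussed on the next slides:
matchedPop    <- matchOnPs(ps, maxRatio = 1)            # 1:1 matching on the PS
stratifiedPop <- stratifyByPs(ps, numberOfStrata = 5)   # stratification into PS quintiles

# Conditional outcome model within matched sets (or strata):
outcomeModel <- fitOutcomeModel(population = matchedPop,
                                modelType  = "cox",
                                stratified = TRUE)
```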

Matching as a strategy to adjust for baseline covariate imbalance.

Stratification as a strategy to adjust for baseline covariate imbalance (slide figure: cohort summaries within strata 1: large, strata 2: medium, strata 3: small).

Conclusion: The new user cohort method can contribute useful information toward a risk identification system, but should not be considered definitive evidence given the degree of error observed within effect estimates. Careful consideration of the comparator selection and appropriate calibration of the effect estimates is required in order to properly interpret findings.

OHDSI's definition of cohort: a cohort is a set of persons who satisfy one or more inclusion criteria for a duration of time. Objective consequences of this cohort definition: one person may belong to multiple cohorts; one person may belong to the same cohort during multiple different time periods; one person may not belong to the same cohort multiple times during the same period of time; one cohort may have zero or more members. A codeset is NOT a cohort: logic for how to use the codeset in criteria is required.
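These consequences map directly onto how cohorts are stored in the OMOP CDM cohort table (cohort_definition_id, subject_id, cohort_start_date, cohort_end_date); a toy R illustration with invented rows:

```r
# Toy illustration of the OMOP CDM cohort table structure (invented rows).
cohort <- data.frame(
  cohort_definition_id = c(1, 1, 2),
  subject_id           = c(1001, 1001, 1001),
  cohort_start_date    = as.Date(c("2010-01-01", "2012-06-01", "2011-03-15")),
  cohort_end_date      = as.Date(c("2010-12-31", "2013-05-31", "2011-04-15"))
)
cohort
# Person 1001 belongs to cohort 1 during two different time periods and also to
# cohort 2; rows for the same person within one cohort must not overlap in time.
```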

Process flow for formally defining a cohort in ATLAS: cohort entry criteria (initial events), initial event inclusion criteria, additional qualifying inclusion criteria, and cohort exit criteria define the progression from the initial cohort to the qualifying cohort. Events are recorded, time-stamped observations for the persons, such as drug exposures, conditions, procedures, measurements and visits. All events have a start date and an end date, though some events may have a start date and end date with the same value (such as procedures or measurements). The qualifying cohort is defined as all persons who have an initial event, satisfy the initial event inclusion criteria, and fulfill all additional qualifying inclusion criteria. Each qualifying inclusion criterion is evaluated to determine its impact on the attrition of persons from the initial cohort.

A database is full of cohorts, some of which may represent valid comparisons. Examples of cohorts within a database: new users of Drug A; new users of Drug B; new users of Drug C; new users of Drug F; users of C+F; new users of Device D; persons with surgery E; persons diagnosed with condition 2; incident event of condition 1; had Outcome 3.

What are the key inputs to a comparative cohort design? Each input parameter requires a design choice: target cohort (T), comparator cohort (C), outcome cohort (O), time-at-risk, and model specification.

Cohort restriction in comparative cohort analyses: the initial target cohort (T) and initial comparator cohort (C) are restricted to the qualifying target and comparator cohorts, and then to the analytic target cohort (T') and analytic comparator cohort (C'); the analytic outcome cohort is the outcome cohort O occurring in T' and C' during the time-at-risk.

How the outcome cohort is used: the choice of the outcome model defines your research question.
Logistic regression: binary classifier of presence/absence of the outcome during the fixed time-at-risk period. Risk metric: odds ratio. Key model assumption: constant probability within the fixed window.
Poisson regression: count of the number of occurrences of the outcome during the time-at-risk. Risk metric: rate ratio. Key model assumption: outcomes follow a Poisson distribution with constant risk.
Cox proportional hazards: compute the time-to-event from time-at-risk start until the earliest of the first occurrence of the outcome or the time-at-risk end, and track the censoring event (outcome or no outcome). Risk metric: hazard ratio. Key model assumption: proportionality (constant relative hazard).
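These three outcome models correspond to standard R model fits; a self-contained sketch on simulated time-at-risk data (all numbers invented, unadjusted for the propensity score) showing where the odds ratio, rate ratio, and hazard ratio come from:

```r
library(survival)

# Simulated time-at-risk data (invented): z = target vs. comparator cohort,
# time = days at risk, event = outcome occurred during the time-at-risk.
set.seed(42)
n     <- 2000
z     <- rbinom(n, 1, 0.5)
time  <- pmin(rexp(n, rate = 0.002 * exp(0.4 * z)), 365)   # censor at 365 days
event <- as.integer(time < 365)

# Logistic regression: odds ratio for the outcome during a fixed time-at-risk window.
exp(coef(glm(event ~ z, family = binomial()))["z"])

# Poisson regression: rate ratio, with person-time as an offset.
exp(coef(glm(event ~ z + offset(log(time)), family = poisson()))["z"])

# Cox proportional hazards: hazard ratio from time-to-event with censoring.
exp(coef(coxph(Surv(time, event) ~ z))["z"])
```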

Design an observational study like you would a randomized trial. Protocol components to emulate: eligibility criteria, treatment strategies, assignment procedures, follow-up period, outcome, causal contrasts of interest, analysis plan.

A standardized process for evidence generation and dissemination: 1. Question, 2. Review, 3. Design, 4. Publish Protocol, 5. Execute, 6. Evaluate, 7. Synthesize.

Step 1. Question: What question is being asked? What is the motivation for asking the question? What decision is this evidence trying to inform? What are you trying to estimate: the effect relative to a counterfactual unexposed, or the effect relative to an alternative treatment?

Step 2. Review: What evidence already exists about this question? What are the current evidence gaps? Based on this evidence, what is your current belief about the population-level effect?

Step 3. Design: The study team must make decisions about the predefined inputs to the standardized analytics: target cohort, comparator cohort, outcome, time-at-risk, and model specification. These decisions require clinical domain knowledge, experience with the source observational data, and expertise in statistical modeling.

Step 4. Publish Protocol: The protocol must provide a full specification of all design decisions to enable complete reproducibility. Publishing the protocol with all pre-defined decisions prior to execution ensures transparency.

Step 5. Execute: Generate standardized output based on community best practices: source code for transparency and reproducibility, model diagnostics to evaluate accuracy, and aggregate summary statistics (no patient-level data).

Step 6. Evaluate: What is the systematic error for the method and data used in the analysis (as could be estimated using negative controls)? What is the coverage probability of the 95% confidence intervals (as could be estimated using generated positive controls)?
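One way this evaluation is operationalized in OHDSI is empirical calibration against negative control estimates. The sketch below uses function names from the OHDSI EmpiricalCalibration R package (fitNull, calibrateP) with invented negative control estimates; verify the exact signatures against the package documentation before use.

```r
# Sketch of empirical calibration using negative controls (invented estimates).
library(EmpiricalCalibration)

# Effect estimates for negative control outcomes, where the true RR is assumed to be 1:
negLogRr   <- log(c(1.10, 0.90, 1.30, 0.80, 1.20, 1.00, 1.40, 0.95))
negSeLogRr <- c(0.15, 0.20, 0.25, 0.18, 0.22, 0.30, 0.28, 0.17)

# Fit the empirical null distribution, i.e. the systematic error of this
# method + data combination:
null <- fitNull(logRr = negLogRr, seLogRr = negSeLogRr)

# Calibrate the p-value for the outcome of interest against that null:
calibrateP(null, logRr = log(1.5), seLogRr = 0.2)
```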

Step 7. Synthesize: How does the new evidence you've generated compare with prior knowledge? What is your new belief about the effect, given your prior knowledge plus this new evidence?

Dissemination via publication maps the standardized process onto the sections of a paper: 1. Question and 2. Review correspond to the Background; 3. Design and 4. Publish Protocol correspond to the Methods; 5. Execute and 6. Evaluate correspond to the Results; 7. Synthesize corresponds to the Discussion.

When designing or reviewing a study, ask yourself what design choice was made for each input parameter: target cohort (T), comparator cohort (C), outcome cohort (O), time-at-risk, and model specification.