Bayesian Nonparametric Methods for Precision Medicine

Bayesian Nonparametric Methods for Precision Medicine
Brian Reich, NC State
Collaborators: Qian Guan (NCSU), Eric Laber (NCSU) and Dipankar Bandyopadhyay (VCU)
University of Illinois at Urbana-Champaign, Department of Statistics
January 31, 2019
Brian Reich - NC State Bayesian policy search 1 / 41

Qian Guan

Mia Hu

Personalized medicine
Personalized medicine attempts to improve healthcare by optimally allocating treatment to population subgroups.
Common subpopulations are based on genetics, baseline health status, etc.
In this talk we deal with cost constraints, i.e., restrictions on the number of treatments that can be applied.
We also want an interpretable policy, not a black box.

Motivating dental example
The classic recall recommendation is 6 months for all patients.
This is not evidence-based.
In clinical practice some patients are given different recall recommendations based on ad hoc rules or the dentist's intuition.
This process has yet to be formally optimized.

Motivating dental example
We have observational data collected by Health Partners in suburban Minneapolis.
For each visit the response is the proportion of the measurement sites with either a missing tooth or unhealthy gums.
Covariates include age, race, gender, diabetic status, etc.
We have data for 25,000 patients.
The average number of visits is 8.5.

Figure: Example response trajectories

Figure: Observed versus recommended recall

Challenges posed by this problem
Many time points
Non-compliance
Cost constraints
Complicated data structure with mass at zero and autocorrelation

A second motivating example: Malaria in the DRC
Malaria affects hundreds of millions of people.
Effective treatments such as bed nets are available.
However, resources are limited.
We build a recommendation engine to determine the optimal allocation of bed nets across the DRC.
Challenge: the optimal allocation must account for spillover effects of treatment to other regions.

Figure: Estimated malaria prevalence in 2015 (color scale from 0.0 to 1.0)

Figure: Bed net distribution in 2015 (color scale from 0.0 to 1.0)

Back to the dental example: Data for patient $i$
History up to visit $t$: $H_{it} = \{X_i, Y_{i0}, A_{i1}, \delta_{i1}, Y_{i1}, \ldots, A_{it}, \delta_{it}, Y_{it}\}$
Baseline covariates: $X_i$
Baseline response: $Y_{i0}$
Recommended time until the next visit (action) at visit $t$: $A_{it}$
Time between visits $t-1$ and $t$: $\delta_{it}$
We control $A_{it}$; everything else is random.

Definitions
A policy $\pi$ is a deterministic function that maps the available data to a recommendation, $A_{it} = \pi(H_{it}; \alpha)$, where $\alpha$ are unknown parameters.
We consider only policies determined by a risk score:
$$R_{it} = \alpha_0 + \sum_{j=1}^{J} g_j(H_{it})\,\alpha_j$$
where the $g_j$ are clinically meaningful features.
The action is then a function of the risk, e.g.,
$$A_{it} = \pi(H_{it}; \alpha) = \begin{cases} 3 & \text{if } R_{it} \geq 0 \\ 9 & \text{if } R_{it} < 0 \end{cases}$$
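A risk-score policy of this form can be sketched in a few lines. This is a minimal illustration, not the authors' code: the feature functions `prev_response` and `is_diabetic`, the dictionary layout of the history, and the values in `alpha` are all hypothetical.

```python
from typing import Callable, Dict, Sequence

History = Dict[str, float]

# Hypothetical feature functions g_j(H_it); the slides only say the
# features are clinically meaningful functions of the history.
def prev_response(history: History) -> float:
    return history["prev_y"]

def is_diabetic(history: History) -> float:
    return history["diabetes"]

def risk_score(history: History, alpha: Sequence[float],
               features: Sequence[Callable[[History], float]]) -> float:
    """R_it = alpha_0 + sum_j g_j(H_it) * alpha_j."""
    return alpha[0] + sum(a * g(history) for a, g in zip(alpha[1:], features))

def policy(history: History, alpha: Sequence[float],
           features: Sequence[Callable[[History], float]]) -> int:
    """Recall interval in months: 3 if R_it >= 0, otherwise 9."""
    return 3 if risk_score(history, alpha, features) >= 0 else 9

features = [prev_response, is_diabetic]
alpha = [-1.0, 8.0, 0.5]  # illustrative values only, not fitted estimates
high_risk = {"prev_y": 0.20, "diabetes": 1.0}
low_risk = {"prev_y": 0.05, "diabetes": 0.0}
print(policy(high_risk, alpha, features), policy(low_risk, alpha, features))
```

Because the policy depends on the data only through a linear score and a threshold, each weight in `alpha` can be read directly as the contribution of one feature, which is what makes the rule interpretable.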

Comparing policies
The problem now reduces to finding the optimal $\alpha$.
Reward: a measure of success for patient $i$, e.g., the improvement in the response over the next 5 years.
Value: $V(\alpha)$ is the expected reward (averaging over $X_i$, $\delta_{it}$, and $Y_{it}$) if actions are given using $\alpha$.
Cost: $C(\alpha)$ is the expected time between visits if actions are given using $\alpha$.
Optimal policy: the $\alpha$ that maximizes $V(\alpha)$ subject to a constraint on $C(\alpha)$.

Estimating the policy
Let $f$ be the system dynamics model.
A complete $f$ must specify stochastic models for: $X_i$; $Y_{i0} \mid X_i$; $\delta_{i1} \mid X_i, Y_{i0}, A_{i1}$; ...
Given $f$, we can approximate $V(\alpha)$ and $C(\alpha)$.
Some methods attempt to estimate the policy without estimating $f$, e.g., Q-learning and A-learning.
These are hard to put in the Bayesian framework.

G-computation
Our approach is to estimate $f$ using Bayesian nonparametrics (BNP).
This allows us to incorporate prior information and do uncertainty quantification.
Of course, if the model is wrong, estimates of $V(\alpha)$ and $C(\alpha)$ will be poor, and we may not find the best $\alpha$.
We try to avoid misspecification problems using a flexible BNP model.
We view our method as an extension of Xu et al. (JASA, 2016), who propose a BNP/G-computation method for a three-stage trial.

Our Dirichlet process mixture model
Let $\Theta_i = \{\theta_{i0}, \theta_{i1}, \theta_{i2}\}$ be a random effect for subject $i$.
Baseline: $(X_i^T, Y_{i0})^T \sim \text{Normal}(\theta_{i0}, \Sigma_0)$
Compliance: $\delta_{it} \mid A_{it}, X_i, Y_{it-1} \sim \text{Normal}(X_{it}^T \theta_{i1}, \sigma_1^2)$
Progression: $Y_{it} - Y_{it-1} \mid \delta_{it}, A_{it}, X_i, Y_{it-1} \sim \text{Normal}(Z_{it}^T \theta_{i2}, \sigma_2^2)$
$X_{it}$ and $Z_{it}$ are user-specified functions of the history $H_{it}$.
$\Theta_i \overset{iid}{\sim} f$, where $f$ has a Dirichlet process mixture (DPM) prior.
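To give a feel for what a DPM prior on the random-effect distribution implies, the sketch below draws subject-level effects from a truncated stick-breaking representation of a Dirichlet process with Gaussian kernels. It is only an illustration under stated assumptions: the effect is scalar rather than the vector $\Theta_i$, and the truncation level, concentration parameter, base measure and kernel standard deviation are all made up.

```python
import random

def stick_breaking_weights(concentration, truncation, rng):
    """Truncated stick-breaking weights: w_k = v_k * prod_{l<k} (1 - v_l)."""
    weights, remaining = [], 1.0
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, concentration)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    weights.append(remaining)  # last component absorbs the leftover mass
    return weights

def draw_random_effects(n_subjects, weights, atoms, kernel_sd, rng):
    """Pick a mixture component per subject, then draw theta_i around its atom."""
    labels = rng.choices(range(len(weights)), weights=weights, k=n_subjects)
    effects = [rng.gauss(atoms[k], kernel_sd) for k in labels]
    return effects, labels

rng = random.Random(0)
weights = stick_breaking_weights(concentration=1.0, truncation=20, rng=rng)
atoms = [rng.gauss(0.0, 3.0) for _ in weights]  # assumed Normal(0, 3^2) base measure
theta, labels = draw_random_effects(1000, weights, atoms, kernel_sd=0.5, rng=rng)
print("occupied clusters:", len(set(labels)))
```

A few components usually carry most of the weight, so the subjects fall into a small number of clusters. The Gaussian model in the comparison corresponds to forcing a single component.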

Estimating the policy via $\alpha$
Given $f$, the optimal $\alpha$ is determined but difficult to compute:
$$\alpha_{opt} = \arg\max_{\alpha} V(\alpha) \quad \text{s.t. } C(\alpha) < c$$
For a given $f$ we can approximate $V(\alpha)$ and $C(\alpha)$ using Monte Carlo simulation.
These can be noisy approximations, making optimization challenging.
We use simulation/optimization methods for stochastic functions.
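The Monte Carlo step can be illustrated with a toy stand-in for the dynamics model. Everything about `simulate_patient` is hypothetical: a single threshold parameter, a uniform baseline risk, and a cost measured in visits per year (an assumption chosen so that the constraint $C(\alpha) < c$ binds in this toy; the talk defines cost through the expected time between visits). The search is a plain grid search, not the authors' optimizer.

```python
import random

def simulate_patient(alpha, rng):
    """Toy stand-in for the fitted dynamics f (hypothetical, not the dental model)."""
    x = rng.random()                     # baseline risk in [0, 1)
    action = 3 if x >= alpha else 9      # recall interval in months
    reward = (x if action == 3 else 0.0) + rng.gauss(0.0, 0.1)
    cost = 12.0 / action                 # visits per year (assumed cost measure)
    return reward, cost

def monte_carlo_value_cost(alpha, n_draws, rng):
    """Approximate V(alpha) and C(alpha) by averaging simulated patients."""
    total_reward = total_cost = 0.0
    for _ in range(n_draws):
        reward, cost = simulate_patient(alpha, rng)
        total_reward += reward
        total_cost += cost
    return total_reward / n_draws, total_cost / n_draws

def constrained_grid_search(grid, budget, n_draws, rng):
    """Return the grid point with the largest estimated value among those with cost < budget."""
    best_alpha, best_value = None, float("-inf")
    for alpha in grid:
        value, cost = monte_carlo_value_cost(alpha, n_draws, rng)
        if cost < budget and value > best_value:
            best_alpha, best_value = alpha, value
    return best_alpha, best_value

rng = random.Random(1)
grid = [k / 10 for k in range(11)]
best_alpha, best_value = constrained_grid_search(grid, budget=2.0, n_draws=20000, rng=rng)
print("best feasible alpha:", best_alpha, "estimated value:", round(best_value, 3))
```

Each evaluation of $V$ and $C$ is itself a noisy average, so two nearby values of $\alpha$ can swap order from one run to the next. That noise is the reason a plain deterministic optimizer is a poor fit for this problem.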

Estimating the policy via $\alpha$
We first estimate $V$ and $C$ on a coarse grid of candidate $\alpha$.
We then smooth these values with Gaussian process regression.
To refine the solution we use sequential optimization, selecting the next candidate $\alpha$ to maximize the expected gain.
This process takes around 40 minutes for the simulated examples.
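The smoothing and refinement steps can be sketched with a small Gaussian process written in plain Python. This is a sketch under assumptions, not the procedure used in the talk: the kernel is a squared-exponential with fixed, made-up hyperparameters, the prior mean is zero, the "expected gain" is taken to be the standard expected-improvement criterion, the value estimates are invented numbers, and the cost constraint is ignored.

```python
import math

def rbf(a, b, length=0.2, scale=1.0):
    """Squared-exponential kernel (hyperparameters are assumptions)."""
    return scale * math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def gp_posterior(x_train, y_train, x_new, noise_var):
    """Posterior mean and variance of a zero-mean GP at a single new point."""
    K = [[rbf(a, b) + (noise_var if i == j else 0.0)
          for j, b in enumerate(x_train)] for i, a in enumerate(x_train)]
    k_star = [rbf(x_new, a) for a in x_train]
    mean = sum(k * w for k, w in zip(k_star, solve(K, y_train)))
    var = rbf(x_new, x_new) - sum(k * v for k, v in zip(k_star, solve(K, k_star)))
    return mean, max(var, 0.0)

def expected_improvement(mean, var, best_so_far):
    """Expected improvement over the best value seen so far (maximization)."""
    sd = math.sqrt(var)
    if sd == 0.0:
        return max(mean - best_so_far, 0.0)
    z = (mean - best_so_far) / sd
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (mean - best_so_far) * cdf + sd * pdf

coarse_alpha = [0.0, 0.25, 0.5, 0.75, 1.0]
noisy_value = [-0.35, -0.13, -0.02, -0.01, -0.17]  # made-up Monte Carlo estimates
best_so_far = max(noisy_value)
candidates = [k / 100 for k in range(101)]
scores = []
for a in candidates:
    mean, var = gp_posterior(coarse_alpha, noisy_value, a, noise_var=1e-4)
    scores.append(expected_improvement(mean, var, best_so_far))
next_alpha = candidates[scores.index(max(scores))]
print("next candidate alpha:", next_alpha)
```

In a full loop the value at `next_alpha` would be estimated by simulation, appended to the training set, and the step repeated until the expected improvement becomes negligible.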

Estimating the policy via $\alpha$
We can also use this process to obtain the posterior distribution of the policy parameters $\alpha$.
The optimal $\alpha$ is a function of $f$.
We repeat the optimization for several posterior samples of $f$, giving the posterior for the optimal $\alpha$.
This can be used for uncertainty quantification/testing.

Simulation set-up
Data are generated with $n$ subjects, each followed for five years.
The dynamics $f$ are either MVN (single) or a mixture of two MVN (mixture).
Risk score: $R_{it} = \alpha_0 + X_{i1}\alpha_1 + X_{i2}\alpha_2 + \log(\delta_{it-1}/A_{it-1})\,\alpha_3 + Y_{it-1}\alpha_4$
Actions:
$$A_{it} = \begin{cases} 3 & \text{if } R_{it} \geq 0 \\ 9 & \text{if } R_{it} < 0 \end{cases}$$
The reward function is $\frac{1}{n}\sum_{i=1}^{n}\sum_{t=1}^{n_i} Y_{it}\, I(Y_{it} > 0)$.
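The simulation reward, read as $\frac{1}{n}\sum_{i}\sum_{t} Y_{it}\, I(Y_{it} > 0)$, only counts positive responses, which matters because a normal model can generate negative values of $Y_{it}$. A minimal sketch, assuming the responses are stored as one list of visit values per subject (a hypothetical layout):

```python
def simulation_reward(responses):
    """Average over subjects of the sum of positive responses Y_it * I(Y_it > 0)."""
    n = len(responses)
    return sum(y for visits in responses for y in visits if y > 0) / n

# Two hypothetical subjects with three visits and one visit respectively.
toy_responses = [[0.0, 0.2, -0.1], [0.5]]
print(simulation_reward(toy_responses))
```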

Simulation set-up - Competing models
Baseline: $A_{it} = 6$ for all $i$ and $t$
Gaussian: Policy search assuming normality for $\Theta_i$
DPM: Policy search assuming the full BNP/DPM model
Oracle: $f$ is known perfectly

Simulation study - Value (smaller is better)
Cluster   n      Baseline   Gaussian   DPM    Oracle
Single    1000   0.68       0.10       0.10   0.10
Single    5000   0.67       0.10       0.10   0.10
Mixture   1000   1.09       0.80       0.68   0.68
Mixture   5000   1.09       0.81       0.68   0.68

Simulation study - Cost (constrained to be 6)
Cluster   n      Baseline   Gaussian   DPM    Oracle
Single    1000   6.00       6.00       6.00   5.99
Single    5000   6.00       6.00       6.00   6.00
Mixture   1000   6.00       6.07       6.00   6.01
Mixture   5000   6.00       6.07       6.00   5.99

Scenario 1: Estimated versus true $\alpha$
Figure (a): Single cluster, $n = 1000$. Axes: $\alpha_{opt}$ by feature (X1, X2, Comp, Prev Y). Series: Gaussian, DPM, Oracle.

Scenario 3: Estimated versus true $\alpha$
Figure (c): Mixture, $n = 1000$. Axes: $\alpha_{opt}$ by feature (X1, X2, Comp, Prev Y).

Coverage for $\alpha$
Figure: Posterior samples of $\alpha$ by feature (panels: X1, X2, Comp, Prev Y).

Real-data analysis - fitted priority score
The utility function is the reduction in the proportion of unhealthy sites in 5 years from baseline.
The recommendation is to return in 3 months if $R_t > 0$, and in 9 months otherwise.
The estimated optimal risk score is
$$R_t = 1.06 - 0.17\,\text{Std Age} + 0.50\,\text{Diabetes} + 0.22\log(\delta_{t-1}/A_{t-1} + 1) + 8.2\,Y_{t-1}$$
Patients who are young, unhealthy, diabetic, and non-compliant are recommended to return in 3 months.

Value of competing policies
The value of this policy is 0.0102 (0.0002).
The value if all patients have $A = 6$ is 0.0170 (0.0002).
This is an improvement of 40%.
This is a substantial improvement, especially when the improvement in expected value is multiplied by the number of people in the population.
Of course, this comes with caveats.
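The 40% figure can be checked from the two reported values, taking the uniform 6-month policy as the reference:

```latex
\[
\frac{0.0170 - 0.0102}{0.0170} = \frac{0.0068}{0.0170} = 0.40
\]
```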

Example 2: Estimated malaria prevalence in 2015
Figure: Estimated malaria prevalence in 2015 (color scale from 0.0 to 1.0)

Example 2: Bed net distribution in 2015
Figure: Bed net distribution in 2015 (color scale from 0.0 to 1.0)

Example 2: Definitions
Data: $Y_{jt}$ is the logit of the prevalence in zone $j$ in year $t$ for 2000-2015 (Bhatt et al.)
Action: $A_{jt}$ is the proportion of homes given bed nets
Reward: Spatiotemporal average malaria prevalence over the next five years
Cost: Average $A_{jt}$ (over $j$) less than a fixed threshold each year
Problem: Estimate the optimal policy for allocating $A_{jt}$ each year

Local priority score
Our method assigns each health zone a priority score.
The priority score is a linear combination of the zone's
climate,
current malaria prevalence,
prevalence in the neighboring zones.
The priority score is denoted
$$P_i = \alpha_1\,\text{temp}_i + \alpha_2\,\text{precip}_i + \alpha_3\,\text{prevalence}_i + \alpha_4\,\text{neighbor-prev}_i$$
and is a function of unknown weights $\alpha$.

Global utility function
We select the proportion of individuals to receive bed nets in all $n$ zones, $A = (A_1, \ldots, A_n)$, to maximize
$$\sum_{i=1}^{n} \exp(P_i)\,A_i + \alpha_0 \sum_{i<j} w_{ij}\,(A_i - A_j)^2$$
where $w_{ij}$ is a weight assigned to the pair of zones $(i, j)$.
The first term encourages zones with high priority scores to receive more bed nets.
The second term either encourages or discourages clustering of the $A_j$ depending on the sign of $\alpha_0$.
This can be solved with quadratic programming.
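The priority score and the global utility can be written out directly to see how the two terms trade off. This sketch only evaluates the objective for given allocations; it does not solve the quadratic program or enforce the budget. The zone fields, the weights in `alpha`, and the pair weights are hypothetical.

```python
import math

def priority_score(zone, alpha):
    """P_i = alpha_1*temp + alpha_2*precip + alpha_3*prevalence + alpha_4*neighbor-prev."""
    return (alpha[1] * zone["temp"] + alpha[2] * zone["precip"]
            + alpha[3] * zone["prevalence"] + alpha[4] * zone["neighbor_prev"])

def global_utility(allocation, scores, alpha0, pair_weights):
    """sum_i exp(P_i) * A_i + alpha_0 * sum_{i<j} w_ij * (A_i - A_j)^2."""
    n = len(allocation)
    linear = sum(math.exp(p) * a for p, a in zip(scores, allocation))
    pairwise = sum(pair_weights[i][j] * (allocation[i] - allocation[j]) ** 2
                   for i in range(n) for j in range(i + 1, n))
    return linear + alpha0 * pairwise

# Illustrative weights only: alpha[0] is the clustering weight alpha_0.
alpha = [-1.0, 0.5, 0.3, 0.25, 0.4]
zones = [
    {"temp": 0.0, "precip": 0.0, "prevalence": 0.0, "neighbor_prev": 0.0},
    {"temp": 1.0, "precip": 0.0, "prevalence": 2.0, "neighbor_prev": 0.0},
]
scores = [priority_score(z, alpha) for z in zones]
pair_weights = [[0.0, 1.0], [1.0, 0.0]]  # the two zones are treated as neighbors

even = global_utility([0.5, 0.5], scores, alpha[0], pair_weights)
targeted = global_utility([0.0, 1.0], scores, alpha[0], pair_weights)
print("even:", round(even, 3), "targeted:", round(targeted, 3))
```

With a negative $\alpha_0$ the pairwise term penalizes unequal allocations between neighboring zones, so an even allocation can beat one that targets the highest-priority zone, as it does in this toy case.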

Optimization
The key task is to optimize over $\alpha_0, \ldots, \alpha_4$ to minimize the long-run malaria prevalence in the DRC.
As with the dental example, this requires extensive simulation from a fitted model.
We fit a Gaussian linear model with covariates:
Previous time's prevalence
Prevalence in neighboring zones
Temperature and precipitation
Bed-net allocation
Interactions and spatially correlated errors

Fitted priority score
The optimal priority score weights (after transforming all variables to a common scale) are:
Spatial clustering: $\alpha_0 = 0.13$
Temperature: $\alpha_1 = 2.6$
Precipitation: $\alpha_2 = 0.8$
Current prevalence: $\alpha_3 = 1.2$
Prevalence of neighboring zones: $\alpha_4 = 3.1$
These assume the average $A_{jt}$ stays at the current average.

Value of competing policies
The projected average prevalence over the next five years is
0.136 following our policy
0.140 if all resources are allocated to the zones with highest current prevalence
0.149 if all zones are given the same resources

Figure: Estimated prevalence in 2015 (color scale from 0.0 to 1.0)

Figure: Projected prevalence in 2016 following our policy (color scale from 0.0 to 1.0)

Summary
We proposed methods for optimizing dental-recall [1] and malaria-control [2] recommendation engines.
Our methods handle non-compliance and non-normality, and permit uncertainty quantification.
The methods are designed to produce interpretable results, but implementation would still be complicated.
Our methods would surely benefit from causal analysis.
Work supported by NIH, NSF and the Gates Foundation.
[1] In revision, JASA A&CS
[2] Submitted to Biometrics