Bayesian Nonparametric Methods for Precision Medicine
Brian Reich, NC State
Collaborators: Qian Guan (NCSU), Eric Laber (NCSU), and Dipankar Bandyopadhyay (VCU)
University of Illinois at Urbana-Champaign, Department of Statistics
January 31, 2019
Brian Reich - NC State, Bayesian policy search
Qian Guan

Mia Hu
Personalized medicine
- Personalized medicine attempts to improve healthcare by optimally allocating treatment to population subgroups
- Common subpopulations are based on genetics, baseline health status, etc.
- In this talk we deal with cost constraints, i.e., restrictions on the number of treatments that can be applied
- We also want an interpretable policy, not a black box
Motivating dental example
- The classic recall recommendation is 6 months for all patients
- This is not evidence-based
- In clinical practice, some patients are given different recall recommendations based on ad hoc rules or the dentist's intuition
- This process has yet to be formally optimized
Motivating dental example
- We have observational data collected by Health Partners in suburban Minneapolis
- For each visit, the response is the proportion of measurement sites with either a missing tooth or unhealthy gums
- Covariates include age, race, gender, diabetic status, etc.
- We have data for 25,000 patients
- The average number of visits is 8.5
Example response trajectories
[Figure]
Observed versus recommended recall
[Figure]
Challenges posed by this problem
- Many time points
- Non-compliance
- Cost constraints
- Complicated data structure with mass at zero and autocorrelation
A second motivating example: Malaria in the DRC
- Malaria affects hundreds of millions of people
- Effective treatments such as bed nets are available
- However, resources are limited
- We build a recommendation engine to determine the optimal allocation of bed nets across the DRC
- Challenge: the optimal allocation must account for spillover effects of treatment to other regions
Estimated malaria prevalence in 2015
[Figure: map of the DRC, prevalence scale 0.0-1.0]
Bed net distribution in 2015
[Figure: map of the DRC, scale 0.0-1.0]
Back to the dental example: Data for patient i
- History up to visit t: $H_{it} = \{X_i, Y_{i0}, A_{i1}, \delta_{i1}, Y_{i1}, \ldots, A_{it}, \delta_{it}, Y_{it}\}$
- Baseline covariates: $X_i$
- Baseline response: $Y_{i0}$
- Recommended time until next visit (action) at visit t: $A_{it}$
- Time between visits t-1 and t: $\delta_{it}$
- We control $A_{it}$; everything else is random
Definitions
- A policy $\pi$ is a deterministic function that maps the available data to a recommendation, $A_{it} = \pi(H_{it}; \alpha)$, where $\alpha$ are unknown parameters
- We consider only policies determined by a risk score: $R_{it} = \alpha_0 + \sum_{j=1}^{J} g_j(H_{it})\,\alpha_j$, where the $g_j$ are clinically meaningful features
- The action is then a function of the risk, e.g.,
  $A_{it} = \pi(H_{it}; \alpha) = \begin{cases} 3 & \text{if } R_{it} \ge 0 \\ 9 & \text{if } R_{it} < 0 \end{cases}$
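As a concrete sketch, a threshold policy of this form can be coded as follows; the feature values and weights are hypothetical, not the fitted ones:

```python
import numpy as np

def risk_score(features, alpha):
    """Linear risk score R = alpha[0] + sum_j g_j(H) * alpha[j+1],
    where `features` holds the clinically meaningful summaries g_j(H_it)."""
    return alpha[0] + float(np.dot(features, alpha[1:]))

def recall_policy(features, alpha):
    """3-month recall if the risk score is non-negative, else 9 months."""
    return 3 if risk_score(features, alpha) >= 0 else 9

alpha = np.array([-0.5, 1.0, 0.2])                        # intercept + two weights
high_risk = recall_policy(np.array([1.0, 0.0]), alpha)    # R = 0.5, so 3 months
low_risk = recall_policy(np.array([0.0, 0.0]), alpha)     # R = -0.5, so 9 months
```

Searching over policies then amounts to searching over the weight vector alpha.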
Comparing policies
- The problem now reduces to finding the optimal $\alpha$
- Reward: a measure of success for patient i, e.g., the improvement in the response over the next 5 years
- Value: $V(\alpha)$ is the expected (averaging over $X_i$, $\delta_{it}$, and $Y_{it}$) reward if actions are given using $\alpha$
- Cost: $C(\alpha)$ is the expected time between visits if actions are given using $\alpha$
- Optimal policy: the $\alpha$ that maximizes $V(\alpha)$ subject to a constraint on $C(\alpha)$
Estimating the policy
- Let f be the system dynamics model
- A complete f must specify stochastic models for $X_i$; $Y_{i0} \mid X_i$; $\delta_{i1} \mid X_i, Y_{i0}, A_{i1}$; and so on
- Given f, we can approximate $V(\alpha)$ and $C(\alpha)$
- Some methods attempt to estimate the policy without estimating f, e.g., Q-learning and A-learning
- These are hard to put in the Bayesian framework
G-computation
- Our approach is to estimate f using Bayesian nonparametrics (BNP)
- This allows us to incorporate prior information and perform uncertainty quantification
- Of course, if the model is wrong, estimates of $V(\alpha)$ and $C(\alpha)$ will be poor, and we may not find the best $\alpha$
- We try to avoid misspecification problems by using a flexible BNP model
- We view our method as an extension of Xu et al. (JASA, 2016), who propose a BNP/G-computation method for a three-stage trial
Our Dirichlet process mixture model
- Let $\Theta_i = \{\theta_{i0}, \theta_{i1}, \theta_{i2}\}$ be a random effect for subject i
- Baseline: $(X_i^T, Y_{i0})^T \sim \text{Normal}(\theta_{i0}, \Sigma_0)$
- Compliance: $\delta_{it} \mid A_{it}, X_i, Y_{i,t-1} \sim \text{Normal}(X_{it}^T \theta_{i1}, \sigma_1^2)$
- Progression: $Y_{it} - Y_{i,t-1} \mid \delta_{it}, A_{it}, X_i, Y_{i,t-1} \sim \text{Normal}(Z_{it}^T \theta_{i2}, \sigma_2^2)$
- X and Z are user-specified functions of the history $H_{it}$
- $\Theta_i \overset{iid}{\sim} f$, where f has a Dirichlet process mixture (DPM) prior
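To make the generative structure concrete, here is a toy forward simulation in the same spirit: a finite two-component mixture stands in for the DPM over $\Theta_i$, and all parameter values are illustrative, not estimates from the fitted model:

```python
import numpy as np

rng = np.random.default_rng(0)

def draw_subject(pi=0.5):
    """Draw a subject's random effect from a two-component mixture,
    a finite stand-in for the DPM prior (all numbers illustrative)."""
    if rng.random() < pi:
        return {"theta1": 1.0, "theta2": -0.05}   # compliant, slow progression
    return {"theta1": 0.7, "theta2": 0.10}        # less compliant, faster

def simulate_visits(theta, a_seq, y0=0.2):
    """Roll the dynamics forward: draw the actual gap delta_t given the
    recommendation A_t, then update Y_t given Y_{t-1} and delta_t."""
    y, traj = y0, []
    for a in a_seq:
        delta = max(0.1, rng.normal(theta["theta1"] * a, 0.5))   # months to return
        y = max(0.0, y + rng.normal(theta["theta2"] * delta / 6, 0.02))
        traj.append((delta, y))
    return traj

traj = simulate_visits(draw_subject(), a_seq=[6, 6, 6])
```

Mixing over subjects drawn this way reproduces the non-normal, clustered behavior the DPM is designed to capture.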
Estimating the policy via α
- Given f, the optimal $\alpha$ is determined but difficult to compute: $\alpha_{\text{opt}} = \arg\max_\alpha V(\alpha)$ subject to $C(\alpha) < c$
- For a given f we can approximate $V(\alpha)$ and $C(\alpha)$ using Monte Carlo simulation
- These can be noisy approximations, making optimization challenging
- We use simulation/optimization methods for stochastic functions
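A minimal sketch of the Monte Carlo step: simulate many patients under a candidate alpha and average reward and cost. The one-dimensional dynamics, reward, and policy below are stand-ins, not the fitted f:

```python
import numpy as np

rng = np.random.default_rng(1)

def toy_policy(y, alpha):
    """Toy risk-score policy: 3-month recall if alpha[0] + alpha[1]*y >= 0."""
    return 3 if alpha[0] + alpha[1] * y >= 0 else 9

def simulate_reward_cost(alpha, n_visits=10):
    """One simulated patient trajectory under made-up dynamics."""
    y, gaps = 0.2, []
    for _ in range(n_visits):
        a = toy_policy(y, alpha)
        gaps.append(a)
        y = max(0.0, y + rng.normal(0.01 * a - 0.04, 0.02))  # disease level
    return -y, float(np.mean(gaps))     # reward: low final disease level

def mc_value_cost(alpha, n_sims=500):
    """Monte Carlo estimates of V(alpha) and C(alpha)."""
    rewards, costs = zip(*(simulate_reward_cost(alpha) for _ in range(n_sims)))
    return float(np.mean(rewards)), float(np.mean(costs))

V, C = mc_value_cost(np.array([0.1, -1.0]))
```

Because V and C are themselves random, comparing two nearby alpha values directly is unreliable, which motivates the smoothing step on the next slide.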
Estimating the policy via α
- We first estimate V and C on a coarse grid of candidate $\alpha$
- We then smooth these values with Gaussian process regression
- To refine the solution we use sequential optimization, selecting the next candidate $\alpha$ to maximize the expected gain
- This process takes around 40 minutes for the simulated examples
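The grid-plus-GP refinement can be sketched as follows. This is a one-dimensional toy: the "value surface" is made up, and an upper-confidence-bound rule stands in for the expected-gain criterion:

```python
import numpy as np

def gp_posterior(X, y, Xstar, length=1.0, noise=1e-2):
    """Posterior mean/sd of a squared-exponential GP fit to noisy values."""
    def k(a, b):
        d = a[:, None] - b[None, :]
        return np.exp(-0.5 * (d / length) ** 2)
    K = k(X, X) + noise * np.eye(len(X))
    Ks = k(X, Xstar)
    mean = Ks.T @ np.linalg.solve(K, y)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))

# Step 1: noisy Monte Carlo value estimates on a coarse grid of alpha
grid = np.linspace(-2, 2, 9)
noisy_V = -(grid - 0.5) ** 2 + 0.05 * np.random.default_rng(2).normal(size=9)

# Step 2: smooth with GP regression on a fine grid
fine = np.linspace(-2, 2, 201)
mean, sd = gp_posterior(grid, noisy_V, fine)

# Step 3: pick the next candidate alpha (UCB as a stand-in for expected gain)
next_alpha = fine[np.argmax(mean + sd)]
```

In the real problem alpha is a vector and the simulations are far more expensive, but the loop has the same shape: simulate, smooth, propose, repeat.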
Estimating the policy via α
- We can also use this process to obtain the posterior distribution of the policy parameters $\alpha$
- The optimal $\alpha$ is a function of f
- We repeat the optimization for several posterior samples of f, giving the posterior for the optimal $\alpha$
- This can be used for uncertainty quantification and testing
Simulation set-up
- Data generated with n subjects, each followed for five years
- The dynamics f are either MVN (single) or a mixture of two MVNs (mixture)
- Risk score: $R_{it} = \alpha_0 + X_{i1}\alpha_1 + X_{i2}\alpha_2 + \log(\delta_{i,t-1}/A_{i,t-1})\alpha_3 + Y_{i,t-1}\alpha_4$
- Actions: $A_{it} = 3$ if $R_{it} \ge 0$ and $A_{it} = 9$ if $R_{it} < 0$
- The reward function is $\frac{1}{n}\sum_{i=1}^{n}\sum_{t=1}^{n_i} Y_{it}\,I(Y_{it} > 0)$
Simulation set-up - Competing models
- Baseline: $A_{it} = 6$ for all i and t
- Gaussian: policy search assuming normality for $\Theta_i$
- DPM: policy search assuming the full BNP/DPM model
- Oracle: f is known perfectly
Simulation study - Value (smaller is better)

Cluster | n    | Baseline | Gaussian | DPM  | Oracle
--------|------|----------|----------|------|-------
Single  | 1000 | 0.68     | 0.10     | 0.10 | 0.10
Single  | 5000 | 0.67     | 0.10     | 0.10 | 0.10
Mixture | 1000 | 1.09     | 0.80     | 0.68 | 0.68
Mixture | 5000 | 1.09     | 0.81     | 0.68 | 0.68
Simulation study - Cost (constrained to be 6)

Cluster | n    | Baseline | Gaussian | DPM  | Oracle
--------|------|----------|----------|------|-------
Single  | 1000 | 6.00     | 6.00     | 6.00 | 5.99
Single  | 5000 | 6.00     | 6.00     | 6.00 | 6.00
Mixture | 1000 | 6.00     | 6.07     | 6.00 | 6.01
Mixture | 5000 | 6.00     | 6.07     | 6.00 | 5.99
Scenario 1: Estimated versus true α
[Figure: (a) Single cluster, n=1000; estimated $\alpha_{\text{opt}}$ for the features X1, X2, Comp, and Prev Y, comparing Gaussian, DPM, and Oracle]
Scenario 3: Estimated versus true α
[Figure: (c) Mixture, n=1000; estimated $\alpha_{\text{opt}}$ for the features X1, X2, Comp, and Prev Y]
Coverage for α
[Figure: histograms of posterior samples of α for the features X1, X2, Comp, and Prev Y]
Real-data analysis - fitted priority score
- The utility function is the reduction in the proportion of unhealthy sites over 5 years from baseline
- The recommendation is to return in 3 months if $R_t > 0$, and 9 months otherwise
- The estimated optimal risk score is
  $R_t = -1.06 - 0.17\,\text{StdAge} + 0.50\,\text{Diabetes} + 0.22\,\log(\delta_{t-1} - A_{t-1} + 1) + 8.2\,Y_{t-1}$
- Young, unhealthy diabetics who do not comply are recommended to return in 3 months
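Plugging the fitted coefficients into code, with the caveat that the minus signs on the intercept and age term, and the exact form of the compliance term, are reconstructed from the surrounding text rather than stated explicitly:

```python
import math

def dental_risk_score(std_age, diabetes, gap_prev, rec_prev, y_prev):
    """Fitted priority score from the slide (signs and the compliance
    term log(delta - A + 1) are reconstructions, treat as assumptions)."""
    compliance = math.log(gap_prev - rec_prev + 1)    # late return raises risk
    return (-1.06 - 0.17 * std_age + 0.50 * diabetes
            + 0.22 * compliance + 8.2 * y_prev)

# Young (std_age = -1), diabetic, returned 3 months late, 30% unhealthy sites
r = dental_risk_score(-1.0, 1, gap_prev=9, rec_prev=6, y_prev=0.3)
recall = 3 if r > 0 else 9    # this profile gets the 3-month recall
```

Each term pushes this hypothetical patient toward the short recall, matching the slide's summary of who is flagged.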
Value of competing policies
- The value of this policy is 0.0102 (0.0002)
- The value if all patients have A = 6 is 0.0170 (0.0002)
- This is an improvement of 40%
- This is a substantial improvement, especially when the gain in expected value is multiplied by the number of people in the population
- Of course, this comes with caveats
Example 2: Estimated malaria prevalence in 2015
[Figure: map of the DRC, prevalence scale 0.0-1.0]
Example 2: Bed net distribution in 2015
[Figure: map of the DRC, scale 0.0-1.0]
Example 2: Definitions
- Data: $Y_{jt}$ is the logit of the prevalence in zone j in year t, for 2000-2015 (Bhatt et al.)
- Action: $A_{jt}$ is the proportion of homes given bed nets
- Reward: spatiotemporal average malaria prevalence over the next five years
- Cost: average $A_{jt}$ (over j) less than a fixed threshold each year
- Problem: estimate the optimal policy for allocating $A_{jt}$ each year
Local priority score
- Our method assigns each health zone a priority score
- The priority score is a linear combination of the zone's climate, current malaria prevalence, and prevalence in the neighboring zones
- The priority score is denoted
  $P_i = \alpha_1\,\text{temp}_i + \alpha_2\,\text{precip}_i + \alpha_3\,\text{prevalence}_i + \alpha_4\,\text{neighbor-prev}_i$
  and is a function of unknown weights $\alpha$
Global utility function
- We select the proportion of individuals to receive bed nets in all n zones, $A = (A_1, \ldots, A_n)$, to maximize
  $\sum_{i=1}^n \exp(P_i) A_i + \alpha_0 \sum_{i<j} w_{ij} (A_i - A_j)^2$
  where $w_{ij}$ is a weight assigned to each pair of zones
- The first term encourages zones with high priority scores to receive more bed nets
- The second term either encourages or discourages clustering of $A_j$, depending on the sign of $\alpha_0$
- This can be solved with quadratic programming
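A rough numerical sketch of this allocation step on a three-zone toy problem; the slide's approach is exact quadratic programming, while this stand-in uses projected gradient ascent:

```python
import numpy as np

def allocate(P, W, alpha0, budget=0.5, steps=500, lr=0.01):
    """Approximately maximize sum_i exp(P_i)*A_i + alpha0 * sum_{i<j} w_ij (A_i - A_j)^2
    subject to 0 <= A_i <= 1 and mean(A) <= budget.
    (A crude projected-gradient stand-in for the QP solver.)"""
    n = len(P)
    A = np.full(n, budget)
    for _ in range(steps):
        # d/dA_i of the objective, assuming W symmetric with zero diagonal
        grad = np.exp(P) + 2 * alpha0 * (W.sum(1) * A - W @ A)
        A = np.clip(A + lr * grad, 0, 1)
        excess = A.mean() - budget
        if excess > 0:                       # push back onto the budget
            A = np.clip(A - excess, 0, 1)
    return A

# Three zones: high-, mid-, low-priority; adjacent zones share weight 1
P = np.array([1.0, 0.0, -1.0])
W = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float)
A = allocate(P, W, alpha0=-0.1)    # negative alpha0 smooths the allocation
```

As expected, the high-priority zone ends up with the largest share of nets while the budget constraint is (approximately) respected.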
Optimization
- The key task is to optimize over $\alpha_0, \ldots, \alpha_4$ to minimize the long-run malaria prevalence in the DRC
- As with the dental example, this requires extensive simulation from a fitted model
- We fit a Gaussian linear model with covariates:
  - Previous year's prevalence
  - Prevalence in neighboring zones
  - Temperature and precipitation
  - Bed-net allocation
  - Interactions and spatially correlated errors
Fitted priority score
- The optimal priority score weights (after transforming all variables to a common scale) are:
  - Spatial clustering: $\alpha_0 = 0.13$
  - Temperature: $\alpha_1 = 2.6$
  - Precipitation: $\alpha_2 = 0.8$
  - Current prevalence: $\alpha_3 = 1.2$
  - Prevalence of neighboring zones: $\alpha_4 = 3.1$
- These assume the average $A_{jt}$ stays at the current average
Value of competing policies
- The projected average prevalence over the next five years is:
  - 0.136 following our policy
  - 0.140 if all resources are allocated to the zones with the highest current prevalence
  - 0.149 if all zones are given the same resources
Estimated prevalence in 2015
[Figure: map of the DRC, scale 0.0-1.0]
Projected prevalence in 2016 following our policy
[Figure: map of the DRC, scale 0.0-1.0]
Summary
- We proposed methods for optimizing dental-recall [1] and malaria-control [2] recommendation engines
- Our methods handle non-compliance and non-normality, and permit uncertainty quantification
- The methods are designed to produce interpretable results, but implementation would still be complicated
- Our methods would surely benefit from causal analysis
- Work supported by NIH, NSF, and the Gates Foundation

[1] In revision, JASA A&CS
[2] Submitted to Biometrics