
ABSTRACT

ZHANG, JINGYU. Partially Observable Markov Decision Processes for Prostate Cancer Screening. (Under the direction of Dr. Brian T. Denton.)

Prostate cancer is the most common solid tumor affecting American men. Screening is carried out using prostate-specific antigen (PSA) tests and biopsies. This dissertation investigates the optimal design of screening policies that trade off the cost and harm of screening to patients against the benefits of early detection. We report on partially observable Markov decision processes (POMDPs) for the study of prostate cancer screening decisions. A Markov process represents the occurrence and progression of prostate cancer in our models. The core states are the patient's prostate cancer-related health states; PSA test results and biopsy results are the observations. First, a POMDP model is proposed for prostate biopsy referral decisions assuming the patient undergoes annual PSA screening. The objective is to maximize expected quality-adjusted life years (QALYs). Several structural properties that give insight into the optimal biopsy referral policy over the course of a patient's lifetime are proved. An age-specific prostate biopsy referral policy is obtained, and sensitivity analysis is used to show how the optimal policy and value depend on the model parameters. Next, a POMDP model is proposed for optimizing both PSA screening and biopsy referral decisions. We use this model to compute the optimal policy for PSA testing or biopsy at each decision epoch over the course of a patient's lifetime. The objective of the model is to maximize the difference between rewards for QALYs and the costs of screening, biopsy, and treatment. The optimal policy is compared to no screening and the traditional guideline from the published literature.

Benefits of screening are shown in terms of the expected QALYs and costs, and sensitivity analysis is performed with respect to cost parameters. Finally, a multi-stage POMDP is proposed to coordinate prostate cancer screening and treatment decisions. Multiple treatment options, including active surveillance and radical prostatectomy, are considered in the model. The model is extended to include additional actions, core states, and observations at each decision epoch. A new sampling-based approximation method is developed to solve the extended POMDP model. Structural properties of the model are discussed, and a method that exploits the underlying structure is incorporated into the approximation method. Computational experiments comparing the new approximation method to previously proposed methods are presented to show its effectiveness and efficiency. Empirical results for the optimal screening and treatment policy are presented. Sensitivity analysis is used to show how the availability of active surveillance (AS) influences the optimal screening policy and the expected QALYs.

Partially Observable Markov Decision Processes for Prostate Cancer Screening

by
Jingyu Zhang

A dissertation submitted to the Graduate Faculty of North Carolina State University in partial fulfillment of the requirements for the Degree of Doctor of Philosophy

Operations Research

Raleigh, North Carolina
2011

APPROVED BY:
Shu-Cherng Fang
Julie S. Ivy
Thom J. Hodgson
Brian T. Denton (Chair of Advisory Committee)

DEDICATION

To my family.

BIOGRAPHY

Jingyu Zhang was born in Leshan, Sichuan Province, China. After finishing junior middle school in Leshan, Sichuan, he attended the Experimental Class in Sciences, sponsored by the Chinese Department of Education, at the High School Attached to Tsinghua University in Beijing, China. He received his Bachelor of Science degree in Mathematics and Physics from the Fundamental Science Class, School of Sciences, Tsinghua University, Beijing, China, and his Master's in Operations Research from North Carolina State University. After his Ph.D. final defense, he joined Philips Research North America as a Member Research Staff in April 2011.

ACKNOWLEDGMENTS

This thesis would not have been possible without the invaluable support and guidance of my advisor, Dr. Brian T. Denton, who always believed in me and encouraged me to succeed during this challenging process. I am very thankful to him for his endless support. I would also like to acknowledge support for this research, which was funded in part by grant CMMI from the National Science Foundation. I would like to thank my committee members, Dr. Thom J. Hodgson, Dr. Shu-Cherng Fang, and Dr. Julie S. Ivy, for agreeing to be on my committee and providing feedback on my work. I would like to thank my collaborators, Dr. Brant A. Inman, Dr. Hari Balasubramanian, and Dr. Nilay D. Shah, for their efforts and suggestions. I would also like to thank Daniel Underwood for his technical assistance. I thank my father, my mother, and all my family and friends for their unconditional love and support. Finally, I thank my girlfriend, Chuan Tian, for everything she did for me and for her endless love.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

1 Introduction

2 A Literature Review
   Introduction
   General POMDP Model
   POMDP Applications in Medical Decision Making
   Structural Properties of POMDPs
   Computational Methods
      Exact Algorithms
      Approximation Algorithms
   Contributions to the Literature

3 Optimization of Prostate Biopsy Referral Decisions
   Introduction
   Prostate Cancer Background
   Literature Review
   POMDP Model
   Transition Probability Matrices and Reward Vectors
   POMDP Structural Properties
   Results
   Data Description
   Estimating Parameters
   Computational Experiments and Sensitivity Analysis
   Benefits of Prostate Cancer Screening
   Discussion
   Conclusions

4 Optimization of PSA Screening Decisions
   Introduction
   POMDP Model
   Transition Probability Matrices and Reward Vectors
   Optimality Equations
   Results
   Optimal Screening Policies
   Benefits of Prostate Cancer Screening
   Sensitivity Analysis
   Discussion
   Conclusions

5 Optimal Coordination of Prostate Cancer Screening and Treatment
   Introduction
   Model Formulation
   Transition Probability Matrices and Reward Vectors
   Bayesian Updating
   Optimality Equations
   Methodology
   Upper and Lower Bounds
   Sampling-Based Approximation Method
   Core State Reduction
   Results
   Model Parameter Estimation
   Computational Experiments
   Optimal Screening and Treatment Policy
   Discussion
   Conclusions

6 Conclusions

Bibliography

Appendices
   Appendix A

LIST OF TABLES

Table 3.1  Detailed description of model parameters defining transition probabilities and rewards for the core state process.
Table 3.2  The age-specific values of the prostate cancer incidence rate, w_t.
Table 3.3  Parameters, their sources, and specific values used in our base-case analysis.
Table 3.4  The bounds on w_t derived from [1] for one-way sensitivity analysis.
Table 3.5  Sensitivity analysis of expected QALYs for a 40-year-old patient assuming π_40(C) = 0, comparing the optimal policy to the case of no screening. Base-case values are shown in bold.
Table 4.1  Detailed description of model parameters defining transition probabilities and rewards for the core state process.
Table 4.2  Sensitivity analysis of the expected benefits of PSA screening for 40-year-old healthy men for different ϵ and µ. The optimal policy is compared to the case of no screening and to the case of annual PSA tests with a 4.0 ng/ml threshold for biopsy, from both the patient and societal perspectives. The latter perspective is based on a societal willingness to pay of β = 50,000. Since the expected QALYs of the traditional guideline are obtained by simulation, 95% confidence intervals are presented in parentheses. The base case is shown in bold.
Table 4.3  Sensitivity analysis of the total costs of the optimal policy and the traditional guideline for 40-year-old healthy men from the societal perspective for varying β and costs. Other parameters, including ϵ and µ, are set according to the base case.
Table 5.1  Detailed description of model parameters defining transition probabilities and rewards for the core state process.
Table 5.2  Detailed description of model parameters defining transition probabilities and rewards for the core state process.
Table 5.3  Comparison of the approximation methods to IP. LBA denotes the lower bound algorithm, UBA the upper bound algorithm, LBACR the lower bound algorithm with core state reduction, and UBACR the upper bound algorithm with core state reduction.
Table 5.4  The performance of our sampling-based approximation method under different budget constraints k = 11, 12, 13, 30, 300, and 3000 for a given c. LBA denotes the lower bound algorithm, UBA the upper bound algorithm, LBACR the lower bound algorithm with core state reduction, and UBACR the upper bound algorithm with core state reduction.
Table 5.5  The performance of our sampling-based approximation method under different budget constraints c = 6, 7, 8, 9, 10, and 15, with k = 30. LBA denotes the lower bound algorithm, UBA the upper bound algorithm, LBACR the lower bound algorithm with core state reduction, and UBACR the upper bound algorithm with core state reduction.
Table 5.6  The performance of our sampling-based approximation method under different randomized policies, Pr(B) = 0.1, 0.5, and 0.9. LBA denotes the lower bound algorithm, UBA the upper bound algorithm, LBACR the lower bound algorithm with core state reduction, and UBACR the upper bound algorithm with core state reduction.
Table 5.7  The performance of our sampling-based approximation methods, compared with no screening and with the optimal policy assuming RP upon prostate cancer detection, in terms of the expected QALYs for people with prior belief π_40(NC) = 1. OPRP denotes the optimal policy assuming RP upon detection, IP the IP algorithm, LBA the lower bound algorithm, UBA the upper bound algorithm, LBACR the lower bound algorithm with core state reduction, and UBACR the upper bound algorithm with core state reduction.
Table 5.8  Sensitivity analysis of the expected benefits of PSA screening for 40-year-old healthy men for different ϵ and µ. The optimal screening and treatment policy is compared to the case of RP immediately upon detection. Base-case values are in bold.
Table 5.9  Parameters, their sources, and specific values used in our sensitivity analysis.

LIST OF FIGURES

Figure 3.1  An ROC curve illustrating the imperfect nature of PSA tests for diagnosing prostate cancer. The different points on the curve correspond to different PSA thresholds used to distinguish a suspicious from a likely benign test. The curve was generated using the dataset described in the Data Description section of Chapter 3.
Figure 3.2  Illustration of the typical stages of prostate cancer screening and treatment, including PSA screening, biopsy, and treatment.
Figure 3.3  POMDP model simplification: aggregating the three non-metastatic prostate cancer stages after detection into a single core state T. Solid lines denote transitions related to prostate cancer; dotted lines denote the action of biopsy and subsequent treatment; dashed lines in (c) denote death from other causes.
Figure 3.4  Optimal biopsy referral policy. The solid line denotes the optimal threshold for the base case.
Figure 3.5  One-way sensitivity analysis for parameters: w_t, d_t, b_t, e_t, z_t, f, µ, ϵ, γ, and λ. Solid lines denote the base-case policy and dashed lines denote the bounds.
Figure 3.6  One-way sensitivity analysis on optimal values for model parameters: d_t, w_t, ϵ, z_t, µ, γ, e_t, f, and b_t.
Figure 4.1  Illustration of the decisions and outcomes associated with the prostate cancer screening problem. The dashed rectangle denotes the prostate biopsy decision problem solved in Chapter 3, which is a subproblem of the screening problem we consider in this chapter.
Figure 4.2  Transitions among health states of the prostate cancer Markov model. Note that death from other causes is possible from any health state in our model but is omitted for simplicity.
Figure 4.3  Optimal prostate cancer screening policies from the patient and societal perspectives. Lines denote thresholds for PSA testing and biopsy. If the patient's age and probability of having prostate cancer fall in area B, the patient is referred for biopsy; DB means defer the biopsy referral until obtaining the PSA test result at the next decision epoch; DP means defer both the biopsy referral and the PSA test at the next decision epoch.
Figure 5.1  Recurring screening and treatment decision process for prostate cancer at decision epochs t, t + 1, .... π_t^(i) denotes the patient's belief state (the probability of being in each health state) at decision stage i; PSA/no-PSA denotes whether to have a PSA test, B/no-B whether to have a biopsy, and RP/no-RP whether to perform RP. Death is possible but not shown in this figure.
Figure 5.2  Markovian transitions among the prostate cancer states. Partially observable states are in the dotted box, completely observable states are in the solid box, and the triangle denotes death from prostate cancer; transitions due to RP are represented by dashed lines and other transitions by solid lines; death from other causes is possible from all states in the model but is not shown in this figure.
Figure 5.3  Illustration of the bounds of the approximation algorithm for a two-core-state POMDP, in which a scalar fully represents the belief state. (a) illustrates the true value function represented by a minimal α-vector set; (b) illustrates a lower bound on the value function formed by an outer linearization represented by a subset of the minimal α-vector set; and (c) illustrates an upper bound on the value function formed by an inner linearization represented by a set of sampled belief points and their values (in red).
Figure 5.4  Illustration of core state reduction and how it improves the sampling efficiency of the sampling-based approximation method, using a two-dimensional example with k = 7. (a) shows how the sampling-based approximation method samples. If the subspace π(s_1) = 0 (shown as a solid line in (b) and (c)) can be pre-solved, it is no longer necessary to sample belief points in the pre-solved subspace. Therefore, a higher sampling density in the region π(s_1) > 0 can be achieved in (c) than in (b) under the same budget constraint, k = 7.
Figure 5.5  One-way sensitivity analysis on optimal values for model parameters: d_t, w_t, µ, ϵ, γ, z_t, g_t, and f.

Chapter 1

Introduction

Population screening programs are critical to the early detection of chronic diseases. For many chronic diseases, such as cancer, early detection can add years or even decades to an individual's lifetime. Early detection can also reduce costs to the health system by avoiding the high costs associated with late stages of disease. Until recently, many life-threatening diseases were detected only when late-stage symptoms manifested themselves. Recent discoveries of biomarkers for certain diseases have enabled the development of screening programs with the goal of early detection and treatment. Unfortunately, most biomarkers are imperfect and can produce false positive or false negative outcomes. Therefore, a patient's true health status is often not known with certainty. This presents difficult decisions for physicians and patients who must decide whether to proceed with more invasive and expensive testing. A partially observable Markov decision process (POMDP) is a sequential decision-making model that explicitly considers uncertainty about the state of a system. POMDPs have been applied widely in many contexts during the last 30 years, including machine maintenance and repair, educational applications, estimating the location of a moving object, and sensor networks. POMDPs are also very well suited to the study of medical decision making in the context of diagnostic tests that provide imperfect information about a patient's true health state.

The focus of this thesis is the investigation of new POMDP models and solution methods for the optimization of prostate cancer screening decisions. This application has potentially important societal impacts, since prostate cancer is the most common solid tumor in American men and the best screening policy for prostate cancer is highly debated [2, 3]. This dissertation is structured as follows. First, a literature review of POMDP models, theoretical properties, algorithms, and their applications is provided in Chapter 2. This literature review also summarizes some of the most important applications of MDPs and POMDPs in the medical decision-making context. In Chapter 3, a POMDP model is proposed for prostate biopsy referral decisions assuming a patient undergoes annual PSA screening. The objective is to maximize expected quality-adjusted life years (QALYs). Several structural properties are proved, including the existence of a control-limit type policy for the biopsy referral decision and the condition under which screening should be discontinued. These structural properties give insight into the optimal biopsy referral policy over the course of a patient's lifetime. An age-specific prostate biopsy referral policy is obtained. Sensitivity analysis is used to evaluate how the optimal policy and expected QALYs are affected by changes to parameters in the model. In Chapter 4, a POMDP model is proposed for simultaneously making PSA screening and biopsy referral decisions. We use this model to investigate optimal policies for whether and when to have a PSA test or biopsy over the course of a patient's lifetime. The objective of the model is to maximize the difference between the reward for QALYs and the cost of screening, biopsy, and treatment. The optimal policy is compared to the case of no screening and to the traditional PSA screening guideline from the published medical literature. The value of screening is measured in terms of the expected QALYs and cost.

Sensitivity analysis is performed with respect to several model inputs, including cost parameters. In Chapter 5, a multi-stage POMDP is proposed for coordinating screening and treatment decisions in the presence of multiple treatment options, including active surveillance and radical prostatectomy. The model extends the models of the previous chapters to include multiple stages of actions at each epoch (PSA testing, biopsy referral, and treatment), additional core states (cancer grades), and additional observations. Due to the large scale of the resulting model, a new sampling-based approximation method with computational budget constraints is developed to solve the extended POMDP model. A core state reduction approach suited to the particular structure of our POMDP is also used to further improve the efficiency of the sampling-based approximation method. We use computational experiments to measure the effectiveness and efficiency of our new method compared to an existing method, and demonstrate that our method obtains a high-quality solution in reasonable time. Empirical results for the optimal screening and treatment policy are presented. Sensitivity analysis shows how the availability of active surveillance (AS) influences the optimal screening policy and the expected QALYs. In Chapter 6, we summarize the most significant findings from Chapters 3, 4, and 5. We also discuss some of the limitations of our POMDP models from the perspective of prostate cancer screening. Finally, we discuss opportunities for future research.

Chapter 2

A Literature Review

2.1 Introduction

A Markov decision process (MDP) defines a sequential decision process in which decisions must be made without perfect knowledge of the future. MDPs are defined by states (the status of the process), actions (interventions that determine the evolution of the process), and rewards (the outcomes associated with states and actions). The states of a Markov process satisfy the Markov property, which means future states and rewards depend only on the current state and action and are independent of the history of states and actions. For discrete-time, discrete-state MDPs, the process is described by transition matrices that define the probabilities of transitions between states during decision epochs of some defined duration. Finally, a reward vector associates rewards with the different states and actions of the system throughout the decision horizon. A partially observable Markov decision process (POMDP) is a generalization of an MDP in which the states are not completely observable. The unobservable states are called core states, and they satisfy the Markov property. In this sequential decision process, the decision maker does not know exactly which core state the process is in at each decision epoch; however, the probability of being in each core state can be inferred from observations of the system.
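To make these ingredients concrete, the following minimal sketch encodes a discrete-time, finite-horizon MDP and solves it by backward induction. All numbers and names are illustrative placeholders, not values from this dissertation.

```python
import numpy as np

# Hypothetical 3-state, 2-action MDP; every number below is made up.
n_states, n_actions, N = 3, 2, 10
lam = 0.97  # discount factor

# P[a, s, s'] = probability of moving from state s to s' under action a.
P = np.array([
    [[0.90, 0.10, 0.00], [0.00, 0.80, 0.20], [0.00, 0.00, 1.00]],
    [[0.95, 0.05, 0.00], [0.00, 0.90, 0.10], [0.00, 0.00, 1.00]],
])
# r[a, s] = one-period reward for taking action a in state s.
r = np.array([[1.0, 0.5, 0.0], [0.9, 0.4, 0.0]])

# Backward induction: by the Markov property, the value at epoch t depends
# only on the current state, so one backup per epoch suffices.
v = np.zeros(n_states)  # terminal values
for t in reversed(range(N)):
    q = r + lam * P @ v          # q[a, s]: value of taking a in s at epoch t
    policy_t = q.argmax(axis=0)  # optimal action for each state at epoch t
    v = q.max(axis=0)            # optimal value function at epoch t
```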

While MDPs are defined by a transition probability matrix and a reward vector, POMDPs additionally require a definition of observable states and an information matrix comprising the conditional probabilities of the observations given that the process is in each of the underlying core states. Furthermore, the actions in a POMDP are defined on the belief state, which is a vector of probabilities of being in each of the core states. POMDPs are particularly attractive for medical decisions in which a patient's true health status is not directly observable, such as the presence of prostate cancer. In such situations, physicians rely on test results that provide estimates of the probability that a patient is in a certain health state, and a POMDP describes the decision-making process more accurately than an MDP. The importance of formulating medical and healthcare decision problems in the POMDP framework was first suggested by Smallwood et al. [4]. However, due to the computational effort required to solve large POMDPs, medical decision making did not become a serious area of POMDP research until the late 1990s [5]. A detailed review of POMDP applications in medical decision making is provided in Section 2.3. There are a number of literature reviews about POMDPs. Monahan [6], Lovejoy [7], White [8], Kaelbling et al. [9], Cassandra [10], and Littman [11] all provide extensive reviews focusing on theoretical properties and solution methodologies. Cassandra [12] provides a detailed review of POMDP applications. The remainder of this chapter differs from the above-referenced reviews in the following respects. First, we focus on medical decision making and healthcare applications of POMDPs. Second, this chapter is more recent than previous reviews, capturing the latest literature on POMDPs, including recent theoretical structural properties, computational methods, and applications.

This chapter is structured as follows. Section 2.2 provides a mathematical description of a general POMDP model and defines notation used throughout this thesis. Section 2.3 reviews POMDP applications in the medical decision-making context. Section 2.4 provides some important definitions and reviews classic structural properties of optimal policies for POMDPs. Section 2.5 summarizes exact and approximation algorithms that have been proposed for solving POMDPs. Finally, Section 2.6 concludes with a description of open opportunities and challenges for research on POMDPs.

2.2 General POMDP Model

A POMDP is an MDP with a core process satisfying the Markov property, whose core states are partially observable through observations of a message process. Core states define the true state of the system at decision epoch t and are denoted by s_t ∈ S. The core process is a Markov process on the core states with transition probabilities p_t(s_{t+1} | s_t, a_t); P(a_t) denotes the corresponding transition probability matrix, where a_t ∈ A is the action in decision epoch t. At each decision epoch, the decision maker makes an observation of the message space. The core states are inferred from the observations through the conditional probabilities q_t(l_t | s_t) (Q_t denotes the corresponding matrix, called the information matrix), where l_t ∈ M denotes an observation of the message process. Bayesian updating is used to combine the observation collected at each decision epoch with the prior belief to define the current belief state. π_t(s) ∈ [0, 1] denotes the probability (belief) of being in core state s at decision epoch t. We let π_t = {π_t(1), π_t(2), ..., π_t(|S|)} denote the corresponding vector of beliefs for all s_t ∈ S.

The POMDP, defined on the finite core state set S with finite action set A and finite observation set M, can be transformed into a continuous, completely observable MDP defined on a continuous |S|-dimensional probability space π_t ∈ Π, where Π = [0, 1]^|S| is the belief state space. After this transformation, the reward defined on the belief state,

   r_t(π_t, a_t(π_t)) = Σ_{s_t ∈ S} π_t(s_t) r_t(s_t, a_t(π_t)),

is the expected reward over the core states in epoch t. The continuous belief state transition from π_t to π_{t+1} is defined by the Bayesian updating process

   π_{t+1}(s_{t+1}) = [ q_{t+1}(l_{t+1} | s_{t+1}) Σ_{s_t ∈ S} p_t(s_{t+1} | s_t, a_t(π_t)) π_t(s_t) ] / [ Σ_{s_{t+1} ∈ S} q_{t+1}(l_{t+1} | s_{t+1}) Σ_{s_t ∈ S} p_t(s_{t+1} | s_t, a_t(π_t)) π_t(s_t) ].   (2.1)

Based on these definitions, the optimality equations of the continuous-state MDP can be written as

   v_t(π_t) = max_{a_t(π_t) ∈ A} { r_t(π_t, a_t(π_t)) + λ Σ_{l_{t+1} ∈ M} v_{t+1}(π_{t+1}) p_t(l_{t+1} | π_t, a_t(π_t)) },  π_t ∈ Π,   (2.2)

where

   p_t(l_{t+1} | π_t, a_t(π_t)) = Σ_{s_{t+1} ∈ S} q_{t+1}(l_{t+1} | s_{t+1}) Σ_{s_t ∈ S} p_t(s_{t+1} | s_t, a_t(s_t)) π_t(s_t),   (2.3)

and λ is the discount factor. Decision epochs increase to infinity for infinite-horizon POMDPs [13]. Finite-horizon POMDPs, on the other hand, have a finite terminal decision epoch, N, in which the value function v_N(π_N) depends only on the terminal reward, r_N(s_N, a_N(π_N)), as follows:

   v_N(π_N) = Σ_{s ∈ S} π_N(s) r_N(s, W),  π_N ∈ Π.
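As a concrete illustration of these equations, the sketch below implements the Bayesian update (2.1), the observation probability (2.3), and a one-step backup of (2.2) for a single belief vector. It is a minimal sketch with illustrative names (P, Q_next, pi, v_next); it is not code from this dissertation, and v_next stands in for whatever representation of v_{t+1} an algorithm maintains.

```python
import numpy as np

def obs_prob(pi, P_a, Q_next, obs):
    """p_t(l_{t+1} | pi_t, a_t), the denominator of (2.1); see (2.3).
    pi: belief over |S| core states; P_a: |S|x|S| transition matrix for
    action a; Q_next: |S|x|M| information matrix; obs: observation index."""
    return float((pi @ P_a) @ Q_next[:, obs])

def belief_update(pi, P_a, Q_next, obs):
    """Return pi_{t+1} from pi_t, P(a_t), Q_{t+1}, and l_{t+1}, as in (2.1)."""
    pred = pi @ P_a                 # sum_s p(s'|s,a) pi(s), for each s'
    joint = Q_next[:, obs] * pred   # numerator of (2.1), for each s'
    return joint / joint.sum()      # normalize by p_t(l_{t+1} | pi_t, a_t)

def backup(pi, P, r_t, Q_next, v_next, lam):
    """One-step backup of (2.2) at belief pi. P[a] and r_t[a] are per-action
    arrays; v_next maps a belief to its epoch-(t+1) value; lam = discount."""
    best = -np.inf
    for a in range(len(P)):
        val = float(pi @ r_t[a])    # expected immediate reward r_t(pi, a)
        for obs in range(Q_next.shape[1]):
            p_l = obs_prob(pi, P[a], Q_next, obs)
            if p_l > 0.0:
                val += lam * p_l * v_next(belief_update(pi, P[a], Q_next, obs))
        best = max(best, val)
    return best
```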

POMDPs are often more difficult to solve than MDPs because they are defined on a continuous belief state. Furthermore, the number of possible policies increases super-exponentially with the decision horizon. For instance, a policy tree for a finite-horizon POMDP with horizon length N contains

   Σ_{t=0}^{N-1} |M|^t = (|M|^N − 1) / (|M| − 1)

possible observation nodes. At each observation node, |A| actions can be chosen, which makes the total number of possible policies |A|^[(|M|^N − 1)/(|M| − 1)] [14]. For example, with only |M| = 2 observations, |A| = 2 actions, and horizon N = 4, a policy tree has 15 observation nodes and there are 2^15 = 32,768 possible policies. As a result of the computational challenges of solving POMDPs, there has been a significant amount of research on solution methods. We review several of the proposed methods in Section 2.5.

2.3 POMDP Applications in Medical Decision Making

POMDPs have been successfully applied in many industrial application areas. Machine maintenance and replacement [15, 16] and education [17] were among the first areas of application. Other industrial applications include structural inspection [18], elevator control policies [19], fisheries [20], and autonomous robot navigation [21]. However, only a small number of studies consider POMDP applications in health care and medical decision making. Although POMDPs have not been widely applied, there are many applications of MDPs in medical decision making (see Schaefer et al. [22] for a comprehensive review of MDPs in the context of medical decision making). For instance, Alagoz et al. [23] studied the living-donor liver transplantation timing problem. A stationary infinite-horizon MDP is used to obtain the optimal time of liver transplant. They used the model for end-stage liver disease (MELD) score to define the health state of the patient.

There are two actions, wait and transplant, and the objective is to maximize the expected quality-adjusted lifespan of the patient, where lifespan is composed of pre- and post-transplant portions. Structural properties, such as the existence of a control-limit policy, were proved under specific assumptions about the rewards and transition probabilities. The optimal transplant strategy was reported for different disease groups of patients. Denton et al. and Kurt et al. [24, 25] studied the optimal start time of statin therapy for patients with type 2 diabetes. Total cholesterol and high-density lipoprotein (HDL) were used to define a finite set of health states. The authors considered the objectives of maximizing expected QALYs from the patient perspective and maximizing the weighted difference between rewards for QALYs and the costs of treatment. Additional applications of MDPs to medical decision making are reviewed in Schaefer et al. [22]. Existing POMDP applications in health care and medical decision making are much fewer than MDP applications. However, since the true health or disease states are usually unknown or difficult to know, POMDP applications have recently become more common. The first POMDP application to medical decision making was proposed by Smallwood et al. [4]. The authors define the core states as the patients' disease status and provide an information state diagram to visualize the belief states. However, their idea was quite general: they did not formulate a POMDP model for a specific medical decision-making problem. Hu et al. [26] formulated an optimal drug infusion problem with uncertain pharmacokinetics as a POMDP. They use their POMDP to choose a drug infusion regimen that keeps the concentration of the drug in the patient's blood plasma at a predetermined level. The state space comprises finite intervals of the volume, clearance, and current drug concentration. However, they did not solve this POMDP model to optimality.

Instead, they defined some easy-to-implement drug infusion policies and examined and compared their performance in simulations. Hauskrecht et al. [5, 27] applied a POMDP formulation to the problem of treating patients with ischemic heart disease. This appears to be the first example of solving a real POMDP in the context of medical decision making. The core states are based on the health status of a patient, including death as an observable absorbing state. Observations of the message process are the test results (e.g., ischemia level, catheter coronary artery result, and stress test result) and the history of surgical procedures; actions include treatment actions (wait, medication, angioplasty, and coronary artery bypass graft surgery) and investigative actions (stress test and angiogram investigation) to collect information relevant to the core state of the patient. They acquired the parameters for transition probabilities and rewards from the medical literature or inferred them from data available at a particular medical center. Bounds on the optimal policies were obtained by proposed approximation algorithms. One of the proposed algorithms, the fast informed bound method, selects the best linear function for every observation and every current state separately. Another, the incremental linear function approach, gradually improves the convex and piecewise linear lower bound of a finite fixed-grid approximation method. Due to the complexity of the model, the gap between the upper and lower bounds was significant in some of their numerical experiments. Peek [28] formulated time-critical management problems in medicine as a POMDP model. The author discussed the clinical treatment of children with a ventricular septal defect as an example of a time-critical clinical management problem. Although the author provided detailed descriptions of the decision horizon, states, actions, transition probabilities, observations, and rewards, he neither provided values of the parameters nor solved the POMDP model for this problem.

Tusch [29] modeled the optimal therapy plan for liver transplantation using a POMDP. His goal was to find an optimal clinical management strategy based on a risk assessment of patients. The core states were risk and non-risk; the actions included therapeutic actions, such as surgery, and test actions. There were a total of 24 possible clinical tests, grouped into three scores, resulting in three observations in the restricted model. The problem was reduced to a three-decision-epoch constrained POMDP. The author used artificial neural networks to estimate the probabilities in the information matrix. By considering the POMDP as a classification procedure, the proposed constrained POMDP formulation was transformed into a non-linear optimization problem solved using robust partial classification methods such as artificial neural networks and linear discriminant analysis. Kreke [30] used a finite-horizon POMDP model to answer the question of when to test for cytokine levels (a predictor of sepsis patient survival) using potentially costly and inaccurate hospital testing procedures. The decision horizon was the time from a patient's admission to the hospital to discharge. The objective was to maximize the patient's expected survival time. Unobservable core states defined the patient's health status. Observations were the patient's measured cytokine levels, which are subject to error and may not reflect the patient's true cytokine level. Actions comprised discharging the patient from the hospital without testing, ordering a cytokine test, and keeping the patient in the hospital for one more decision epoch. The cost of ordering a cytokine test was converted into patient life days using cost-effectiveness analysis and then incorporated into the rewards. A finite fixed-grid method was employed to transform the POMDP into an MDP to solve. Although control-limit type policies were observed in empirical results, the author neither proved nor provided conditions for the existence of a control-limit type policy for the proposed POMDP model.

Sensitivity analysis was performed for test accuracy and cost. Fard et al. [31] investigated the comparative effectiveness of different treatments provided sequentially to patients suffering from depression using a POMDP, although the focus of the paper was to propose a method for estimating the bias and variance of the value function. The unobservable states are the levels of depression, and the observations are a numerical score called the quick inventory of depressive symptomatology, which roughly indicates the level of depression. They used this medical application as an example to evaluate the precision of the proposed method. They compared policies with different choices of medications and gave intervals for different policies. Goulionis et al. [32] employed POMDP models in Parkinson's disease treatment optimization. The core states are three levels of a patient's Parkinson's disease status. The observations are the characteristics obtained from clinical examinations. The actions are medical treatment, with incomplete monitoring, and surgical treatment. Their goal was to find the belief threshold for surgery that minimizes an objective combining QALYs and monetary values. They used the policy iteration algorithm from [13] to solve the resulting infinite-horizon stationary POMDP. Their core states were the Parkinson's disease degrees, and Bayesian updating was based on various diagnostic observations. Optimal average-cost policies for patients with Parkinson's disease with three deterioration levels were obtained based on clinical data from Athens, Greece. Ivy [33] formulated a POMDP model for breast cancer decisions and treatment. She considered both the third-party payer's and the patient's perspectives. The payer's perspective was to minimize the cost associated with monitoring and treating breast cancer, and the patient's perspective was to maximize the expected discounted total utility (QALYs).

In her model, cancer states were the partially observable core states, and actions were screening and treatment options. Results of clinical breast exams and mammograms are the observations. An algorithm that sequentially selects the policy for the constrained POMDP was used to obtain the optimal policy and construct trade-off curves between cost and utility. Maillart et al. [34] used a partially observable Markov chain to study breast cancer screening policies based on mammography. They evaluated age-dependent screening policies and studied the trade-off between the lifetime mortality risk of breast cancer and the expected number of mammograms. They generated the efficient frontier for the evaluated policies, measured by lifetime mortality risk and expected mammogram count, and demonstrated the robustness of the resulting frontier. Chhatwal et al. [35] studied a breast cancer biopsy optimization problem based on mammography observations. In their MDP model, they use a set of discretized probabilities of breast cancer as the states, estimating the probabilities with a mammography Bayesian network. They also proposed a POMDP model that assumes the mammography Bayesian network is not perfect. They conclude that their POMDP model does not perform as well as the MDP model because they lack good estimates of the core-state transition and observation probabilities.

2.4 Structural Properties of POMDPs

Some POMDPs have optimal policies that exhibit special structure. For instance, the existence of a control-limit type policy means there exist hyperplanes separating the decision space (the belief state space in a POMDP) into regions within which different actions are optimal.

To develop a rigorous theoretical description of these properties, we first provide definitions of some common terms used in the literature.

Definition 2.1. An n × n matrix A is totally positive of order k, denoted TP_k, if A_p is nonnegative for all p = 1, ..., k, where A_p is the pth compound matrix of A, defined as the C(n, p)-square matrix of the p × p minors of A.

Definition 2.2. An m × n matrix A has the increasing failure rate (IFR) property if and only if

   Σ_{j=k}^{n} A_{ij} ≤ Σ_{j=k}^{n} A_{i'j},  for all i < i' ∈ {1, ..., m} and k ∈ {1, ..., n}.

Definition 2.3. Stochastic dominance (first order): the mass function p is stochastically less than or equal to the mass function q, denoted p ≤_s q, if

   Σ_{k=m}^{N} p(k) ≤ Σ_{k=m}^{N} q(k),  for all m, 0 ≤ m ≤ N.

Definition 2.4. The mass function p is less than or equal to the mass function q in the sense of monotone likelihood ratio (MLR), denoted p ≤_r q, if q(k)/p(k) is a nondecreasing function of k (excluding k such that p(k) = q(k) = 0).

Definition 2.5. Blackwell ordering: let X and Y be standard Borel spaces. Given two transition probabilities P and Q from X to Y, we say that P is less informative than Q (P ≤_B Q) if there exists a transition probability K from Y to Y such that

   P(x; C) = ∫ Q(x; dy) K(y; C),  for all x ∈ X and all measurable C ⊆ Y.

Structural properties of POMDPs have been investigated for more than thirty years. Sondik and Smallwood [36, 37] showed that the optimal value function of a maximization problem is piecewise linear and convex in the belief state for any given decision horizon.
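The orderings above are straightforward to test numerically. The sketch below, with illustrative function names and under the stated simplifying assumptions, checks the IFR property (Definition 2.2) and the MLR ordering (Definition 2.4).

```python
import numpy as np

def is_ifr(A, tol=1e-12):
    """Definition 2.2: the tail sums sum_{j>=k} A[i, j] are nondecreasing
    in the row index i for every column k."""
    tails = np.cumsum(A[:, ::-1], axis=1)[:, ::-1]  # tails[i, k] = sum_{j>=k} A[i, j]
    return bool(np.all(np.diff(tails, axis=0) >= -tol))

def mlr_leq(p, q, tol=1e-12):
    """Definition 2.4: p <=_r q iff q(k)/p(k) is nondecreasing in k, skipping
    indices with p(k) = q(k) = 0. For simplicity this sketch assumes
    p(k) > 0 wherever q(k) > 0."""
    keep = ~((np.abs(p) < tol) & (np.abs(q) < tol))
    ratios = q[keep] / p[keep]
    return bool(np.all(np.diff(ratios) >= -tol))

# Hypothetical stochastic matrix whose rows shift mass rightward, so it is IFR.
A = np.array([[0.6, 0.3, 0.1],
              [0.3, 0.4, 0.3],
              [0.1, 0.3, 0.6]])
assert is_ifr(A)
```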

This is the basis of many proposed POMDP algorithms. For infinite-horizon POMDPs, Sondik [13] showed that convexity still holds; however, piecewise linearity may not. Instead, the optimal value function can be approximated arbitrarily closely by a piecewise linear and convex function. Albright [38] gave conditions under which two-core-state, two-action POMDPs have monotone value functions and control-limit type policies. The author proposed two types of models: one obtains an observation after each state transition, and the other first has a state transition followed by an observation. Both models have similar structural properties. In order to derive monotonicity results, the author first showed that an m × n matrix is TP_2 if and only if the matrix has the IFR property and n = 2. The author also provided sufficient conditions for π_{t+1} to be isotone in π_t, a_t, and l_{t+1}, and established monotonicity of the value function given that the information matrix, Q, and the core state transition probability matrix, P, are both TP_2 and that there exists an ordering of states under which the reward function is nondecreasing. Monotonicity of the optimal policy at each decision epoch also requires the reward function to be superadditive or subadditive. White [39] provided conditions for the existence of optimal control-limit type policies for the special cases of completely observed and completely unobserved POMDPs, the two extreme cases of the general POMDP. There are two main contributions of this paper. First, it demonstrates that the sufficient conditions for the existence of monotone optimal control laws for general POMDPs are restrictive and difficult to verify; second, it emphasizes the potential usefulness of the two extreme cases in determining bounds on the optimal solution of a POMDP. Lovejoy [40] presents weaker conditions for monotonicity of policies for more general POMDPs.

The author showed that the optimal value function of a discrete-time, finite-core-state POMDP is monotone on the space of belief vectors ordered by likelihood ratios. He required the state probability (information) vectors to be MLR-ordered and used a machine replacement example to illustrate the conditions. Rieder [41] proposed conditions for monotonicity of the value function and optimal policy based on the TP_2 and Blackwell stochastic orderings. The author also proposed a more general POMDP formulation than those in [40] and other earlier research, showed how the value functions depend on the observations using Blackwell ordering, and presented conditions for a lower bound on the optimal policy. The results carry over from the finite horizon to the discounted infinite-horizon case. These results extend and complete the investigations of Albright [38], White [39], and Lovejoy [40], and they can be used to derive further structural properties of optimal policies for some special types of partially observed control models, such as Bayesian control models. Recently, Grosfeld-Nir [42] proved that dominance in expectation, which is weaker than stochastic dominance, suffices for the optimal policy to be of control-limit type in two-core-state, two-action problems. This can be regarded as an extension of Albright [38].

2.5 Computational Methods

Computational methods for POMDPs were first discussed in the 1970s [36]. Since then, many exact and approximation algorithms have been proposed and developed in the operations research, computer science, and artificial intelligence communities. In this section, we review exact and approximation algorithms. The reader is referred to [9] for a recent and detailed review of algorithmic methods for POMDPs.

2.5.1 Exact Algorithms

Sondik and Smallwood [36, 37] first proved that the finite-horizon POMDP value function is piecewise linear and convex at each decision epoch for a maximization problem. Each given sequence of actions and observations results in a specific vector (hyperplane) in the belief space, commonly referred to as an α-vector. The set of all vectors corresponding to all policies is called the α-vector set. The convex hull representing the optimal value function is constructed from the epigraph of the vectors of all possible policies. Since some vectors in the α-vector set are dominated, the epigraph can often be represented by a smaller subset of α-vectors, called the minimal α-vector set, or parsimonious representation of the value function. The first exact algorithm, the one-pass algorithm, was also proposed in [36, 37] and standardized later by Monahan [6]. To obtain the minimal α-vector set, the one-pass algorithm solves a linear program for every α-vector. However, the computational effort can be large, since the number of constraints in each linear program equals the total number of α-vectors. This shortcoming became the target of later algorithmic improvements that try to find the minimal α-vector set more efficiently [14, 43]. White [8] proposed a more efficient routine (also known as Lark's method, since it was originally proposed by J. W. Lark in a private communication with the author) to reduce the set of α-vectors to the minimal set. This routine generates the minimal α-vector set beginning with the null set; thus, the linear program used to identify dominance has fewer constraints than the linear program in the one-pass algorithm, which enumerates all α-vectors. More recently, Littman [14] proposed the witness algorithm, which improves on the algorithm provided by White [8].

The witness algorithm divides the problem into small subproblems according to the different actions in order to reduce the number of constraints in each linear program used to identify the minimal α-vector set. Each linear program finds a witness belief point at which another α-vector is found that dominates all other α-vectors in the current minimal set, and this vector is added to the minimal α-vector set of an action. Finally, the union of the minimal α-vector sets for the different actions is purged to the minimal α-vector set of the optimal value function. The author used lexicographic ordering to break ties and guarantee that the final α-vector set is of minimum size. Additionally, the author analyzed the performance of finite-horizon approximations to infinite-horizon POMDPs. Zhang [43] developed an algorithm called incremental pruning, which can be viewed as an extension of the witness algorithm. It does not search the regions of the entire state space; instead, it constructs each possible α-vector in the minimal set in an incremental fashion by taking advantage of the decomposable nested structure of the value function of a POMDP. In Cassandra et al. [44], this algorithm is specified as incrementally purging the α-vectors associated with different observations with respect to the α-vector set of a specific action. More specifically, the minimal α-vector set associated with a specific action can be decomposed into vector subsets according to the corresponding observations. The vector subsets are added one by one, and the dominated vectors are pruned each time a new subset is added; the algorithm is named after this pruning procedure. This algorithm is shown to be more efficient than previous exact algorithms, including the witness algorithm.
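To illustrate the dominance test at the heart of these pruning routines, the sketch below uses a linear program (via scipy) to search for a witness belief at which a candidate α-vector strictly beats every other vector, keeping only vectors that are maximal somewhere on the belief simplex. Names are illustrative, ties are ignored for simplicity (exact algorithms break them lexicographically, as noted above), and this is a sketch of the idea rather than any specific published implementation.

```python
import numpy as np
from scipy.optimize import linprog

def witness_point(alpha, others, n):
    """Maximize delta s.t. pi . alpha >= pi . alpha' + delta for all alpha'
    in `others`, over beliefs pi on the simplex. Returns (pi, delta); alpha
    is dominated if delta <= 0. Variables: x = (pi_1, ..., pi_n, delta)."""
    if not others:
        return np.full(n, 1.0 / n), np.inf
    c = np.zeros(n + 1)
    c[-1] = -1.0                                   # linprog minimizes, so maximize delta
    # pi.(alpha - alpha') >= delta  <=>  -pi.(alpha - alpha') + delta <= 0
    A_ub = np.array([np.append(-(alpha - o), 1.0) for o in others])
    b_ub = np.zeros(len(others))
    A_eq = np.array([np.append(np.ones(n), 0.0)])  # pi sums to one
    b_eq = np.array([1.0])
    bounds = [(0.0, 1.0)] * n + [(None, None)]     # delta is free
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:n], -res.fun

def prune(vectors, eps=1e-9):
    """Keep only alpha-vectors that strictly dominate at some belief."""
    kept = []
    for i, alpha in enumerate(vectors):
        rest = [np.asarray(v) for j, v in enumerate(vectors) if j != i]
        _, delta = witness_point(np.asarray(alpha), rest, len(alpha))
        if delta > eps:
            kept.append(alpha)
    return kept
```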

2.5.2 Approximation Algorithms

Approximation algorithms are often necessary to obtain good (hopefully near-optimal) solutions to large-scale POMDPs. Approximation algorithms for POMDPs have been developed for decades, and many algorithms have been proposed. In this section, we review some of the more common approximation algorithms; a more thorough review can be found in [7]. The most intuitive approximation algorithm for a POMDP is to discretize the continuous belief state and solve the result as an MDP. Eckles [45] was the first to use this idea to solve POMDP problems. Continuous belief states are discretized into finite fixed grids. Approximate optimal value functions are computed at the belief points on the grid at each decision epoch t from the approximate optimal values of all possible posterior belief points at epoch t + 1. Linear interpolation is used to approximate the value function at belief points between adjacent grid points. This method is known as the fixed-grid method. Lovejoy [7] provides a detailed review of these methods, including bounds on the approximation error, and Kaelbling [9] discusses extensions such as nonlinear interpolation and related approximations. Finite-memory approximation uses information from a fixed future horizon from the current decision epoch to approximate the objective function. The method was introduced by Sondik [36, 13] to approximate infinite-horizon POMDPs. Platzman [46] proposed a finite-memory approximation using a finite window of the most recent actions and observations, and generalized the finite-memory idea to finite memory states, which can be aggregations of recent observations and actions. A memory state transition occurs when there is a new action or observation. Platzman also presents methods to bound the approximation of the optimal value, and random-
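The fixed-grid idea described above can be sketched in a few lines for a POMDP with a small number of core states: evaluate the value function only at grid beliefs and interpolate everywhere else. The sketch below uses an illustrative resolution and a crude nearest-grid-point lookup standing in for the linear interpolation used by actual fixed-grid methods; it is meant only to convey the approximation, and the values at grid points would be computed with a backup like the one sketched after the equations of Section 2.2.

```python
import numpy as np
from itertools import product

def belief_grid(n_states, resolution):
    """All beliefs whose entries are multiples of 1/resolution and sum to 1."""
    pts = [np.array(k, dtype=float) / resolution
           for k in product(range(resolution + 1), repeat=n_states)
           if sum(k) == resolution]
    return np.array(pts)

def interpolate(pi, grid, values):
    """Nearest-grid-point stand-in for the linear interpolation between
    adjacent grid points used by true fixed-grid methods."""
    i = int(np.argmin(np.linalg.norm(grid - pi, axis=1)))
    return values[i]

grid = belief_grid(3, 10)  # 66 grid beliefs on the 2-simplex
```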


Jennifer E. Mason   (803) Jennifer E. Mason http://people.engr.ncsu.edu/jemason2/ jemason2@ncsu.edu, (803) 608-0727 Research Interests Stochastic dynamic programming and stochastic models Applications in health care delivery and

More information

Bayes Linear Statistics. Theory and Methods

Bayes Linear Statistics. Theory and Methods Bayes Linear Statistics Theory and Methods Michael Goldstein and David Wooff Durham University, UK BICENTENNI AL BICENTENNIAL Contents r Preface xvii 1 The Bayes linear approach 1 1.1 Combining beliefs

More information

Solutions for Chapter 2 Intelligent Agents

Solutions for Chapter 2 Intelligent Agents Solutions for Chapter 2 Intelligent Agents 2.1 This question tests the student s understanding of environments, rational actions, and performance measures. Any sequential environment in which rewards may

More information

Chapter 02. Basic Research Methodology

Chapter 02. Basic Research Methodology Chapter 02 Basic Research Methodology Definition RESEARCH Research is a quest for knowledge through diligent search or investigation or experimentation aimed at the discovery and interpretation of new

More information

Mining Low-Support Discriminative Patterns from Dense and High-Dimensional Data. Technical Report

Mining Low-Support Discriminative Patterns from Dense and High-Dimensional Data. Technical Report Mining Low-Support Discriminative Patterns from Dense and High-Dimensional Data Technical Report Department of Computer Science and Engineering University of Minnesota 4-192 EECS Building 200 Union Street

More information

A Comparison of Collaborative Filtering Methods for Medication Reconciliation

A Comparison of Collaborative Filtering Methods for Medication Reconciliation A Comparison of Collaborative Filtering Methods for Medication Reconciliation Huanian Zheng, Rema Padman, Daniel B. Neill The H. John Heinz III College, Carnegie Mellon University, Pittsburgh, PA, 15213,

More information

Gene Selection for Tumor Classification Using Microarray Gene Expression Data

Gene Selection for Tumor Classification Using Microarray Gene Expression Data Gene Selection for Tumor Classification Using Microarray Gene Expression Data K. Yendrapalli, R. Basnet, S. Mukkamala, A. H. Sung Department of Computer Science New Mexico Institute of Mining and Technology

More information

A Framework for Sequential Planning in Multi-Agent Settings

A Framework for Sequential Planning in Multi-Agent Settings A Framework for Sequential Planning in Multi-Agent Settings Piotr J. Gmytrasiewicz and Prashant Doshi Department of Computer Science University of Illinois at Chicago piotr,pdoshi@cs.uic.edu Abstract This

More information

Data-Driven Management of Post-Transplant Medications: An APOMDP Approach

Data-Driven Management of Post-Transplant Medications: An APOMDP Approach Data-Driven Management of Post-Transplant Medications: An APOMDP Approach Alireza Boloori Industrial Engineering, School of Computing, Informatics and Decision Systems Engineering, Arizona State University,

More information

Introduction to Bayesian Analysis 1

Introduction to Bayesian Analysis 1 Biostats VHM 801/802 Courses Fall 2005, Atlantic Veterinary College, PEI Henrik Stryhn Introduction to Bayesian Analysis 1 Little known outside the statistical science, there exist two different approaches

More information

Mammogram Analysis: Tumor Classification

Mammogram Analysis: Tumor Classification Mammogram Analysis: Tumor Classification Term Project Report Geethapriya Raghavan geeragh@mail.utexas.edu EE 381K - Multidimensional Digital Signal Processing Spring 2005 Abstract Breast cancer is the

More information

Folland et al Chapter 4

Folland et al Chapter 4 Folland et al Chapter 4 Chris Auld Economics 317 January 11, 2011 Chapter 2. We won t discuss, but you should already know: PPF. Supply and demand. Theory of the consumer (indifference curves etc) Theory

More information

Data-Driven Management of Post-Transplant Medications: An APOMDP Approach

Data-Driven Management of Post-Transplant Medications: An APOMDP Approach Data-Driven Management of Post-Transplant Medications: An APOMDP Approach Alireza Boloori Industrial Engineering, School of Computing, Informatics and Decision Systems Engineering, Arizona State University,

More information

Identification of Tissue Independent Cancer Driver Genes

Identification of Tissue Independent Cancer Driver Genes Identification of Tissue Independent Cancer Driver Genes Alexandros Manolakos, Idoia Ochoa, Kartik Venkat Supervisor: Olivier Gevaert Abstract Identification of genomic patterns in tumors is an important

More information

Review: Logistic regression, Gaussian naïve Bayes, linear regression, and their connections

Review: Logistic regression, Gaussian naïve Bayes, linear regression, and their connections Review: Logistic regression, Gaussian naïve Bayes, linear regression, and their connections New: Bias-variance decomposition, biasvariance tradeoff, overfitting, regularization, and feature selection Yi

More information

Predicting Breast Cancer Survivability Rates

Predicting Breast Cancer Survivability Rates Predicting Breast Cancer Survivability Rates For data collected from Saudi Arabia Registries Ghofran Othoum 1 and Wadee Al-Halabi 2 1 Computer Science, Effat University, Jeddah, Saudi Arabia 2 Computer

More information

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014 UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014 Exam policy: This exam allows two one-page, two-sided cheat sheets (i.e. 4 sides); No other materials. Time: 2 hours. Be sure to write

More information

Unit 1 Exploring and Understanding Data

Unit 1 Exploring and Understanding Data Unit 1 Exploring and Understanding Data Area Principle Bar Chart Boxplot Conditional Distribution Dotplot Empirical Rule Five Number Summary Frequency Distribution Frequency Polygon Histogram Interquartile

More information

A Brief Introduction to Bayesian Statistics

A Brief Introduction to Bayesian Statistics A Brief Introduction to Statistics David Kaplan Department of Educational Psychology Methods for Social Policy Research and, Washington, DC 2017 1 / 37 The Reverend Thomas Bayes, 1701 1761 2 / 37 Pierre-Simon

More information

Computerized Mastery Testing

Computerized Mastery Testing Computerized Mastery Testing With Nonequivalent Testlets Kathleen Sheehan and Charles Lewis Educational Testing Service A procedure for determining the effect of testlet nonequivalence on the operating

More information

Bayesian Logistic Regression Modelling via Markov Chain Monte Carlo Algorithm

Bayesian Logistic Regression Modelling via Markov Chain Monte Carlo Algorithm Journal of Social and Development Sciences Vol. 4, No. 4, pp. 93-97, Apr 203 (ISSN 222-52) Bayesian Logistic Regression Modelling via Markov Chain Monte Carlo Algorithm Henry De-Graft Acquah University

More information

Decision Analysis. John M. Inadomi. Decision trees. Background. Key points Decision analysis is used to compare competing

Decision Analysis. John M. Inadomi. Decision trees. Background. Key points Decision analysis is used to compare competing 5 Decision Analysis John M. Inadomi Key points Decision analysis is used to compare competing strategies of management under conditions of uncertainty. Various methods may be employed to construct a decision

More information

References. Christos A. Ioannou 2/37

References. Christos A. Ioannou 2/37 Prospect Theory References Tversky, A., and D. Kahneman: Judgement under Uncertainty: Heuristics and Biases, Science, 185 (1974), 1124-1131. Tversky, A., and D. Kahneman: Prospect Theory: An Analysis of

More information

Estimating the number of components with defects post-release that showed no defects in testing

Estimating the number of components with defects post-release that showed no defects in testing SOFTWARE TESTING, VERIFICATION AND RELIABILITY Softw. Test. Verif. Reliab. 2002; 12:93 122 (DOI: 10.1002/stvr.235) Estimating the number of components with defects post-release that showed no defects in

More information

POND-Hindsight: Applying Hindsight Optimization to POMDPs

POND-Hindsight: Applying Hindsight Optimization to POMDPs POND-Hindsight: Applying Hindsight Optimization to POMDPs Alan Olsen and Daniel Bryce alan@olsen.org, daniel.bryce@usu.edu Utah State University Logan, UT Abstract We present the POND-Hindsight entry in

More information

Chapter 17 Sensitivity Analysis and Model Validation

Chapter 17 Sensitivity Analysis and Model Validation Chapter 17 Sensitivity Analysis and Model Validation Justin D. Salciccioli, Yves Crutain, Matthieu Komorowski and Dominic C. Marshall Learning Objectives Appreciate that all models possess inherent limitations

More information

NEW METHODS FOR SENSITIVITY TESTS OF EXPLOSIVE DEVICES

NEW METHODS FOR SENSITIVITY TESTS OF EXPLOSIVE DEVICES NEW METHODS FOR SENSITIVITY TESTS OF EXPLOSIVE DEVICES Amit Teller 1, David M. Steinberg 2, Lina Teper 1, Rotem Rozenblum 2, Liran Mendel 2, and Mordechai Jaeger 2 1 RAFAEL, POB 2250, Haifa, 3102102, Israel

More information

Adversarial Decision-Making

Adversarial Decision-Making Adversarial Decision-Making Brian J. Stankiewicz University of Texas, Austin Department Of Psychology & Center for Perceptual Systems & Consortium for Cognition and Computation February 7, 2006 Collaborators

More information

Reinforcement Learning

Reinforcement Learning Reinforcement Learning Michèle Sebag ; TP : Herilalaina Rakotoarison TAO, CNRS INRIA Université Paris-Sud Nov. 9h, 28 Credit for slides: Richard Sutton, Freek Stulp, Olivier Pietquin / 44 Introduction

More information

Exploring Experiential Learning: Simulations and Experiential Exercises, Volume 5, 1978 THE USE OF PROGRAM BAYAUD IN THE TEACHING OF AUDIT SAMPLING

Exploring Experiential Learning: Simulations and Experiential Exercises, Volume 5, 1978 THE USE OF PROGRAM BAYAUD IN THE TEACHING OF AUDIT SAMPLING THE USE OF PROGRAM BAYAUD IN THE TEACHING OF AUDIT SAMPLING James W. Gentry, Kansas State University Mary H. Bonczkowski, Kansas State University Charles W. Caldwell, Kansas State University INTRODUCTION

More information

Technical Specifications

Technical Specifications Technical Specifications In order to provide summary information across a set of exercises, all tests must employ some form of scoring models. The most familiar of these scoring models is the one typically

More information

Seminar Thesis: Efficient Planning under Uncertainty with Macro-actions

Seminar Thesis: Efficient Planning under Uncertainty with Macro-actions Seminar Thesis: Efficient Planning under Uncertainty with Macro-actions Ragnar Mogk Department of Computer Science Technische Universität Darmstadt ragnar.mogk@stud.tu-darmstadt.de 1 Introduction This

More information

Electronic Health Record Analytics: The Case of Optimal Diabetes Screening

Electronic Health Record Analytics: The Case of Optimal Diabetes Screening Electronic Health Record Analytics: The Case of Optimal Diabetes Screening Michael Hahsler 1, Farzad Kamalzadeh 1 Vishal Ahuja 1, and Michael Bowen 2 1 Southern Methodist University 2 UT Southwestern Medical

More information

Probabilistic Graphical Models: Applications in Biomedicine

Probabilistic Graphical Models: Applications in Biomedicine Probabilistic Graphical Models: Applications in Biomedicine L. Enrique Sucar, INAOE Puebla, México May 2012 What do you see? What we see depends on our previous knowledge (model) of the world and the information

More information

Using Eligibility Traces to Find the Best Memoryless Policy in Partially Observable Markov Decision Processes

Using Eligibility Traces to Find the Best Memoryless Policy in Partially Observable Markov Decision Processes Using Eligibility Traces to Find the est Memoryless Policy in Partially Observable Markov Decision Processes John Loch Department of Computer Science University of Colorado oulder, CO 80309-0430 loch@cs.colorado.edu

More information

Using AUC and Accuracy in Evaluating Learning Algorithms

Using AUC and Accuracy in Evaluating Learning Algorithms 1 Using AUC and Accuracy in Evaluating Learning Algorithms Jin Huang Charles X. Ling Department of Computer Science The University of Western Ontario London, Ontario, Canada N6A 5B7 fjhuang, clingg@csd.uwo.ca

More information

Data-Driven Management of Post- Transplant Medications: An APOMDP Approach Faculty Research Working Paper Series

Data-Driven Management of Post- Transplant Medications: An APOMDP Approach Faculty Research Working Paper Series Data-Driven Management of Post- Transplant Medications: An APOMDP Approach Faculty Research Working Paper Series Alireza Boloori Arizona State University Soroush Saghafian Harvard Kennedy School Harini

More information

Citation for published version (APA): Ebbes, P. (2004). Latent instrumental variables: a new approach to solve for endogeneity s.n.

Citation for published version (APA): Ebbes, P. (2004). Latent instrumental variables: a new approach to solve for endogeneity s.n. University of Groningen Latent instrumental variables Ebbes, P. IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document

More information

Macroeconometric Analysis. Chapter 1. Introduction

Macroeconometric Analysis. Chapter 1. Introduction Macroeconometric Analysis Chapter 1. Introduction Chetan Dave David N. DeJong 1 Background The seminal contribution of Kydland and Prescott (1982) marked the crest of a sea change in the way macroeconomists

More information

Cost-effectiveness ratios are commonly used to

Cost-effectiveness ratios are commonly used to ... HEALTH ECONOMICS... Application of Cost-Effectiveness Analysis to Multiple Products: A Practical Guide Mohan V. Bala, PhD; and Gary A. Zarkin, PhD The appropriate interpretation of cost-effectiveness

More information

Strategic Level Proton Therapy Patient Admission Planning: A Markov Decision Process Modeling Approach

Strategic Level Proton Therapy Patient Admission Planning: A Markov Decision Process Modeling Approach University of New Haven Digital Commons @ New Haven Mechanical and Industrial Engineering Faculty Publications Mechanical and Industrial Engineering 6-2017 Strategic Level Proton Therapy Patient Admission

More information

Representation and Analysis of Medical Decision Problems with Influence. Diagrams

Representation and Analysis of Medical Decision Problems with Influence. Diagrams Representation and Analysis of Medical Decision Problems with Influence Diagrams Douglas K. Owens, M.D., M.Sc., VA Palo Alto Health Care System, Palo Alto, California, Section on Medical Informatics, Department

More information

Personalized Decision Modeling for Intervention and Prevention of Cancers

Personalized Decision Modeling for Intervention and Prevention of Cancers University of Arkansas, Fayetteville ScholarWorks@UARK Theses and Dissertations 8-2017 Personalized Decision Modeling for Intervention and Prevention of Cancers Fan Wang University of Arkansas, Fayetteville

More information

Probability II. Patrick Breheny. February 15. Advanced rules Summary

Probability II. Patrick Breheny. February 15. Advanced rules Summary Probability II Patrick Breheny February 15 Patrick Breheny University of Iowa Introduction to Biostatistics (BIOS 4120) 1 / 26 A rule related to the addition rule is called the law of total probability,

More information

Estimating and comparing cancer progression risks under varying surveillance protocols: moving beyond the Tower of Babel

Estimating and comparing cancer progression risks under varying surveillance protocols: moving beyond the Tower of Babel Estimating and comparing cancer progression risks under varying surveillance protocols: moving beyond the Tower of Babel Jane Lange March 22, 2017 1 Acknowledgements Many thanks to the multiple project

More information

Bayesian Reinforcement Learning

Bayesian Reinforcement Learning Bayesian Reinforcement Learning Rowan McAllister and Karolina Dziugaite MLG RCC 21 March 2013 Rowan McAllister and Karolina Dziugaite (MLG RCC) Bayesian Reinforcement Learning 21 March 2013 1 / 34 Outline

More information

A Belief-Based Account of Decision under Uncertainty. Craig R. Fox, Amos Tversky

A Belief-Based Account of Decision under Uncertainty. Craig R. Fox, Amos Tversky A Belief-Based Account of Decision under Uncertainty Craig R. Fox, Amos Tversky Outline Problem Definition Decision under Uncertainty (classical Theory) Two-Stage Model Probability Judgment and Support

More information

Russian Journal of Agricultural and Socio-Economic Sciences, 3(15)

Russian Journal of Agricultural and Socio-Economic Sciences, 3(15) ON THE COMPARISON OF BAYESIAN INFORMATION CRITERION AND DRAPER S INFORMATION CRITERION IN SELECTION OF AN ASYMMETRIC PRICE RELATIONSHIP: BOOTSTRAP SIMULATION RESULTS Henry de-graft Acquah, Senior Lecturer

More information

Lecture Outline Biost 517 Applied Biostatistics I

Lecture Outline Biost 517 Applied Biostatistics I Lecture Outline Biost 517 Applied Biostatistics I Scott S. Emerson, M.D., Ph.D. Professor of Biostatistics University of Washington Lecture 2: Statistical Classification of Scientific Questions Types of

More information

Irrationality in Game Theory

Irrationality in Game Theory Irrationality in Game Theory Yamin Htun Dec 9, 2005 Abstract The concepts in game theory have been evolving in such a way that existing theories are recasted to apply to problems that previously appeared

More information

Fuzzy Decision Tree FID

Fuzzy Decision Tree FID Fuzzy Decision Tree FID Cezary Z. Janikow Krzysztof Kawa Math & Computer Science Department Math & Computer Science Department University of Missouri St. Louis University of Missouri St. Louis St. Louis,

More information

Appendix I Teaching outcomes of the degree programme (art. 1.3)

Appendix I Teaching outcomes of the degree programme (art. 1.3) Appendix I Teaching outcomes of the degree programme (art. 1.3) The Master graduate in Computing Science is fully acquainted with the basic terms and techniques used in Computing Science, and is familiar

More information

Interaction as an emergent property of a Partially Observable Markov Decision Process

Interaction as an emergent property of a Partially Observable Markov Decision Process Interaction as an emergent property of a Partially Observable Markov Decision Process Andrew Howes, Xiuli Chen, Aditya Acharya School of Computer Science, University of Birmingham Richard L. Lewis Department

More information

International Journal of Computer Science Trends and Technology (IJCST) Volume 5 Issue 1, Jan Feb 2017

International Journal of Computer Science Trends and Technology (IJCST) Volume 5 Issue 1, Jan Feb 2017 RESEARCH ARTICLE Classification of Cancer Dataset in Data Mining Algorithms Using R Tool P.Dhivyapriya [1], Dr.S.Sivakumar [2] Research Scholar [1], Assistant professor [2] Department of Computer Science

More information

Determining the optimal stockpile level for combination vaccines

Determining the optimal stockpile level for combination vaccines Rochester Institute of Technology RIT Scholar Works Theses Thesis/Dissertation Collections 10-12-2017 Determining the optimal stockpile level for combination vaccines Sheetal Aher ssa8811@rit.edu Follow

More information

STATISTICAL METHODS FOR DIAGNOSTIC TESTING: AN ILLUSTRATION USING A NEW METHOD FOR CANCER DETECTION XIN SUN. PhD, Kansas State University, 2012

STATISTICAL METHODS FOR DIAGNOSTIC TESTING: AN ILLUSTRATION USING A NEW METHOD FOR CANCER DETECTION XIN SUN. PhD, Kansas State University, 2012 STATISTICAL METHODS FOR DIAGNOSTIC TESTING: AN ILLUSTRATION USING A NEW METHOD FOR CANCER DETECTION by XIN SUN PhD, Kansas State University, 2012 A THESIS Submitted in partial fulfillment of the requirements

More information

Support system for breast cancer treatment

Support system for breast cancer treatment Support system for breast cancer treatment SNEZANA ADZEMOVIC Civil Hospital of Cacak, Cara Lazara bb, 32000 Cacak, SERBIA Abstract:-The aim of this paper is to seek out optimal relation between diagnostic

More information

Inference Methods for First Few Hundred Studies

Inference Methods for First Few Hundred Studies Inference Methods for First Few Hundred Studies James Nicholas Walker Thesis submitted for the degree of Master of Philosophy in Applied Mathematics and Statistics at The University of Adelaide (Faculty

More information

Cost-utility of initial medical management for Crohn's disease perianal fistulae Arseneau K O, Cohn S M, Cominelli F, Connors A F

Cost-utility of initial medical management for Crohn's disease perianal fistulae Arseneau K O, Cohn S M, Cominelli F, Connors A F Cost-utility of initial medical management for Crohn's disease perianal fistulae Arseneau K O, Cohn S M, Cominelli F, Connors A F Record Status This is a critical abstract of an economic evaluation that

More information

Generating Reward Functions using IRL Towards Individualized Cancer Screening

Generating Reward Functions using IRL Towards Individualized Cancer Screening Generating Reward Functions using IRL Towards Individualized Cancer Screening Panayiotis Petousis 1[0000 0002 0696 608X], Simon X. Han 1[0000 0002 1001 4727], William Hsu 1,2[0000 0002 5168 070X], and

More information

An Empirical and Formal Analysis of Decision Trees for Ranking

An Empirical and Formal Analysis of Decision Trees for Ranking An Empirical and Formal Analysis of Decision Trees for Ranking Eyke Hüllermeier Department of Mathematics and Computer Science Marburg University 35032 Marburg, Germany eyke@mathematik.uni-marburg.de Stijn

More information

Summary HTA. HTA-Report Summary

Summary HTA. HTA-Report Summary Summary HTA HTA-Report Summary Prognostic value, clinical effectiveness and cost-effectiveness of high sensitivity C-reactive protein as a marker in primary prevention of major cardiac events Schnell-Inderst

More information

A Game Theoretical Approach for Hospital Stockpile in Preparation for Pandemics

A Game Theoretical Approach for Hospital Stockpile in Preparation for Pandemics Proceedings of the 2008 Industrial Engineering Research Conference J. Fowler and S. Mason, eds. A Game Theoretical Approach for Hospital Stockpile in Preparation for Pandemics Po-Ching DeLaurentis School

More information

Challenges in Developing Learning Algorithms to Personalize mhealth Treatments

Challenges in Developing Learning Algorithms to Personalize mhealth Treatments Challenges in Developing Learning Algorithms to Personalize mhealth Treatments JOOLHEALTH Bar-Fit Susan A Murphy 01.16.18 HeartSteps SARA Sense 2 Stop Continually Learning Mobile Health Intervention 1)

More information

BayesOpt: Extensions and applications

BayesOpt: Extensions and applications BayesOpt: Extensions and applications Javier González Masterclass, 7-February, 2107 @Lancaster University Agenda of the day 9:00-11:00, Introduction to Bayesian Optimization: What is BayesOpt and why it

More information

SUPPLEMENTAL MATERIAL

SUPPLEMENTAL MATERIAL 1 SUPPLEMENTAL MATERIAL Response time and signal detection time distributions SM Fig. 1. Correct response time (thick solid green curve) and error response time densities (dashed red curve), averaged across

More information