Breast cancer survival, competing risks and mixture cure model: a Bayesian analysis


J. R. Statist. Soc. A (2010) 173, Part 2, pp

Sanjib Basu, Northern Illinois University, Dekalb, USA
and Ram C. Tiwari, National Cancer Institute, Bethesda, USA

[Received June Final revision June 2009]

Summary. Cancer is a major public health burden and is the second leading cause of death in the USA. The US National Cancer Institute estimated overall costs of cancer in 2007 at $219.2 billion. Breast cancer has the highest cancer incidence rates among women and is the second leading cause of cancer death among women. The Surveillance, epidemiology, and end results programme of the National Cancer Institute collects and publishes cancer survival data from 17 population-based cancer registries. The CANSURV software of the National Cancer Institute analyses cancer survival data from the programme by using parametric and semiparametric mixture cure models. Another popular approach in cancer survival is the competing risks approach which considers the simultaneous risks from cancer and various other causes. The paper develops a model that unifies the mixture cure and competing risks approaches and that can handle the masked causes of death in a natural way. Markov chain sampling is used for Bayesian analysis of this model, and modelling and computational issues of general and restricted structures are discussed. The various model structures are compared by using Bayes factors. This Bayesian model is used to analyse survival data for the approximately breast cancer cases from the programme. The estimated cumulative probabilities of death from breast cancer from the proposed mixture cure competing risks model are found to be lower than the estimates that are obtained from the CANSURV software. Whereas the estimate of the cure fraction is found to be dependent on the modelling assumptions, the survival and cumulative probability estimates are not sensitive to these assumptions.
Breast cancer survival is compared across different ethnic subgroups, different age subgroups and patients with localized, regional and distant stages of the disease. The risk of mortality from breast cancer is found to be the dominant cause of death in the beginning part of the follow-up whereas the risk from other competing causes often became the dominant cause in the latter part. This interrelation between breast cancer and other competing risks varies among the different ethnic groups, the different stages and the different age groups.

Keywords: Bayes factor; Breast cancer; Competing risks; Cure rate; Mixture cure model; Model comparison; Surveillance, epidemiology, and end results programme

Address for correspondence: Sanjib Basu, Division of Statistics, Northern Illinois University, Dekalb, IL 60115, USA. basu@niu.edu

2009 Royal Statistical Society

1. Introduction

Cancer is a major public health problem in the USA and is the second leading cause of death, resulting in projected cancer deaths (more than 1500 a day) and 1.4 million new cancer cases in 2008 (American Cancer Society, 2008). US men have slightly less than a 1 in 2 (44.08%) lifetime risk of developing cancer; for women, the risk is a little more than 1 in 3 (37.52%; American

Cancer Society (2008)). The 5-year relative survival rate for all cancers diagnosed between 1996 and 2003 is 66%, up from 50% in ... The improvement in survival reflects progress in diagnosing certain cancers at an earlier stage and improvements in treatment. The treatment of cancer has shown substantial progress with an increasing proportion of patients being cured from many cancers. The chance of being cured and the survival time since diagnosis are of interest to cancer patients and the medical community alike.

From a statistical perspective, when the cause-specific survival curve for cancer tends to plateau at a value that is strictly greater than 0, it is taken as an indication of the presence of a proportion of cured patients for whom cancer will not recur. Fig. 1 shows the cause-specific survival estimates for breast cancer patients with regional stage tumours. The survival curve for black patients levels off beyond 250 months, suggesting the existence of a proportion of patients who are cured from breast cancer. The corresponding curve for white patients, however, does not appear to level off. Lambert et al. (2007) noted that this definition of cure is from a population perspective and is different from medically cured at the individual level. The probability of cure, which is commonly referred to as the cure rate or the surviving fraction, is defined as the asymptotic value of the cancer survival function $F(t)$ as $t \to \infty$.

Various models have been proposed to analyse survival data with cured individuals. The mixture cure model of Boag (1949) (see also Berkson and Gage (1952) and Maller and Zhao (1996)) assumes that a fraction of the patients are cured from cancer and will never experience the event (death or incidence) from cancer. Gamel et al. (2000), Yu et al. (2004) and Yu and Tiwari (2007) modelled the relative survival from cancer by using the mixture cure model.
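The plateau interpretation of the cure rate can be made concrete with a small numerical sketch. It assumes the standard mixture cure form, in which the population survival is the cure fraction plus the uncured fraction times the survival of the uncured patients. The Weibull form for the uncured group and all parameter values below are illustrative assumptions, not estimates from the SEER data.

```python
import math

def weibull_survival(t, shape, scale):
    """Survival function of a Weibull time, used here for the uncured group (an assumption)."""
    return math.exp(-((t / scale) ** shape))

def mixture_cure_survival(t, cure_fraction, shape, scale):
    """Population survival c + (1 - c) * S_u(t) under a mixture cure model."""
    return cure_fraction + (1.0 - cure_fraction) * weibull_survival(t, shape, scale)

# Hypothetical values: cure fraction 0.65, Weibull shape 1.2, scale 100 months.
for months in (0, 60, 120, 240, 480, 2000):
    value = mixture_cure_survival(months, 0.65, 1.2, 100.0)
    print(f"t = {months:5d} months, S(t) = {value:.4f}")
```

As the follow-up time grows, the printed survival values level off at the assumed cure fraction rather than at 0, which is the population-level signature of cure described above.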
We consider the risk from other causes as competing risks, use the subdistribution approach to competing risks (Prentice et al., 1978; Crowder, 2001) and explicitly model the subhazard from cancer as well as the subhazard from other causes. We argue that the cured patients are exposed only to the risk from other causes whereas the uncured patients are exposed to both hazards. The term cancer is used generically here to denote risk from a specific cancer, such as breast, prostate or other cancers.

The Surveillance, epidemiology, and end results (SEER) programme of the US National Cancer Institute collects and publishes cancer incidence and survival data from 14 population-based cancer registries and three supplemental registries covering approximately 26.2% of the US population. The SEER database contains specific information on primary and (if any) secondary cancer(s) for each patient as well as cause-of-death information for non-survivors; the cause of death can be different from the primary cancer. The National Cancer Institute utilizes the CANSURV software to analyse survival data from the SEER programme. CANSURV (Yu et al., 2005) can fit mixture cure models to population-based grouped survival data by using likelihood-based methods but does not currently handle competing risks.

Fig. 1. Survival curve estimates for black and white breast cancer patients in the SEER programme with regional tumour (axis label: Survival since diagnosis)

Fig. 2 provides the CANSURV estimates of cumulative probability of death from cancer by using cause-specific survival data

from the SEER programme and also a preview of results from the methodology that is proposed in this paper and how they differ from the CANSURV results. We note that CANSURV provides a substantially higher estimate of probability of death from breast cancer since it does not explicitly incorporate the competing risk from other causes.

Fig. 2. Estimate of the cumulative probability of death from cancer based on cause-specific survival data as obtained from the CANSURV software and estimates of the cumulative probability of death from breast cancer and other causes based on the methodology proposed (axis label: Survival since diagnosis)

In the SEER database, the causes of death for a non-negligible fraction of the cancer patients are marked as unknown, owing to lack of identification or recording. This is known as partial masking in the competing risks literature (Goetghebeur and Ryan, 1995; Lu and Tsiatis, 2001). We pursue the data augmentation or imputation approach that was described in Basu et al. (2003) for these masked cases.

This paper is organized as follows. In Section 2, we describe the binary mixture cure rate model and the competing risks framework using the cause-specific hazards for cancer and other causes. Bayesian analysis and the Markov chain sampling scheme are detailed in Section 3. The modelling and computational issues with the proposed general and restricted model structures are discussed in Section 4. Section 5 details the computation of the marginal likelihood and the Bayes factor. Section 6 considers application to breast cancer survival data from the SEER programme where we compare various models, compare the survival of different ages, SEER historic stage and ethnic subgroups and consider an application with three competing risks, namely from breast cancer, other cancers and other causes. The paper concludes with a brief discussion in Section 7.
The methodological objective of this paper is to develop a flexible model that unifies the mixture cure and competing risks approaches and which handles the masked causes of death in a natural way. This is motivated by the problem of survival rate estimation for female breast cancer patients and the question of examining and gaining further insight into the interrelation between breast cancer and other competing risks.

2. The mixture cure model and competing risks

2.1. Competing risks

The cure rate or surviving fraction is the limiting value of the cause-specific survival function from cancer. Subjects who die from causes other than cancer are considered censored in cause-specific survival estimation and they play an important role in cure rate models. In particular, this is a natural setting for explicit consideration of competing risks from other causes. Analysis of competing risks, of course, has an extensive literature (see Crowder (2001) and Pintilie (2006)). The early approach (David and Moeschberger, 1978) to competing risks analysis is the latent approach which models the potential or latent survival time from cancer (and other causes respectively) when all other risks are removed from the patient. We consider the formulation of Prentice et al. (1978); in particular, we use $h(\mathrm{CA}, t)$ and $h(\mathrm{OT}, t)$ to denote respectively the subhazards

from cancer (CA) and other causes (OT). (Here, the subhazard $h(\mathrm{CA}, t) = \lim_{\delta \to 0} \{P(\text{cause} = \mathrm{CA},\ T \le t + \delta \mid T > t)/\delta\}$, which is also sometimes referred to as the cause-specific hazard.) In a general setting, we consider the case of $R$ competing risks, with the subhazard from cause $r$ denoted by $h(r, t)$. For example, Table 1 lists the causes of death of breast cancer patients from the 17 SEER registries. In addition to the deaths that were reported from breast cancer, out of the deaths reported under other causes, 25279, and were respectively due to competing risks of other cancers, heart and related diseases, and other remaining causes.

Gamel et al. (2000), Yu et al. (2004), Yu and Tiwari (2007) and Lambert et al. (2007) considered the relative survival approach to the mixture cure model. The relative survival is the same as the net survival from cancer if expected survival (from other causes) in the general population is assumed to be the same as the net survival from other causes for cancer patients and if the risks of cancer and other causes are assumed to act independently. This independence assumption is implicit in many works that use the relative survival approach; without this assumption, relative survival can be interpreted only as a ratio (Hakulinen and Tenkanen, 1987). The independence assumption, however, is not testable within a fully competing risks framework (without further structural or parametric assumptions) owing to issues of identifiability (Tsiatis, 1975; Crowder, 2001; Kalbfleisch and Prentice, 2002). If independence holds, it follows that the marginal hazards equal the subhazards. The equality of marginal and subhazards is, in fact, weaker than the independence assumption (Crowder, 2001) and is often known as the Makeham assumption (Gail, 1975). Under this assumption, the cause-specific survival functions completely determine the net survival and hence the relative survival functions.
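The difference between a cumulative probability of death that accounts for competing risks and a net probability that does not can be illustrated with a toy calculation. The sketch below assumes two risks with constant subhazards, which is a simplification made only for this example; the rates are hypothetical and are not SEER estimates, and the calculation does not reproduce the CANSURV methodology.

```python
import math

def cumulative_incidence(t, h_cause, h_total):
    """P(C = cause, T <= t) when all subhazards are constant in time.

    h_cause is the subhazard of the cause of interest and h_total is the
    sum of the subhazards over all competing causes.
    """
    return (h_cause / h_total) * (1.0 - math.exp(-h_total * t))

def net_probability(t, h_cause):
    """1 - exp(-h t): probability of death from one cause if it acted alone."""
    return 1.0 - math.exp(-h_cause * t)

# Hypothetical monthly subhazards for cancer (CA) and other causes (OT).
h_ca, h_ot = 0.004, 0.003
for months in (60, 120, 240):
    cif_ca = cumulative_incidence(months, h_ca, h_ca + h_ot)
    cif_ot = cumulative_incidence(months, h_ot, h_ca + h_ot)
    net_ca = net_probability(months, h_ca)
    print(f"t = {months}: CA {cif_ca:.3f}, OT {cif_ot:.3f}, net CA {net_ca:.3f}")
```

In this toy setting the two cumulative incidences add up to the overall probability of death, and the net probability for cancer is always the larger of the two cancer figures, because some patients who would eventually have died from cancer die first from other causes.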
We use $(T, C)$ to denote a survival datum from the SEER registries, where $T$ is the time to event and $C$ denotes the cause of event with $C = 1, \ldots, R$ denoting the $R$ competing risks ($C = 1$ denotes the primary cancer under study). We use $C = 0$ to denote the case when an observation is right censored (lost to follow-up) and assume that the censoring process is non-informative. Finally, the case of $C = R + 1$ denotes the case of an observed death whose cause is unknown or not recorded; more about this case is discussed in Section 2.2.

Table 1. Breast cancer cases in 17 SEER registries

Cause of death                 Number of cases
Breast cancer
Other cancers
Heart and related diseases
Other remaining causes
Other causes
Unknown                        8275
Alive
Total

We employ the mixture cure model assuming that a fraction $c$ of the cancer patient population is immune or cured from cancer whereas the remaining fraction of uncured patients is at risk from cancer and a cancer event would be observed with certainty if complete follow-up were possible. For a general formulation of a cure model that encompasses both the mixture cure model as well as the alternative bounded cumulative hazard model, see Cooner et al. (2007). Let

$Q$ be a latent binary indicator, with $Q = 1$ and $Q = 0$ denoting the cured and non-cured cases respectively. The joint likelihood of observed data $(t_i, C_i)_{i=1}^{n}$ for $n$ patients can then be written as

$$\prod_{r=0}^{R+1} \prod_{i:C_i=r} p(C_i = r, t_i) = \prod_{r=0}^{R+1} \prod_{i:C_i=r} \{(1 - c)\, p(C_i = r, t_i \mid Q_i = 0) + c\, p(C_i = r, t_i \mid Q_i = 1)\}. \tag{1}$$

2.2. Cause of death and masking

The cause of death is coded in the SEER database according to the international classification of diseases (ICD). SEER registries collect information on all incident cancer cases diagnosed in residents in their catchment area. Each registry periodically receives information from their local vital statistics departments on all deaths. A cause of death is assigned in the SEER programme with a standardized decision algorithm that was outlined in US Department of Health and Human Services (1981). For those cases who died during ..., the causes of death were coded according to ICD version 8; subsequent deaths are coded according to ICD version 9 (US Department of Health and Human Services, 2000).

Accurate assessment of cause of death is important for estimating cause-specific survival from cancer. Lloyd-Jones et al. (1998), Smith and Hutchins (2001), Flanders (1992), Hoel et al. (1993), Modelmog et al. (1992), Grulich et al. (1995) and others investigated the accuracy of death certificates for coding underlying cause of death in the Framingham heart study and other similar studies. Investigations into the accuracy of cause of death in the SEER registries, in particular, generally found strongly positive results. Penson et al. (2001) investigated whether the SEER programme reported cause of death agreed with an independent review of medical records in a sample of prostate cancer patients under the Seattle Puget Sound SEER Cancer Registry and reported 97% agreement between SEER and clinician-assigned causes of death. One of the most comprehensive studies was done by Percy et al.
(1990), who assessed the accuracy by comparing the primary cancer site reported on the hospital diagnosis with the reported underlying cause of death. For lung, prostate and colorectal cancers, they reported confirmation rates of 94.3%, 98.1% and 95.6% respectively. For breast cancer in particular, Percy et al. (1990) reported an impressively strong confirmation rate of 98.7%. Harlan and Hankey (2003) discussed how the availability of data from the SEER registries allows current and future innovative research and emphasized that a quality control programme is conducted by the National Cancer Institute each year to evaluate the quality and completeness of the SEER data. A quality improvement process has been an integral part of the SEER programme.

The cause of death for cancer patients is sometimes not identified or recorded; this is known as (partial) masking in the competing risks literature (Goetghebeur and Ryan, 1995; Lu and Tsiatis, 2001; Basu et al., 2003). About 1.3% of the breast cancer patients in Table 1 have their cause of death recorded as unknown. This unknown group consists of the reported ICD codes for causes of death that are deemed invalid (in the quality control validity test in the SEER database) as well as those reported deaths for which either the death certificate or the cause of death is not available.

Early works on competing risks with masked causes include Dinse (1982) and Kodell and Chen (1987), who focused on non-parametric estimation of survival functions associated with the cause of interest as well as the other causes. Racine-Poon and Hoel (1984) formally accounted for the uncertainty in diagnosing the exact cause of death by assigning a score to the probability of the diagnosis being correct. Dinse (1986) obtained non-parametric maximum likelihood estimates of cause-specific hazards in the setting of two competing risks. Lagakos and Louis (1988) investigated various non-parametric tests for comparing differences

in survival functions for two groups under masking. Goetghebeur and Ryan (1995) developed a proportional hazards structure in the context of two competing risks under the assumption that the baseline hazards for the two competing risks are proportional. Lu and Tsiatis (2001) utilized multiple imputation to generalize this to the case when the baselines are not proportional. Flehinger et al. (1998) proposed maximum likelihood estimation under a similar model. Dewanji and SenGupta (2003) developed an EM algorithm in the setting of masked but grouped survival data. Lu and Tsiatis (2005) compared the two partial likelihood approaches in masked data.

For simplicity of notation, we have earlier used $C = R + 1$ to denote the case when death is recorded but the cause of death is unknown and hence can be any of the $C = 1, \ldots, R$ causes. We then have

$$p(C = R + 1, t) = \sum_{r=1}^{R} p(C = r, t). \tag{2}$$

This formulation makes the implicit symmetry assumption that the probability that the cause of death is masked does not depend on the actual (but unknown) cause of death or the survival time (Crowder (2001), page 35), i.e.

$$P(\text{cause being masked} \mid C = r_1, T = t) = P(\text{cause being masked} \mid C = r_2, T = s), \tag{3}$$

for all $r_1, r_2 \in \{1, \ldots, R\}$ and $0 < s, t < \infty$. Craiu and Reiser (2006), Mukhopadhyay (2006), Mukhopadhyay and Basu (2007), Basu (2009) and others have considered competing risks models without this symmetry assumption.

3. Model and posterior analysis

3.1. Model formulation

The subhazard for cause $C = r$ and cure group $Q = q$ is denoted by $h(C = r, t \mid Q = q) = h(r, t \mid q)$, $r = 1, \ldots, R$, $q = 0, 1$. We assume that cure is only considered for the $C = 1$ risk (the primary cancer under study) and hence subjects in the cured group ($Q = 1$) are not exposed to the $C = 1$ risk. In particular $h(C = 1, t \mid Q = 1) = 0$, $t \ge 0$. We assume that $h(r, t \mid q)$ is indexed by parameter $\theta_{qr}$. It is a common practice to model the hazard function as a piecewise constant function on a grid.
In such a case, we have

$$h(r, t \mid q) = \sum_{d=1}^{D} \lambda_{qrd}\, I_{[t_{d-1}, t_d)}(t),$$

where $0 = t_0 < t_1 < \ldots < t_D = \infty$ is a prespecified partition of $[0, \infty)$, $\lambda_{qrd}$ is the value of the hazard function in the interval $[t_{d-1}, t_d)$ and $\theta_{qr} = (\lambda_{qr1}, \ldots, \lambda_{qrD})$. In other cases, the parameter $\theta$ can be a location scale pair, $\theta = (\mu, \tau)$; this location scale family includes a large family of distributions that are traditionally used to model survival data (when the survival time is suitably transformed, say, to a log-survival time) and includes the exponential, Weibull, log-normal, log-logistic and Gumbel distributions. The gamma, Gompertz and three-parameter generalized gamma distributions (see Yu et al. (2004)), which do not have the location scale structure, can also be considered within this structure. We do not require the hazards to be the same for the various risks $r = 1, \ldots, R$, i.e. the hazard model can be different for the $R$ different risks.

The parameters in this model are $\theta_0 = (\theta_{0r}, r = 1, \ldots, R)$, $\theta_1 = (\theta_{1r}, r = 2, \ldots, R)$ and the cure fraction $c$. Let $p(\theta_0, \theta_1, c)$ be their joint prior density. Under prior independence, which we assume in the applications,

$$p(\theta_0, \theta_1, c) = p_c(c)\, p_{01}(\theta_{01}) \prod_{r=2}^{R} p_{0r}(\theta_{0r})\, p_{1r}(\theta_{1r}).$$
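The piecewise constant subhazard described above, together with its integral, can be sketched in a few lines. The function names, the cut points and the rates are hypothetical choices for illustration; the interior cut points correspond to $t_1, \ldots, t_{D-1}$, with $t_0 = 0$ and $t_D = \infty$ implied.

```python
import bisect
import math

def piecewise_hazard(t, cuts, rates):
    """Hazard value at time t: rates[d-1] applies on the interval [t_{d-1}, t_d).

    cuts holds the sorted interior cut points t_1 < ... < t_{D-1};
    rates holds the D interval-specific hazard values.
    """
    return rates[bisect.bisect_right(cuts, t)]

def piecewise_cumulative_hazard(t, cuts, rates):
    """Integral of the piecewise constant hazard from 0 to t."""
    edges = [0.0] + list(cuts) + [math.inf]
    total = 0.0
    for d, rate in enumerate(rates):
        lower, upper = edges[d], edges[d + 1]
        if t <= lower:
            break
        total += rate * (min(t, upper) - lower)
    return total

# Hypothetical grid in months and hazard values for one risk in one cure group.
cuts = [12.0, 60.0]
rates = [0.01, 0.02, 0.005]
print(piecewise_hazard(30.0, cuts, rates))
print(piecewise_cumulative_hazard(100.0, cuts, rates))
```

Because the intervals are closed on the left, a time equal to a cut point takes the hazard value of the interval that starts there.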

In the special case when the same functional form is assumed for the $R$ risks, the parameters have similar interpretations, and in such a case the same prior functional forms $p_0(\theta_{0r})$ and $p_1(\theta_{1r})$ can be utilized for all risks. As an example, suppose that the hazard from cause 1 (primary cancer) for the uncured patients is Weibull, i.e.

$$h(C = 1, t \mid Q = 0) = \tau_{01} \exp[\tau_{01} \{\log(t) - \mu_{01}\}], \tag{4}$$

where $\theta_{01} = (\mu_{01}, \tau_{01})$ and $\tau_{01}$ is the precision or inverse scale parameter for the log-Weibull (or smallest extreme value) distribution. Basu et al. (2003) showed that a conjugate prior for $\mu_{01}$ is given by a log-gamma$(\nu_{01}, \xi_{01})$ distribution. The precision $\tau_{01}$ may have an independent prior $p(\tau_{01})$. One can similarly specify the functional forms of the other cause-specific hazards and the corresponding priors. Finally, a beta$(\alpha, \beta)$ distribution provides a flexible as well as conditionally conjugate prior for the cure fraction $c$.

3.2. Posterior analysis

Bayesian analysis of this model is performed by Markov chain sampling, which now has a very extensive literature. Basu et al. (2003) utilized Markov chain sampling for Bayesian analysis of competing risks. We treat the binary cure status indicator $Q_i$ as a latent variable in the Markov chain sampling. The introduction of latent or auxiliary variables within Markov chain sampling is quite common and was studied in Tanner and Wong (1987), among others. As we noted earlier, $\{C_i = 1\}$ and $\{Q_i = 1\}$ are mutually exclusive, i.e. if the cause of death is risk 1 then the person cannot be cured (from risk 1). For those patients for whom the cause of death is not completely identified or recorded, we introduce a latent cause-of-death indicator $C_i$ taking values within the subset $S_i$ of possible causes. If the cause of death is known, then $C_i$ necessarily assumes that single value with probability 1. A similar data augmentation approach was explored in Basu et al.
(2003); Lu and Tsiatis (2001) used the EM method to impute the cause of failure within a non-Bayesian approach. With the introduction of these latent variables, for each patient, we have survival time $t_i$, cause $C_i$ (with $C_i = 0$ denoting right censoring) and cure status $Q_i$ (equal to 0 or 1), and the joint distribution of the observables and the unobservables (latent variables as well as parameters) is now given by

$$p(C, t, Q, \theta_0, \theta_1, c) = p(\theta_0, \theta_1, c) \prod_{i:Q_i=1} \left[ c\, h(C_i, t_i \mid Q_i = 1, \theta_{1C_i})^{I(C_i \ne 0)} \exp\left\{ -\sum_{r=2}^{R} H(r, t_i \mid Q_i = 1, \theta_{1r}) \right\} \right] \prod_{i:Q_i=0} \left[ (1 - c)\, h(C_i, t_i \mid Q_i = 0, \theta_{0C_i})^{I(C_i \ne 0)} \exp\left\{ -\sum_{r=1}^{R} H(r, t_i \mid Q_i = 0, \theta_{0r}) \right\} \right], \tag{5}$$

where $H(r, t \mid q) = \int_0^t h(r, s \mid q)\, ds$ denotes the cumulative subhazard. The posterior distribution of the parameters $\theta_0$, $\theta_1$ and $c$ as well as the full conditional distributions of the complete set or subset of the unobservables are proportional to the joint distribution in equation (5). In our applications, we have generally used a block Markov chain sampler that sampled $\theta = (\theta_0, \theta_1)$ in a single multivariate block via a random-walk Metropolis sampler, and then sampled the cure fraction $c$ and the latent variables from their full conditional distributions. In a rather special case, when the hazard (for a particular risk, and either for the cured or uncured case) follows a (log-)Weibull structure and the location precision pair $\theta = (\mu, \tau)$ have independent log-gamma and gamma priors respectively, the full conditional posterior of $\mu$ is

also log-gamma (being conditionally conjugate) and the conditional posterior of $\tau$ can be shown to be log-concave (which, then, may be sampled via the adaptive rejection sampler of Gilks and Wild (1992)). From equation (5), the full conditional distribution of the cure fraction $c$ is

$$p(c \mid t, Q, C, \theta) \propto p(\theta_0, \theta_1, c)\, c^{\#\{Q_i=1\}} (1 - c)^{\#\{Q_i=0\}}, \tag{6}$$

and, if $c$ is a priori independent of the other parameters with a beta$(\alpha, \beta)$ prior, then this full conditional distribution is simply beta$(\alpha + \#\{Q_i = 1\}, \beta + \#\{Q_i = 0\})$. The subject-specific cure status $Q_i$ is a binary variable (unless $C_i = 1$, in which case $Q_i$ is degenerate at 0) having the full conditional distribution

$$P(Q_i = q \mid t, Q_{-i}, C, \theta, c) \propto c^{q} (1 - c)^{1-q}\, p(C_i, t_i \mid Q_i = q), \qquad q = 0, 1. \tag{7}$$

Similarly, the cause $C_i$ can be imputed within each iteration of the Markov chain sampler for those subjects for whom the cause is not observed by sampling from its full conditional distribution

$$P(C_i = r \mid t, Q, C_{-i}, \theta, c) \propto p(C_i = r, t_i \mid Q_i), \qquad r \in S_i, \tag{8}$$

where $S_i$ is the subset of possible causes.

This posterior analysis is applied to breast cancer survival data from the SEER programme in Section 6 where we report estimates of cumulative probabilities of death. The cumulative probability of death from cause $C \in \{1, \ldots, R\}$ at time $t$ is defined as

$$P(C, T \le t \mid \theta) = \int_0^t p(C, s \mid \theta)\, ds = c \int_0^t h(C, s \mid Q = 1, \theta_{1C}) \exp\left\{ -\sum_{r=2}^{R} H(r, s \mid Q = 1, \theta_{1r}) \right\} ds + (1 - c) \int_0^t h(C, s \mid Q = 0, \theta_{0C}) \exp\left\{ -\sum_{r=1}^{R} H(r, s \mid Q = 0, \theta_{0r}) \right\} ds. \tag{9}$$

The overall cumulative probability of death is $P(T \le t) = \sum_{r=1}^{R} P(C = r, T \le t)$. The cause-specific cumulative probability $P(C, T \le t)$ is typically estimated by evaluating the integrals in equation (9) numerically in each iteration of the Markov chain sampler.

4. Modelling issues

The issue of identifiability has been a topic of extensive discussion in both cure rate and competing risks models.
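Returning briefly to the sampler of Section 3.2, the full conditional updates in equations (6) and (7) can be sketched as follows. This is a deliberately reduced illustration and not the sampler used in the paper: it assumes two risks, constant subhazards that are held fixed (so the random-walk Metropolis step for $\theta$ is omitted), no masked causes and a tiny made-up data set. All names and numerical values are hypothetical.

```python
import math
import random

def density(cause, t, cured, haz_uncured, haz_cured_other):
    """p(C = cause, t | Q) for two risks with constant subhazards.

    cause: 0 = right censored, 1 = cancer, 2 = other causes.
    Cured subjects have zero subhazard from cancer.
    """
    if cured:
        rates = {1: 0.0, 2: haz_cured_other}
    else:
        rates = dict(haz_uncured)
    total = rates[1] + rates[2]
    hazard_term = 1.0 if cause == 0 else rates[cause]
    return hazard_term * math.exp(-total * t)

def update_cure_status(causes, times, c, haz_uncured, haz_cured_other, rng):
    """Draw each Q_i from its full conditional, equation (7).

    When C_i = 1 the cured weight is 0, so Q_i = 0 with certainty.
    """
    status = []
    for cause, t in zip(causes, times):
        w1 = c * density(cause, t, True, haz_uncured, haz_cured_other)
        w0 = (1.0 - c) * density(cause, t, False, haz_uncured, haz_cured_other)
        status.append(1 if rng.random() < w1 / (w0 + w1) else 0)
    return status

def update_cure_fraction(status, alpha, beta, rng):
    """Draw c from beta(alpha + #{Q_i = 1}, beta + #{Q_i = 0}), equation (6)."""
    n_cured = sum(status)
    return rng.betavariate(alpha + n_cured, beta + len(status) - n_cured)

# Hypothetical data: causes and follow-up times in months.
causes = [1, 2, 0, 0, 2, 1]
times = [14.0, 80.0, 200.0, 30.0, 150.0, 45.0]
haz_uncured = {1: 0.01, 2: 0.003}   # assumed fixed subhazards, uncured group
haz_cured_other = 0.003             # assumed fixed subhazard, cured group

rng = random.Random(0)
c = 0.5
draws = []
for _ in range(500):
    status = update_cure_status(causes, times, c, haz_uncured, haz_cured_other, rng)
    c = update_cure_fraction(status, 1.0, 1.0, rng)
    draws.append(c)
print(sum(draws) / len(draws))
```

In a fuller implementation, each sweep would also update $\theta$ and impute any masked causes from equation (8) before moving to the next iteration.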
The question of almost non-identifiability (not identifiable within $\varepsilon$) of parametric mixture cure models has been examined in Yu et al. (2004); they, however, did not consider competing risks. Identifiability issues in competing risks have been extensively considered; see Tsiatis (1975) and Crowder (1994, 2001). The question of identifiability in our model arises from the fact that the cure group (cured or uncured) memberships are known only for those whose cause of death is risk 1 or the primary cancer (they are uncured); for all others, this group membership is latent. Identifiability is less of a formal problem within the Bayesian paradigm; however, it may cause various operational hurdles. In the Bayesian analysis of the breast cancer survival data, we have run parallel Markov chain samplers from various sets of initial values and have noticed occasional computational issues for some choices of initial values. One such case is illustrated in Fig. 3 where the log-likelihood trajectory stayed

stabilized and flat up to about 6000 iterations and then increased to a higher value. The cure fraction trajectory also stayed stabilized close to 0 up to 6000 iterations. It then increased rapidly and finally stabilized around ... In fact, other parameters (which are not shown here) also depicted similar behaviour. Such patterns strongly suggest the existence of multiple local modes in the posterior and hence in the likelihood.

Fig. 3. (a) Log-likelihood trajectory, (b) cure fraction trajectory, (c) log-likelihood contours, with the remaining parameters fixed at the median values of their Markov chain draws for iterations, and (d) log-likelihood contours, with the remaining parameters fixed at the median values of their Markov chain draws for iterations (axis labels: Log likelihood, Cure, Iterations, tauc)

To gain further insight, we have explored the likelihood surfaces; these are shown in Figs 3(c) and 3(d), which show the log-likelihood contours

against the precision parameter $\tau_C$ in the cure group and the cure fraction $c$. In Fig. 3(c), the remaining parameters are fixed at their respective median values obtained from iterations of the sampler whereas these parameters are fixed at their post-7000-iterations median values in Fig. 3(d). These likelihood contours clearly show that there is a local maximum of the likelihood near $c = 0$; however, the other maximal value near $c = 0.65$ is higher.

We have modelled the hazard $h(C = r, t \mid Q = 1)$ in the cure group independently of the hazard $h(C = r, t \mid Q = 0)$ in the uncured group, $r = 2, \ldots, R$, and the observed computational issue may possibly be an artefact of the identifiability issues of this model. In fact, non-identifiability in this model can be established in specific cases when the cause-specific hazards are modelled completely non-parametrically; this will be reported elsewhere. We have also considered some alternative, more parsimonious, model choices which may alleviate these concerns. One such alternative is what we call the equal hazards model, where we assume that the subhazards are the same in the cured and uncured groups, i.e.

$$h(C = r, t \mid Q = 1) = h(C = r, t \mid Q = 0) = h(r, t), \qquad r = 2, \ldots, R. \tag{10}$$

This model thus allows borrowing strength across cured and uncured groups and is expected to alleviate the identifiability concerns. We have moreover considered another alternative model that maintains the general structure but the joint prior $p(\theta, c)$ is constrained to have support only on

$$\left\{ (\theta, c) : \sum_{r=2}^{R} h(r, t \mid Q = 1, \theta) \le \sum_{r=1}^{R} h(r, t \mid Q = 0, \theta), \text{ for all } t \ge 0 \right\}, \tag{11}$$

i.e. we explicitly model the overall hazard in the cured group to be lower than the overall hazard in the uncured group. We call this the ordered hazards model. This is similar to the idea of imposing order constraints in a mixture model to bypass the label switching problem.
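A constraint of the form (11) involves all $t \ge 0$, so in practice it can only be verified approximately, for example on a finite grid of time points. The sketch below shows one such check. The hazard functions, the grid and the function name are hypothetical, and a check on a grid cannot rule out violations between or beyond the grid points.

```python
import math

def satisfies_ordered_hazards(cured_hazards, uncured_hazards, grid):
    """Check expression (11) at every time point of a finite grid.

    cured_hazards: callables t -> h(r, t | Q = 1) for r = 2, ..., R.
    uncured_hazards: callables t -> h(r, t | Q = 0) for r = 1, ..., R.
    """
    return all(
        sum(h(t) for h in cured_hazards) <= sum(h(t) for h in uncured_hazards)
        for t in grid
    )

# Hypothetical subhazards (per month) on a yearly grid over 50 years.
grid = range(0, 601, 12)
cancer_uncured = lambda t: 0.006 * math.exp(-t / 100.0)
other_uncured = lambda t: 0.003
other_cured_equal = lambda t: 0.003                   # equal hazards case, as in (10)
other_cured_rising = lambda t: 0.004 + 0.00001 * t    # eventually exceeds the uncured total

print(satisfies_ordered_hazards([other_cured_equal], [cancer_uncured, other_uncured], grid))
print(satisfies_ordered_hazards([other_cured_rising], [cancer_uncured, other_uncured], grid))
```

Within a Metropolis sampler, a proposed parameter value that fails such a check has zero prior density under the constrained prior and would therefore be rejected.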
Note that, under the equal hazards model (10), the hazard ordering of expression (11) is immediately satisfied as the overall hazard in the uncured group then equals $\sum_{r=1}^{R} h(r, t)$ whereas the hazard in the cured group is simply $\sum_{r=2}^{R} h(r, t)$. Another possibility here is to consider ordering in terms of survival functions, i.e. $S(t \mid Q = 1, \theta) \ge S(t \mid Q = 0, \theta)$, thus resulting in a bigger support set since the hazard ordering implies the survival ordering. Ordering of subsurvival functions in competing risks contexts has been considered in El Barmi et al. (2006) and Sun and Tiwari (1997) among others; here, though, we consider ordering between the hazard function choices for the cured and uncured groups.

We did not notice any computational issues in the Markov chain samplers for either the equal hazards or the ordered hazards model. The ordered hazards model is implemented by checking the constraint in expression (11) on a finite grid of the time axis. The survival estimates (posterior means) from these three models, namely the general, equal hazards and the ordered hazards model, are shown in Fig. 4 where we report the overall survival estimate as well as the estimates of survival in the latent cured and uncured groups respectively. All three models provide almost identical survival estimates in Fig. 4; we notice only minor differences in the cured group estimates.

The equal hazards model assumes that subhazards from other causes are the same in the cured and uncured groups as in equation (10), and it is of interest to examine the implication of this assumption. In the case of two risks (i.e. $R = 2$), one can argue that the subhazard $h(C = 2, t \mid Q = 1)$ in the cured group is the same as the marginal hazard $h_2(t)$, since the cured patients are exposed only to this hazard. In such a case, equation (10) is equivalent to equality of marginal hazard and subhazard $h(2, t) = h_2(t)$, but only for cause $C = 2$. The equality of subhazards and marginal hazards

for all causes is known as the Makeham assumption (Gail, 1975); we, however, emphasize that it may apply only to the risk from cause 2 in our case. Even the Makeham assumption only implies independence on the diagonal, i.e. $P(T_{\{C=1\}} > t_1, T_{\{C=2\}} > t_2) = P(T_{\{C=1\}} > t_1)\, P(T_{\{C=2\}} > t_2)$ if $t_1 = t_2$ (Crowder (2001), pages 49-50); a counterexample to complete independence was provided in Williams and Lagakos (1977).

Fig. 4. Estimates of overall survival, survival in the cured group $S_C(t)$ and survival in the uncured group $S_U(t)$ from the three models: general, equal hazards and ordered (curve labels: Cured, Combined, Uncured; axis label: Survival)

5. Model comparison by using Bayes factors

The choice between the general, equal hazards and the ordered models raises the question of model selection. The Bayes factor provides a formal Bayesian model selection criterion. For a specific model $M$, if we use $p(D \mid \phi)$ to denote the likelihood of the observed data $D$ given the unobservables collectively denoted as $\phi$ and use $p(\phi)$ to denote the (joint) prior distribution of all unobservables, then the marginal likelihood of model $M$ is given by

$$p(D \mid M) = \int p(D \mid \phi)\, dp(\phi). \tag{12}$$

The Bayes factor for model $M_1$ against model $M_2$ is the ratio of the marginal likelihoods, or in the log-scale $\log(BF_{12}) = \log\{p(D \mid M_1)\} - \log\{p(D \mid M_2)\}$. Kass and Raftery (1995) stated that the Bayes factor is a summary of evidence for model $M_1$ against model $M_2$ and provided the cut-offs in Table 2 for interpreting $\log(BF_{12})$.

Table 2. Interpretation of the Bayes factor (Kass and Raftery, 1995)

$\log(BF_{12})$    Evidence against model $M_2$
0-1                Bare mention
1-3                Positive
3-5                Strong
> 5                Very strong

We have used Markov chain sampling to draw samples from the joint posterior of the three models. However, since the marginal likelihood in equation (12) is not a direct function of the

12 318 S. Basu and R. C. Tiwari posterior, its estimation is not immediate from the available Markov chain samples and has been a topic of extensive research in recent years (see Kass and Raftery (1995), DiCiccio et al. (1997) and Basu and Chib (2003)). The marginal likelihood can be expressed as the harmonic (posterior) mean of the likelihood, but the resulting estimator is known to be unstable owing to the occasional draws in the Markov chain sampler which have low likelihood values. We consider the volume-corrected harmonic mean estimator that was proposed in DiCiccio et al. (1997), which restricts the Monte Carlo averaging in a high posterior density region. In fact, we used a combination of the marginal likelihood estimate that was proposed in Chib (1995) with the volume-restricted harmonic estimator. For a beta prior, the conditional posterior of the cure fraction c is available in closed form (owing to conditional conjugacy). This allows us to estimate the posterior ordinate p.c Å D, M/ at an appropriately chosen value c = c Å (with high posterior ordinate) by the Monte Carlo average of the full conditional posterior p.c Å D, C, Q, θ/ over the Markov chain samples. For this fixed c = c Å, we next estimate the conditional marginal likelihood p.d c Å, M/. Since the likelihood can be analytically integrated over the latent cure status and latent cause (for masked causes), evaluation of p.d c Å, M/ requires only integration over θ in equation (12) for fixed c = c Å. We estimate p.d c Å, M/ by using the volume-restricted harmonic mean estimator. The log-marginal likelihood log{p.d M/} is finally estimated by using the basic marginal identity (Chib, 1995) as log{p.d M/} = log{p.d c Å, M/} + log{p.c Å M/} log{p.c Å D, M/}.13/ where p.c Å M/ is simply the prior ordinate. We selected this estimator for marginal likelihood computation because it is simple, does not require extensive additional coding and did not display instability in the examples that we considered. 
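The basic marginal identity can be checked on a toy problem in which every term is available in closed form. The following is a minimal sketch, assuming a Bernoulli likelihood with a beta prior on a single probability; this toy model, the function names and the data values are illustrative assumptions and are not the mixture cure competing risks model, for which the posterior ordinate and the conditional marginal likelihood have to be estimated from Markov chain samples as described above.

```python
import math


def log_beta_fn(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_beta_pdf(x, a, b):
    """Log-density of a Beta(a, b) distribution at x in (0, 1)."""
    return (a - 1.0) * math.log(x) + (b - 1.0) * math.log(1.0 - x) - log_beta_fn(a, b)


def log_marginal_identity(s, n, a, b, c_star):
    """log p(D) = log p(D | c*) + log p(c*) - log p(c* | D).

    Toy setting (an assumption): D is one ordered sequence of n Bernoulli
    trials with s successes, and c has a Beta(a, b) prior, so the posterior
    is Beta(a + s, b + n - s) by conjugacy.
    """
    log_lik = s * math.log(c_star) + (n - s) * math.log(1.0 - c_star)
    log_prior = log_beta_pdf(c_star, a, b)
    log_post = log_beta_pdf(c_star, a + s, b + n - s)
    return log_lik + log_prior - log_post


def log_marginal_exact(s, n, a, b):
    """Closed-form log-marginal likelihood of the same toy model."""
    return log_beta_fn(a + s, b + n - s) - log_beta_fn(a, b)


if __name__ == "__main__":
    # Hypothetical data: 7 successes in 20 trials, uniform prior.
    for c_star in (0.2, 0.35, 0.9):
        print(c_star, log_marginal_identity(7, 20, 1.0, 1.0, c_star),
              log_marginal_exact(7, 20, 1.0, 1.0))
```

In this toy case the identity holds exactly at every value of $c^*$; in practice a point with high posterior ordinate is preferred because the ordinate is then estimated more accurately from the sampler output.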
6. Application to breast cancer survival data

6.1. Breast cancer

Breast cancer has the highest incidence among all cancers in women. The American Cancer Society estimated that women would be diagnosed with and women would die of cancer of the breast in The age-adjusted rate of incidence of breast cancer, based on cases diagnosed in from the 17 SEER geographic areas, was per women per year with whites having a higher rate of incidence (134 per ) compared with blacks (118 per ). On the basis of rates from , 12.67% or about 1 in 8 women born today will be diagnosed with cancer of the breast during their lifetime. Breast cancer incidence rates have displayed a significant decrease in recent years (Ravdin et al., 2007; Glass et al., 2007), which is often attributed to the reduced use of hormone replacement therapy following the publication of results from the Women's Health Initiative.

Among the breast cancer cases in the 17 SEER registries (Table 1), we identified cases with available survival records with in situ or invasive breast cancer diagnosed between January 1st, 1973, and December 31st, We included only the first diagnosis of breast cancer that was recorded by the SEER programme for each patient, and the survival time is measured from the time of diagnosis. The analysis here considers the competing risks from breast cancer and other causes. The next section subdivides the other causes into other cancers and remaining causes and describes analysis with three competing risks.

We have used independent normal and gamma priors with large variances for the location and precision parameters and a uniform(0,1) prior for the cure fraction $c$. We found that the results are not sensitive to the choice of priors. The results that we report in what follows are typically based on iterations after a burn-in of of the Markov chain sampler.
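The uniform(0,1) prior for the cure fraction is the Beta(1,1) distribution, so the update of $c$ within a Markov chain sampler can be a direct beta draw. The sketch below assumes that the latent cure indicators are, a priori, independent Bernoulli variables with $P(Q_i = 1) = c$ and that $Q_i = 1$ denotes a cured case; the function names and the example indicator vector are hypothetical, and the remaining steps of the sampler (updates of the latent indicators and of $\theta$) are not shown.

```python
import random


def cure_fraction_full_conditional(q, a=1.0, b=1.0):
    """Parameters of the beta full conditional of the cure fraction c.

    q is a sequence of latent cure indicators (1 = cured, 0 = uncured).
    Under a Beta(a, b) prior and Q_i ~ Bernoulli(c), the full conditional
    is Beta(a + number cured, b + number uncured).
    """
    n_cured = sum(q)
    n_uncured = len(q) - n_cured
    return a + n_cured, b + n_uncured


def draw_cure_fraction(q, a=1.0, b=1.0, rng=random):
    """One Gibbs-type draw of c from its beta full conditional."""
    a_post, b_post = cure_fraction_full_conditional(q, a, b)
    return rng.betavariate(a_post, b_post)


if __name__ == "__main__":
    # Hypothetical current state of the latent indicators for five cases.
    q_current = [1, 0, 0, 1, 1]
    print(cure_fraction_full_conditional(q_current))
    print(draw_cure_fraction(q_current, rng=random.Random(2010)))
```

The same beta density, evaluated at a fixed value $c^*$ and averaged over the sampled indicator vectors, gives a Monte Carlo estimate of a posterior ordinate for $c$.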

We first compare the general, equal hazards and ordered hazards models in these data by using the Bayes-factor-based model comparison that was described in Section 5. The estimated log-marginal likelihoods for the three models are shown in Table 3. We also estimate the marginal likelihood of the model without a cured group; in this case, of course, the general, equal hazards and ordered hazards models are identical. We note in Table 3 that (a) the cure models are decisively preferred and (b) the general cure model is strongly preferred in these data in comparison with the equal hazards model according to the Bayes factor criterion. The evidence for the general model over the ordered hazards model is also very strong according to Table 2. On the basis of the results in Table 3, we have used the general model for the data analyses that are described in what follows.

Table 3. Estimated log-marginal likelihoods for breast cancer survival data from the SEER programme. Columns: model; cure estimate; log-marginal likelihood and log(BF) for the cure model; log-marginal likelihood for the non-cure model. Rows: general; ordered; equal hazards. The reported Bayes factor values BF are of the general cure model against the comparator cure model.

Fig. 5 shows the estimates of the overall cumulative probability of death, as well as those from breast cancer and other causes. The point estimates reported are the estimated posterior means of the numerical evaluations of the integrals in equation (9). We note in Fig. 5 that breast cancer is the predominant cause of death up to 180 months (15 years) since diagnosis, at which point the cumulative probability of death from other causes crosses over and other causes become the dominant cause of death. This may be partially attributed to the increasing effect of other causes that are associated with old age. We also note that the 95% pointwise credible bands are almost indistinguishable from the cumulative probability estimates; this is due to the extensive amount of information that is available in the data.

Fig. 5. Estimates of the cumulative probability of death: overall, $P(T \le t)$; from breast cancer, $P(T \le t, \mathrm{CA})$; and from other causes, $P(T \le t, \mathrm{OT})$ (also shown are pointwise 95% credible bands, which are almost indistinguishable from the estimates)

Fig. 6 compares the cumulative probability estimates that were obtained from four parametric models. We also include the cumulative probability of death estimates that were obtained from the CANSURV software. The CANSURV results in Fig. 6 are obtained from cause-specific survival data from cancer (where causes of death other than cancer are recorded as censored).

Fig. 6. Estimates (and pointwise 95% credible bands) of the cumulative probability of death from breast cancer and other causes, and the estimate of the cumulative probability of death obtained from the CANSURV software: (a) Weibull hazard; (b) log-normal hazard; (c) log-logistic hazard; (d) gamma hazard
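Recording deaths from other causes as censored targets a different quantity from the cumulative probability of death from cancer in the presence of competing risks. The following is a toy sketch, assuming constant cause-specific hazards for two causes; the hazard rates are hypothetical, and the code reproduces neither CANSURV nor the mixture cure competing risks model. It only shows that the probability obtained by ignoring the competing cause is never smaller than the probability that accounts for it.

```python
import math


def crude_probability(t, lam_cancer, lam_other):
    """P(T <= t, cause = cancer) when both causes act simultaneously.

    With constant cause-specific hazards this has the closed form
    lam_cancer / (lam_cancer + lam_other) * (1 - exp(-(lam_cancer + lam_other) t)).
    """
    total = lam_cancer + lam_other
    return lam_cancer / total * (1.0 - math.exp(-total * t))


def net_probability(t, lam_cancer):
    """1 - exp(-lam_cancer t): what a cause-specific survival curve
    estimates when deaths from other causes are treated as censored."""
    return 1.0 - math.exp(-lam_cancer * t)


if __name__ == "__main__":
    lam_cancer, lam_other = 0.004, 0.003  # hypothetical rates per month
    for months in (60, 120, 180, 240):
        print(months,
              round(net_probability(months, lam_cancer), 4),
              round(crude_probability(months, lam_cancer, lam_other), 4))
```

The gap between the two quantities widens with follow-up time and with the size of the competing hazard, which is why the choice between them matters most for long-term estimates.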

Also, the CANSURV estimate is missing in Fig. 6(d) since the gamma hazard is not currently available in CANSURV. We note in Fig. 6 that all four parametric models provide quite similar estimates, thus illustrating the relative insensitivity of these survival estimates to the choice of the model. We further note that the CANSURV estimates of the cumulative probability of death from cancer, which do not utilize a competing risks structure, are, in general, substantially higher than the estimates that are provided by our model using the combined competing risks and mixture cure structure. This is clearly important since estimates from CANSURV may be used in making public policy decisions.

The insensitivity of estimates to the choice of the parametric model, however, is not maintained once we consider parameters that are specific to the cured and uncured components. Table 4 lists the cure fraction estimates from the four parametric models. We note that these estimates strongly depend on the choice of the model. This phenomenon is, in fact, well documented in the literature (see Yu et al. (2004)). In general, the cure fraction estimate depends on the right-hand tail of the survival distribution, where the parametric models may differ in their tail behaviour. We consider a formal Bayesian model comparison among these four parametric models by using Bayes factors and, for each, we compare the mixture cure model with a competing risks model without a cure fraction or a cured group. The estimated log-marginal likelihoods of the four parametric models are shown in Table 4. We note in Table 4 that the log-normal model has a rather different cure fraction estimate and Bayes factor value. The log-normal distribution is known to have a different tail behaviour from those of the other distributions. The log-normal model also has the lowest marginal likelihood value among the cure models, with the log-logistic model having the highest value.

Table 4. Estimated log-marginal likelihoods for breast cancer survival data from the SEER programme. Columns: model; cure estimate; log-marginal likelihood for the cure model; log-marginal likelihood for the non-cure model; log(BF). Rows: Weibull; log-normal; log-logistic; gamma. The Bayes factor values BF compare the cure and non-cure models in the same row.

6.2. Breast cancer survival: three competing risks

We revisit the survival data for all the cases of breast cancer in the SEER data and consider the competing risks from breast cancer, other cancers and all remaining causes; the last group includes all risks that are not included in the first two. Fig. 7 shows the estimates of the cumulative probabilities of death from these three competing risks (the cumulative probability from each risk is defined similarly to equation (9) and estimated by numerical integration and Monte Carlo averages over the Markov chain samples). These estimates are presented here in a different way: we have used the cumulative probabilities of death from breast cancer, other cancers and other remaining causes, and the probability of survival, to partition the probability interval [0,1] and plotted the estimated trajectories of this partition over the duration of the follow-up.

Fig. 7. Estimates of survival and cumulative probabilities of death (breast cancer, other cancers and other causes) from the three-competing-risks model

We note as before that breast cancer is the predominant cause of death in the earlier part of the follow-up, whereas other causes dominate in the latter half of the follow-up. However, the risk from other causes, which now excludes the risks from other cancers, becomes dominant only after about 243 months (about 20 years) of follow-up since diagnosis. In comparison, in the two-competing-risks analysis in Fig. 5, the risk of death from other causes (which also included the risks from other cancers) became the dominant cause of death after only 180 months (15 years).

6.3. Breast cancer: subgroup analyses

For female breast cancer, many important risk factors are well known. Age is considered to be the most important factor affecting risk of breast cancer. Risk is also affected by ethnicity, family history, inherited genetic mutations in the BRCA1 and BRCA2 breast cancer genes, high breast tissue density, never having children, having one's first child after the age of 30 years and many other factors. Here we consider subgroup analyses based on three such factors, namely age at diagnosis, race and stage of disease. These subgroup analyses are similar in spirit to those reported in Schairer et al. (2004), Ravdin et al. (2007) and Yu et al. (2004). We categorized the breast cancer cases in the SEER programme according to race (Asian or Pacific islander, black or white), age at diagnosis (less than 50, 50–59, 60–69, or 70 years or over) and the SEER historic stage. We consider localized, regional and distant stages as recorded in the SEER programme.
Localized disease pertains to malignant tumour confined to the breast tissue and fat. Regional disease pertains to tumour that has extended beyond the limits of the breast directly into surrounding organs or tissues, or tumour that involves or has extension into regional lymph nodes. Distant disease refers to malignant neoplasms that have spread to parts of the body that are remote from the breast either by direct extension or by discontinuous metastasis, or via the lymphatic system to distant lymph nodes. This staging scheme is available for the entire duration of

Fig. 8 shows the overall survival and cumulative probabilities of death for the Asian or Pacific islander, black and white breast cancer cases. These results are based on Asian or Pacific islanders, black and white cases in the SEER programme. We note that, whereas the black cases have relatively lower overall survival throughout the follow-up, the overall survival for blacks and whites becomes almost the same towards the end of the follow-up, whereas the Asians and Pacific islanders have a uniformly higher survival rate. As in the overall population (Fig. 5), the cumulative probability of death from other causes crosses over and becomes the dominant risk at 142 months in white patients. For the Asians and Pacific islanders, this crossover happens later, at 237 months. The Asian and Pacific islander breast cancer patients also have relatively lower probabilities of death from both cancer and other causes compared with the white and black cases. The same trend is seen in the cure fraction estimates in Table 5. In black patients, however, the cumulative probability of death from cancer remains the dominant risk for the majority of the follow-up period and the risk from other causes crosses over only after 304 months. White patients are estimated to have, in general, a lower risk of death from breast cancer (in comparison with black patients) but a relatively higher risk of death from other causes.

Fig. 8. Estimates of cumulative probabilities of death for various ethnic groups and pointwise 95% credible bands: (a) overall survival probability (Asian, black and white); (b) Asians or Pacific islanders ($P(T < t, \text{cancer})$ and $P(T < t, \text{other})$); (c) blacks ($P(T < t, \text{cancer})$ and $P(T < t, \text{other})$); (d) whites ($P(T < t, \text{cancer})$ and $P(T < t, \text{other})$)

Table 5. Cure fraction estimates. Columns by stage: localized; regional; distant. Columns by race: Asian or Pacific islander; black; white.

Fig. 9 shows the estimated overall survival and cumulative probabilities for patients with localized, regional and distant stages of breast cancer at the time of diagnosis. This is based on localized cases, regional cases and distant cases whose survival records are available in the SEER programme. The survival estimate for patients with distant stage cancer shows a sharp drop in survival in the beginning part of follow-up; the rates of decrease for the localized and regional stages are more gradual. The cumulative probability of death from other causes completely dominates for the localized stage, whereas, on the opposite side of the spectrum, the probability of death from cancer rises sharply and completely dominates in the distant stage. In the regional stage also, we note that the risk of death from cancer rises sharply in the first half of follow-up and again completely dominates over the risk of other causes. The trend of increased risk for the regional cases and substantially increased risk for the distant cases is also seen in the cure fraction estimates in Table 5.

Fig. 10 shows the estimated cumulative probabilities of death from breast cancer and other causes for black and white breast cancer patients according to age at diagnosis subgroups (under 50, 50–59, 60–69, and 70 years and over) and SEER historic stage (localized, regional and distant disease). A similar analysis was reported in Schairer et al. (2004), who considered competing risks but not a mixture cure model and used non-Bayesian analysis. Among both black and white patients with localized disease, death from breast cancer is the predominant cause of death over almost the entire study period for women who were diagnosed before age 50 years. For those who were diagnosed between ages 50 and 59 years, breast cancer is the predominant cause of death for almost half of the follow-up duration. For those who were diagnosed at or after age 60 years, the probability of death from other causes became increasingly important. For those with regional disease (both white and black) who were diagnosed before age 60 years, breast cancer is the dominant cause of death for the entire follow-up, whereas it is the dominant cause for almost the complete follow-up period for those who were diagnosed at ages 60–69 years. Among women who were diagnosed at ages 70 years and over, the risks of death from breast cancer and other causes are similar in the beginning years, after which time the risk from other causes dominates in both black patients and white patients. Among both black and white patients with distant disease, mortality from breast cancer is the overwhelming cause of death regardless of age at diagnosis. We also note that, in both regional and distant stages and all four age groups, breast cancer mortality sharply increases in the beginning years after diagnosis and then gradually becomes flat.
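The flattening of the cancer curve and the crossover of the two cumulative probabilities can be reproduced qualitatively in a small numerical sketch. The code below assumes a simplified two-cause structure in which a fraction $c$ of cases is cured and exposed only to the other-cause hazard, the uncured cases are exposed to both hazards, and both hazards are Weibull; the parameter values, the function names and the trapezoidal integration grid are illustrative assumptions and are not the fitted model or its estimates.

```python
import math


def weibull_hazard(t, shape, scale):
    """Weibull hazard (shape / scale) * (t / scale) ** (shape - 1)."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)


def weibull_cumhaz(t, shape, scale):
    """Weibull cumulative hazard (t / scale) ** shape."""
    return (t / scale) ** shape


def cumulative_probabilities(c, cancer, other, t_max=360.0, n_steps=3600):
    """Survival and cumulative probabilities of death on a time grid.

    c      : cure fraction (cured cases face only the other-cause hazard)
    cancer : (shape, scale) of the cancer hazard in the uncured group
    other  : (shape, scale) of the other-cause hazard in both groups
    Returns (times, survival, p_cancer, p_other).
    """
    dt = t_max / n_steps
    times = [i * dt for i in range(n_steps + 1)]
    s_unc = [math.exp(-weibull_cumhaz(t, *cancer) - weibull_cumhaz(t, *other))
             for t in times]
    s_cur = [math.exp(-weibull_cumhaz(t, *other)) for t in times]
    # Subdensities h(r, t) * S_U(t) in the uncured group, integrated by
    # the trapezoidal rule.
    g_cancer = [weibull_hazard(t, *cancer) * s for t, s in zip(times, s_unc)]
    g_other = [weibull_hazard(t, *other) * s for t, s in zip(times, s_unc)]
    f_cancer, f_other = [0.0], [0.0]
    for i in range(1, n_steps + 1):
        f_cancer.append(f_cancer[-1] + 0.5 * dt * (g_cancer[i - 1] + g_cancer[i]))
        f_other.append(f_other[-1] + 0.5 * dt * (g_other[i - 1] + g_other[i]))
    survival = [c * sc + (1.0 - c) * su for sc, su in zip(s_cur, s_unc)]
    p_cancer = [(1.0 - c) * f for f in f_cancer]
    p_other = [c * (1.0 - sc) + (1.0 - c) * f for sc, f in zip(s_cur, f_other)]
    return times, survival, p_cancer, p_other


def crossover_time(times, p_cancer, p_other):
    """First positive grid time at which p_other >= p_cancer, else None."""
    for t, pc, po in zip(times[1:], p_cancer[1:], p_other[1:]):
        if po >= pc:
            return t
    return None


if __name__ == "__main__":
    # Hypothetical parameters; time is in months.
    times, surv, pc, po = cumulative_probabilities(0.5, (1.2, 150.0), (2.0, 400.0))
    print("partition check:", surv[-1] + pc[-1] + po[-1])
    print("crossover (months):", crossover_time(times, pc, po))
```

Under these assumptions the cumulative probability of death from cancer is bounded above by $1 - c$, so it levels off, while the other-cause curve keeps rising; whether and when the curves cross within the follow-up window depends on the parameter values.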


More information

Computerized Mastery Testing

Computerized Mastery Testing Computerized Mastery Testing With Nonequivalent Testlets Kathleen Sheehan and Charles Lewis Educational Testing Service A procedure for determining the effect of testlet nonequivalence on the operating

More information

Epidemiological Model of HIV/AIDS with Demographic Consequences

Epidemiological Model of HIV/AIDS with Demographic Consequences Advances in Applied Mathematical Biosciences. ISSN 2248-9983 Volume 5, Number 1 (2014), pp. 65-74 International Research Publication House http://www.irphouse.com Epidemiological Model of HIV/AIDS with

More information
