Supplementary Information for Avoidable deaths and random variation in patients' survival

by K Seppä, T Hakulinen and E Läärä

Supplementary Appendix

Relative survival and excess hazard of death

The relative survival ratio $S_R(t) = S(t)/S^*(t)$, that is, the ratio of the observed survival proportion $S(t)$ of the patients to the expected survival $S^*(t)$, the latter being derived from a comparable reference population at time $t$ from diagnosis, was used to measure the net survival in the five cancer control regions, i.e. the hypothetical survival in the absence of other causes of death. The observed and expected hazard functions $\lambda(t)$ and $\lambda^*(t)$ are the rates of death per unit of time at time $t$ in the patients and in the reference population, respectively. The excess hazard function of death due to colon cancer, $\gamma(t) = \lambda(t) - \lambda^*(t)$, is the excess rate of death of the patients as compared with the death rate in the reference population. The functions $\lambda^*(t)$ and $\gamma(t)$ were derived from the known relations between the survival and hazard functions:

$$S^*(t) = \exp\left\{-\int_0^t \lambda^*(u)\,du\right\} \quad\text{and}\quad S_R(t) = \exp\left\{-\int_0^t \gamma(u)\,du\right\}.$$

The expected hazard of death for a patient $i$ in follow-up interval $j$ was determined by region $r_i$ ($r = 1, \ldots, 5$), sex $s_i$ ($s = 0$ for males and $s = 1$ for females), calendar year $v_{ij}$ ($v = 2000, \ldots, 2009$) and age $a_{ij}$ in years ($a = 0, \ldots, 99$), i.e. $\lambda^*_{ij} = \lambda^*_{r_i s_i v_{ij} a_{ij}}$. The expected hazard $\lambda^*_{rsva}$ was estimated by dividing the pertinent number of deaths by the mid-year population count in each stratum of region, sex, calendar year and age. Region-specific population counts and deaths were obtained from Statistics Finland for each calendar year from 2000 to 2009 (Statistics Finland 2011). Because the numbers of deaths by cancer control region were only available in 5-year age groups, the regional hazard of death was assumed to be proportional to the hazard of death in the whole country within the 5-year age groups in order to estimate the region-specific hazards in 1-year age groups.
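The relations between hazards and survival can be illustrated with a minimal Python sketch, assuming hazards that are constant within each follow-up interval. The function names and all numbers are hypothetical and chosen only for illustration; the authors report using R, not Python.

```python
import math

def expected_hazard(deaths, midyear_population):
    """Stratum-specific expected hazard: deaths per person-year at risk."""
    return deaths / midyear_population

def survival_from_hazards(hazards, lengths):
    """Survival exp(-integral of the hazard) for piecewise-constant hazards."""
    return math.exp(-sum(h * d for h, d in zip(hazards, lengths)))

# Hypothetical hazards (per year) over five one-year follow-up intervals.
expected = [0.020, 0.022, 0.024, 0.026, 0.028]   # lambda*_j
excess = [0.20, 0.10, 0.06, 0.04, 0.03]          # gamma_j
lengths = [1.0] * 5

# Observed hazard is the sum of the expected and the excess hazard.
observed = [e + g for e, g in zip(expected, excess)]

s_obs = survival_from_hazards(observed, lengths)  # S(t)
s_exp = survival_from_hazards(expected, lengths)  # S*(t)
s_rel = survival_from_hazards(excess, lengths)    # S_R(t)

print(f"S = {s_obs:.4f}, S* = {s_exp:.4f}, S/S* = {s_obs / s_exp:.4f}, S_R = {s_rel:.4f}")
```

Because the observed hazard is additive in its two components, the ratio `s_obs / s_exp` coincides with `s_rel` up to rounding error.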
The expected survival of a group of patients was estimated using the Ederer II method (Ederer and Heise 1959; Hakulinen et al, 2011). Traditional direct standardisation by age and sex was used to compare the 5-year relative survival ratios and the excess and expected hazards of death, as some differences exist in the age and sex structures of the patients across the regions (Pokhrel and Hakulinen, 2008). The age-standardisation was based on five age groups: 0–44, 45–54, 55–64, 65–74 and 75–89 years at diagnosis, and the age and sex structure of all patients diagnosed in 2000–2007 was used as the standard.

The excess hazard of death $\gamma_{ij}$ for patient $i$ in follow-up interval $j$ was modelled as a multiplicative function of covariates $z_{il}$ ($l = 1, \ldots, b$), i.e.

$$\gamma_{ij} = \exp\{z_{i1}\beta_1 + \cdots + z_{ib}\beta_b\},$$

where regression coefficient $\beta_l$ is interpreted as the additive effect of covariate $z_{il}$ on the logarithm of the excess hazard. The model included sex, cancer control region (5 levels), age group (the same five categories as in the age-standardisation) and follow-up time (0–3 months, 4–12 months, and four annual intervals from 1 to 5 years) as categorical covariates. Interaction terms between age and follow-up time were included to allow non-proportional excess hazards by the age groups. In addition, interaction terms between age and sex were included in the model. This relative survival regression model was fitted in the framework of generalized linear models using exact survival times and individual subject-band observations (Dickman et al, 2004). This implies a Poisson distribution for the indicator of death $d_{ij}$ of patient $i$ in interval $j$ with link function $\ln(\mu_{ij} - \lambda^*_{ij} y_{ij})$ and offset $\ln(y_{ij})$, where $y_{ij}$ is the time at risk of patient $i$ in interval $j$ and $\mu_{ij} = (\lambda^*_{ij} + \gamma_{ij})\,y_{ij}$ is the expected value for the death indicator $d_{ij}$.

Numbers of deaths from cancer and from other causes

In order to estimate the numbers of deaths from the target cancer and from all the other causes, respectively, the crude probabilities of death due to the cancer and to other causes were obtained using the theory of competing risks (Chiang 1968, p. 245).
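As an aside on the relative survival regression model of the preceding section, the Poisson formulation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout `(d, y, lam_star, z)`, the function names and the numbers are hypothetical, and the sketch evaluates the log-likelihood and the link rather than fitting the model (the authors fitted it with `glm` in R).

```python
import math

def excess_hazard(z, beta):
    """gamma = exp(z_1*beta_1 + ... + z_b*beta_b)."""
    return math.exp(sum(zl * bl for zl, bl in zip(z, beta)))

def poisson_loglik(records, beta):
    """Poisson log-likelihood of the death indicators.

    Each record is (d, y, lam_star, z): death indicator, time at risk,
    expected hazard and covariate vector of one subject-band.
    """
    total = 0.0
    for d, y, lam_star, z in records:
        mu = (lam_star + excess_hazard(z, beta)) * y  # expected value of d
        total += d * math.log(mu) - mu                # ln(d!) = 0 for d in {0, 1}
    return total

def linear_predictor_from_mu(mu, y, lam_star):
    """Invert the link: ln(mu - lam_star*y) minus the offset ln(y) gives z'beta."""
    return math.log(mu - lam_star * y) - math.log(y)

# Hypothetical subject-bands and coefficients.
beta = (-1.5, 0.3, 0.2)
records = [
    (0, 0.25, 0.020, (1.0, 0.0, 1.0)),
    (1, 0.40, 0.035, (1.0, 1.0, 0.0)),
    (0, 1.00, 0.050, (1.0, 1.0, 1.0)),
]
loglik = poisson_loglik(records, beta)
```

Subtracting the expected number of deaths `lam_star * y` inside the logarithm is what makes the linear predictor describe the excess hazard only; `linear_predictor_from_mu` recovers the sum of `z*beta` terms from `mu` for that reason.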
The crude conditional probability that a patient alive at $x_j$ will die from cancer in interval $j$ is

$$q_j^c = \int_{x_j}^{x_{j+1}} \exp\left\{-\int_{x_j}^{t} \lambda(u)\,du\right\} \gamma(t)\,dt,$$

where the observed total hazard of death $\lambda(t)$ is expressed as the sum of the expected and the excess hazard: $\lambda(t) = \lambda^*(t) + \gamma(t)$. If the hazards are assumed to be constant within the intervals, the probability of dying from cancer in interval $j$ can be written as

$$q_j^c = \frac{\gamma_j}{\lambda^*_j + \gamma_j}\, q_j, \qquad (1)$$

where $\lambda^*_j$ and $\gamma_j$ are the interval-specific expected and excess hazards, respectively, and $q_j = 1 - p_j = 1 - \exp\{-(\lambda^*_j + \gamma_j)\Delta_j\}$ is the conditional probability of death in interval $j$, given survival until the beginning of the interval, with $\Delta_j$ denoting the length of interval $j$. The cumulative crude probability of dying from cancer during the first $k$ intervals is

$$Q_k^c = q_1^c + p_1 q_2^c + p_1 p_2 q_3^c + \cdots + p_1 p_2 \cdots p_{k-1}\, q_k^c.$$

The number of deaths from cancer accumulated during the first $k$ intervals, $D_k^c$, can be estimated as a sum over the cumulative crude probabilities of $n$ patients:

$$D_k^c = \sum_{i=1}^{n} Q_{ik}^c,$$

where the cumulative crude probability of patient $i$ is given by replacing $\gamma_j$ and $\lambda^*_j$ in formula (1) by individual estimates $\gamma_{ij}$ and $\lambda^*_{ij} = \lambda^*_{r_i s_i v_{ij} a_{ij}}$. Fitted (predicted) values of the excess hazard of the relative survival regression model were used for patient $i$ in the first $k$ follow-up intervals, even if the follow-up time of the patient was censored or the patient died during the $k$ intervals. For $v_{ij} > 2009$, the expected hazard of year 2009 was used. The number of deaths from other causes accumulated during the first $k$ intervals, $D_k^o$, is obtained by writing the probability of dying from competing causes of death other than the cancer as $q_j^o = \lambda^*_j q_j/(\lambda^*_j + \gamma_j)$. The total number of deaths is written as the sum of the numbers of deaths from cancer and from other causes, i.e. $D_k^T = D_k^c + D_k^o$.
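The competing-risks calculation above can be sketched in Python as follows, assuming hazards that are constant within each interval. The function names and the two example patients are hypothetical; in the paper the patient-specific excess hazards are fitted values from the regression model and the expected hazards come from population statistics.

```python
import math

def cumulative_crude_probabilities(expected, excess, lengths):
    """Cumulative crude probabilities of death over the given intervals.

    expected, excess: per-interval expected and excess hazards
    lengths: per-interval lengths, in the same time unit as the hazards
    Returns (Q_c, Q_o, P): probability of dying from cancer, of dying from
    other causes, and of surviving all intervals.
    """
    alive = 1.0      # probability of surviving to the start of the interval
    q_cancer = 0.0
    q_other = 0.0
    for lam, gam, delta in zip(expected, excess, lengths):
        total = lam + gam
        p = math.exp(-total * delta)   # conditional survival in the interval
        q = 1.0 - p                    # conditional probability of death
        if total > 0.0:
            q_cancer += alive * q * gam / total   # formula (1)
            q_other += alive * q * lam / total
        alive *= p
    return q_cancer, q_other, alive

def expected_deaths(patients):
    """Sum patient-level probabilities into (D_c, D_o, D_T)."""
    d_c = 0.0
    d_o = 0.0
    for expected, excess, lengths in patients:
        q_c, q_o, _ = cumulative_crude_probabilities(expected, excess, lengths)
        d_c += q_c
        d_o += q_o
    return d_c, d_o, d_c + d_o

# Two hypothetical patients followed over intervals of 0.25, 0.75 and 1 year.
lengths = [0.25, 0.75, 1.0]
patients = [
    ([0.010, 0.010, 0.011], [0.40, 0.15, 0.08], lengths),
    ([0.050, 0.050, 0.055], [0.60, 0.25, 0.12], lengths),
]
d_cancer, d_other, d_total = expected_deaths(patients)
```

For each patient the three returned probabilities sum to one, which is a convenient sanity check on any implementation of formula (1).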

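Numbers and proportions of avoidable deaths

The hypothetical number $D_k^{C,S}$ of deaths from cause $C$ ($c$ = cancer, $o$ = other causes, $T$ = any cause) accumulated during the first $k$ intervals was calculated under three different scenarios $S$ (A, B, AB). In scenario A, the excess hazard $\gamma_{ij}$ was replaced with the corresponding excess hazard in region 1 (specific to sex, age group and follow-up interval). In scenario B, the expected hazard $\lambda^*_{ij}$ was replaced with the corresponding expected hazard in region 2 (matched by sex, calendar year and year of age). In scenario AB, both $\gamma_{ij}$ and $\lambda^*_{ij}$ were replaced with those in regions 1 and 2, respectively. The number and proportion of avoidable deaths from cause $C$ accumulated during the first $k$ intervals in scenario $S$ are written as

$$\mathrm{Diff}_k^{C,S} = D_k^C - D_k^{C,S} \quad\text{and}\quad \mathrm{Prop}_k^{C,S} = (D_k^C - D_k^{C,S})/D_k^C,$$

respectively, where $D_k^C$ is the true number of deaths from cause $C$ accumulated during the first $k$ intervals.

Variances of the estimators

Let $\alpha_{rsva}$ be the natural logarithm of the expected hazard of death defined by region $r$, sex $s$, calendar year $v$ and year of age $a$, i.e. $\lambda^*_{rsva} = \exp\{\alpha_{rsva}\}$. The variances for the number of deaths from cause $C$ and for the number and proportion of avoidable deaths from cause $C$ in scenario $S$ were approximated by the delta method (Casella and Berger 2001):

$$\mathrm{Var}(D_k^C) \approx \sum_{l,m} \frac{\partial D_k^C}{\partial \beta_l} \frac{\partial D_k^C}{\partial \beta_m} \mathrm{Cov}(\hat\beta_l, \hat\beta_m) + \sum_{r,s,v,a} \left(\frac{\partial D_k^C}{\partial \alpha_{rsva}}\right)^2 \mathrm{Var}(\hat\alpha_{rsva}),$$

$$\mathrm{Var}(\mathrm{Diff}_k^{C,S}) \approx \sum_{l,m} \left(\frac{\partial D_k^C}{\partial \beta_l} - \frac{\partial D_k^{C,S}}{\partial \beta_l}\right) \left(\frac{\partial D_k^C}{\partial \beta_m} - \frac{\partial D_k^{C,S}}{\partial \beta_m}\right) \mathrm{Cov}(\hat\beta_l, \hat\beta_m) + \sum_{r,s,v,a} \left(\frac{\partial D_k^C}{\partial \alpha_{rsva}} - \frac{\partial D_k^{C,S}}{\partial \alpha_{rsva}}\right)^2 \mathrm{Var}(\hat\alpha_{rsva}),$$

and

$$\mathrm{Var}(\mathrm{Prop}_k^{C,S}) \approx \left\{ \sum_{l,m} \left(D_k^{C,S}\frac{\partial D_k^C}{\partial \beta_l} - D_k^C\frac{\partial D_k^{C,S}}{\partial \beta_l}\right) \left(D_k^{C,S}\frac{\partial D_k^C}{\partial \beta_m} - D_k^C\frac{\partial D_k^{C,S}}{\partial \beta_m}\right) \mathrm{Cov}(\hat\beta_l, \hat\beta_m) + \sum_{r,s,v,a} \left(D_k^{C,S}\frac{\partial D_k^C}{\partial \alpha_{rsva}} - D_k^C\frac{\partial D_k^{C,S}}{\partial \alpha_{rsva}}\right)^2 \mathrm{Var}(\hat\alpha_{rsva}) \right\} (D_k^C)^{-4}.$$

The estimated covariances of the estimates of the $\beta$ parameters were provided by the iterative weighted least squares algorithm used to fit the generalized linear model of relative survival. The variance of the estimate of the logarithm of the expected hazard of death, $\mathrm{Var}(\hat\alpha_{rsva})$, was estimated by the inverse of the number of deaths in the national population stratified by region, sex, calendar year and age group. The partial derivatives of the number of deaths from cancer, from other causes, and from any cause with respect to parameter $\beta_l$ are given by

$$\frac{\partial D_k^c}{\partial \beta_l} = \sum_{i=1}^{n} \sum_{m=1}^{k} P_{i,m-1}\, q_{im}^c \left\{ I_{\beta_l}(\gamma_{im})\, q_{im}^{-1} \left(\Delta_m \gamma_{im} p_{im} + q_{im}^o\right) - \sum_{j=1}^{m-1} I_{\beta_l}(\gamma_{ij})\, \Delta_j \gamma_{ij} \right\},$$

$$\frac{\partial D_k^o}{\partial \beta_l} = \sum_{i=1}^{n} \sum_{m=1}^{k} P_{i,m-1}\, q_{im}^o \left\{ I_{\beta_l}(\gamma_{im})\, q_{im}^{-1} \left(\Delta_m \gamma_{im} p_{im} - q_{im}^c\right) - \sum_{j=1}^{m-1} I_{\beta_l}(\gamma_{ij})\, \Delta_j \gamma_{ij} \right\}$$

and

$$\frac{\partial D_k^T}{\partial \beta_l} = \frac{\partial D_k^c}{\partial \beta_l} + \frac{\partial D_k^o}{\partial \beta_l} = \sum_{i=1}^{n} P_{ik} \sum_{m=1}^{k} I_{\beta_l}(\gamma_{im})\, \Delta_m \gamma_{im},$$

respectively, where $P_{im} = \prod_{j=0}^{m} p_{ij}$ is the cumulative probability for patient $i$ to survive at least until the end of interval $m$, with $p_{i0} = 1$ in particular. The quantities $q_{im} = 1 - p_{im}$, $q_{im}^c$ and $q_{im}^o$ are the probability of death and the crude probabilities of dying from cancer and from other causes, respectively, for patient $i$ in interval $m$. $I_{\beta_l}(\gamma_{ij})$ is an indicator equalling 1 if regression coefficient $\beta_l$ is included in the predicted excess hazard $\gamma_{ij}$ of patient $i$ in follow-up interval $j$, and $I_{\beta_l}(\gamma_{ij}) = 0$ otherwise.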
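The analytic derivative of the cumulative crude probability of cancer death can be checked against a finite difference, as in the following Python sketch. It makes strong simplifying assumptions that are not in the source: a single patient, a single coefficient `beta` that enters the excess hazard of every interval (so the indicator equals 1 throughout), expected hazards treated as fixed, and a hypothetical standard error for the coefficient.

```python
import math

def q_cancer_and_gradient(beta, expected, lengths):
    """Cumulative crude probability of cancer death and its derivative
    with respect to beta, where the excess hazard is exp(beta) in every
    interval (an illustrative assumption)."""
    gam = math.exp(beta)
    alive = 1.0     # probability of surviving to the start of interval m
    cum = 0.0       # sum over earlier intervals of length * excess hazard
    q_cum = 0.0
    grad = 0.0
    for lam, delta in zip(expected, lengths):
        total = lam + gam
        p = math.exp(-total * delta)
        q = 1.0 - p
        q_c = gam / total * q          # crude probability of cancer death
        q_o = lam / total * q          # crude probability of other death
        q_cum += alive * q_c
        grad += alive * q_c * ((delta * gam * p + q_o) / q - cum)
        cum += delta * gam
        alive *= p
    return q_cum, grad

def delta_method_variance(grad, var_beta):
    """One-parameter delta method: Var(g(beta_hat)) ~ g'(beta)^2 * Var(beta_hat)."""
    return grad ** 2 * var_beta

# Hypothetical inputs: three one-year intervals.
expected = [0.020, 0.022, 0.024]
lengths = [1.0, 1.0, 1.0]
beta = math.log(0.1)

value, grad = q_cancer_and_gradient(beta, expected, lengths)

# Central finite difference as an independent check of the analytic gradient.
h = 1e-6
numeric = (q_cancer_and_gradient(beta + h, expected, lengths)[0]
           - q_cancer_and_gradient(beta - h, expected, lengths)[0]) / (2 * h)

variance = delta_method_variance(grad, var_beta=0.05 ** 2)
```

With several coefficients the squared gradient becomes a quadratic form in the covariance matrix of the estimates, and the terms for the expected hazards are added in the same way.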

Partial derivatives $\partial D_k^c/\partial \alpha_{rsva}$, $\partial D_k^o/\partial \alpha_{rsva}$ and $\partial D_k^T/\partial \alpha_{rsva}$ are otherwise similar to $\partial D_k^o/\partial \beta_l$, $\partial D_k^c/\partial \beta_l$ and $\partial D_k^T/\partial \beta_l$, respectively, but $q_{im}^c$, $q_{im}^o$, $\gamma_{ij}$ and $I_{\beta_l}$ are replaced with $q_{im}^o$, $q_{im}^c$, $\lambda^*_{ij}$ and $I_{\alpha_{rsva}}$, respectively, where $I_{\alpha_{rsva}}(\lambda^*_{ij}) = 1$ if parameter $\alpha_{rsva}$ is included in the expected hazard $\lambda^*_{ij}$ of patient $i$ in follow-up interval $j$, and $I_{\alpha_{rsva}}(\lambda^*_{ij}) = 0$ otherwise.

The random variation in the expected hazard rates was taken into account in the calculation of the variances after obtaining the estimated covariance matrix of the estimators of the regression coefficients of the excess hazard. However, the expected hazard rates estimated from large regional populations were considered as being essentially free from random error in the estimation of the excess hazard; otherwise the relative survival regression model could not be fitted within the framework of generalized linear models.

Statistical software

The relative survival regression model can be easily fitted using any software that allows the estimation of generalised linear models with user-defined link functions (Dickman et al, 2004). We used the R environment for statistical computing and graphics in all analyses (R Development Core Team 2012). First, the glm function was used to fit the relative survival model. Then, the numbers of deaths, the numbers and proportions of avoidable deaths and their variances were estimated using the explicit formulae presented above. The scripts are available from the first author on request.

References

Casella G, Berger RL (2001) Statistical Inference, 2nd Edition. Duxbury Press: Pacific Grove, CA

Chiang CL (1968) Introduction to Stochastic Processes in Biostatistics. Wiley: New York

Dickman PW, Sloggett A, Hills M, Hakulinen T (2004) Regression models for relative survival. Statistics in Medicine 23: 51–64

Ederer F, Heise H (1959) Instructions to IBM 650 programmers in processing survival computations. Methodological note no. 10. End Results Evaluation Section, National Cancer Institute: Bethesda, MD

Hakulinen T, Seppä K, Lambert PC (2011) Choosing the relative survival method for cancer survival estimation. European Journal of Cancer 47: 2202–2210

Pokhrel A, Hakulinen T (2008) How to interpret the relative survival ratios of cancer patients. European Journal of Cancer 44: 2661–2667

R Development Core Team (2012) R: A Language and Environment for Statistical Computing, version 2.14.2. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.r-project.org, accessed 19 March 2012

Statistics Finland (2011) StatFin online service. URL http://pxweb2.stat.fi/database/statfin/databasetree_en.asp, accessed 19 March 2012

Supplementary Table 1: Age distributions, n (%), of colon cancer patients diagnosed in Finland in 2000–2007 by cancer control region.

Region   0–44      45–54     55–64       65–74       75–89       All: 0–89
1        177 (5)   309 (9)   677 (20)    945 (28)    1260 (37)   3368 (100)
2        49 (3)    114 (7)   290 (17)    518 (30)    732 (43)    1703 (100)
3        119 (4)   222 (8)   454 (17)    772 (29)    1108 (41)   2675 (100)
4        85 (5)    167 (9)   349 (19)    519 (28)    718 (39)    1838 (100)
5        69 (6)    99 (8)    219 (18)    351 (30)    450 (38)    1188 (100)
Total    499 (5)   911 (8)   1989 (18)   3105 (29)   4268 (40)   10772 (100)

Supplementary Table 2: Age-specific and age-standardised 5-year relative survival ratios (%) for colon cancer patients diagnosed in Finland in 2000–2007 by cancer control region. All the estimates were standardised by sex.

Region   0–44   45–54   55–64   65–74   75–89   All: 0–89   95% CI
1        78     68      62      63      57      61          (59, 64)
2        79     62      63      63      55      60          (57, 64)
3        80     58      60      60      61      61          (58, 64)
4        84     60      60      58      54      58          (55, 62)
5        75     61      62      61      56      60          (55, 64)
Total    79     63      61      61      57      60          (59, 62)
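The direct age-standardisation described in the appendix can be illustrated from the two tables with a short Python sketch. It weights the age-specific ratios of Supplementary Table 2 by the age distribution of all patients in Supplementary Table 1. The variable and function names are hypothetical, and the result only approximates the published standardised ratios, because the tabulated inputs are rounded to whole percentages and the sketch does not repeat the standardisation by sex.

```python
# Standard population: all patients by age group (Supplementary Table 1, "Total" row).
age_counts = [499, 911, 1989, 3105, 4268]   # 0-44, 45-54, 55-64, 65-74, 75-89

# Age-specific 5-year relative survival ratios, % (Supplementary Table 2).
rsr_by_region = {
    1: [78, 68, 62, 63, 57],
    2: [79, 62, 63, 63, 55],
    3: [80, 58, 60, 60, 61],
    4: [84, 60, 60, 58, 54],
    5: [75, 61, 62, 61, 56],
}

# Published age-standardised ratios, % (Supplementary Table 2, "All: 0-89" column).
published = {1: 61, 2: 60, 3: 61, 4: 58, 5: 60}

def directly_standardise(values, weights):
    """Weighted average of stratum-specific values using standard weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

standardised = {
    region: directly_standardise(values, age_counts)
    for region, values in rsr_by_region.items()
}

for region, value in standardised.items():
    print(f"Region {region}: recomputed {value:.1f}%, published {published[region]}%")
```

Using one common set of weights for every region is the point of the method: differences between the standardised ratios then no longer reflect differences in the regions' age structures.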