U.S. DEPARTMENT OF HEALTH AND HUMAN SERVICES Trends and patterns of childhood cancer incidence in the US, 1995 2010 Li Zhu National Cancer Institute Linda Pickle StatNet Consulting, LLC Joe Zou Information Management Services, Inc National Institutes of Health
Cancer Facts & Figures American Cancer Society s Cancer Facts & Figures (CFF) estimates # new cases for current year #1 cited cancer reference, so accurate #s important Since 2007, spatio-temporal model used Considered geographical variations Adjust for delays in case reporting Predictions by state/sites for all pop Childhood cancer (age 0-14): an overall estimate given for US
Reason for drop in Childhood cancer mortality rate: participation in clinical trials Source: MMWR 2007 / 56(48);1257-1261
Reason for drop in Childhood cancer mortality rate: participation in clinical trials Source: MMWR 2007 / 56(48);1257-1261
Outline of Talk Project goal Data Spatio-temporal modeling of incidence Output Tables Maps Micromaps Joinpoint predictions Conclusion and future work
Goal of project Predict # new cancer cases of childhood cancer (age 0-19) in each data year by state, gender, race, age group, major cancer site, and project expected #s ahead to current calendar year
Goal of project Predict # new cancer cases of childhood cancer (age 0-19) in each data year by state, gender, race, age group, major cancer site, and project expected #s ahead to current calendar year Method used since 2007 for total pop.
Goal of project Predict # new cancer cases of childhood cancer (age 0-19) in each data year by state, gender, race, age group, major cancer site, and project expected #s ahead to current calendar year Method used since 2007 for total pop. Uses of predicted counts by small area State Health Service Area (HSA), 944 in total
Challenges The annual # new cancer cases (incidence) in the total US is unknown; no national registry Not all US states have data on # of new cancer cases: A few registries are too new to produce high quality data State laws prevent release of stratified # of cases Publications may mask #s that are < threshold Sparse data: all sites incidence rate: 470 for all ages vs. 17 for age 0-19
AK HI U.S. Cancer Registries SEER Program (NCI) (1973+) NPCR Program (CDC) (1995+)
AK HI U.S. Cancer Registries SEER Program (NCI) (1973+) NPCR Program (CDC) (1995+) Certification Process & Active Consent
General Methods for Prediction
General Methods for Prediction Model relationships between cancer incidence and relevant covariate data in HSAs of study area, then apply model to HSAs with missing data to predict #s of cases
Initial Model Covariates (for CFF) Year Mortality Demographic characteristics (age, race, sex) Geographic definition (Census Division, lat/long) Density of medical facilities Ethnicity/origin Household characteristics Urban/rural indicator Socioeconomic status Cancer screening behavior Health insurance Lifestyle (smoking, exercise, obesity) Interactions
Incidence Model Hierarchical model: Poisson spatio-temporal regression at HSA-level Log Rate: a linear function of intercept, function of age, mortality rate, SES and demographic covariates, lifestyle covariates Sources of variation in the rates: Fixed effects model: residual only Mixed effects model: residual + spatial + temporal (random effects)
Advantages of this regression model For each HSA/race/age/year, the model borrows information from: Nearby HSAs (spatial autocorrelation) Previous year in the same HSA (temporal autocorrelation) Other HSAs similar with respect to the covariates More correctly accounts for correlation structure More reliable estimates
A multi-step forecasting process Available data
A multi-step forecasting process Available data 1995 2001 2006
A multi-step forecasting process Available data Spatiotemporal prediction 1995 2001 2006
A multi-step forecasting process Available data Spatiotemporal prediction 1995 2001 Model estimates 2006
A multi-step forecasting process Available data Spatiotemporal prediction 1995 2001 Model estimates Temporal projection 2006
A multi-step forecasting process Available data Spatiotemporal prediction 1995 2001 Model estimates Temporal projection Projected to 2010 2006
Overall goodness-of-fit statistics Cancer Sex % std resid >2 % std resid >3 Gen. Chisq/df All M 0.91% 0.12% 1.05 All F 0.88% 0.10% 1.01 Brain M 0.34% 0.035% 0.92 Brain F 0.31% 0.033% 1.01 Leukemia M 0.40% 0.047% 1.56 Leukemia F 0.33% 0.032% 0.87
Overall goodness-of-fit statistics Cancer Sex % std resid >2 % std resid >3 Gen. Chisq/df All M 0.91% 0.12% 1.05 All F 0.88% 0.10% 1.01 Brain M 0.34% 0.035% 0.92 Brain F 0.31% 0.033% 1.01 Leukemia M 0.40% 0.047% 1.56 Leukemia F 0.33% 0.032% 0.87
Overall goodness-of-fit statistics Cancer Sex % std resid >2 % std resid >3 Gen. Chisq/df All M 0.91% 0.12% 1.05 All F 0.88% 0.10% 1.01 Brain M 0.34% 0.035% 0.92 Brain F 0.31% 0.033% 1.01 Leukemia M 0.40% 0.047% 1.56 Leukemia F 0.33% 0.032% 0.87
Overall goodness-of-fit statistics Cancer Sex % std resid >2 % std resid >3 Gen. Chisq/df All M 0.91% 0.12% 1.05 All F 0.88% 0.10% 1.01 Brain M 0.34% 0.035% 0.92 Brain F 0.31% 0.033% 1.01 Leukemia M 0.40% 0.047% 1.56 Leukemia F 0.33% 0.032% 0.87
Female Childhood Cancer # New Cases, age 0-19
Female Childhood Cancer # New Cases, age 0-19
Female Childhood Cancer # New Cases, age 0-19
HSA level childhood cancer (0-19) incidence rate, per 100,000 persons Female Male
HSA level childhood leukemia incidence rate, per 100,000 persons Female Male
HSA level childhood brain cancer incidence rate, per 100,000 persons Female Male
Temporal trend of delay adjusted rate estimates for male childhood leukemia, one HSA in the study 3.15 3.05 2.95 2.85 2.75 2.65 2.55 2.45 2.35 1995 1997 1999 2001 2003 2005 2007 2009 2011 Rate
Temporal trend of delay adjusted rate estimates for male childhood leukemia, one HSA in the study 3.15 3.05 2.95 2.85 2.75 2.65 Rate JP1 2.55 2.45 2.35 1995 1997 1999 2001 2003 2005 2007 2009 2011
Temporal trend of delay adjusted rate estimates for male childhood leukemia, one HSA in the study 3.15 3.05 2.95 2.85 2.75 2.65 2.55 2.45 2.35 1995 1997 1999 2001 2003 2005 2007 2009 2011 Rate JP1 JP2
Temporal trend of delay adjusted rate estimates for male childhood leukemia, one HSA in the study 3.15 3.05 2.95 2.85 2.75 2.65 2.55 2.45 2.35 1995 1997 1999 2001 2003 2005 2007 2009 2011 Rate JP1 JP2 Projection
Back to the goal of project Predict # new childhood cancer cases in each data year, and project expected #s ahead to current year Model estimates: by race, age groups, site, state CFF estimates a total of 10,700 new cases (age 0-14) in 2010: approximation at US level, all childhood cancer combined. Model estimates: 4617 female and 5423 male new cases (age 0-14) in 2010, total 10,040 new cases. Not only available US level, but by race, age, sites, state
Conclusions Overall, ecologic model predicts observed counts & rates very well in SEER & NPCR registries Inclusion of extra components of variance makes a difference in predicted counts, rates, variances More reliable prediction
Future Work Issues to explore Other spatial/temporal autocorrelation structure? Other covariates? (Birth defect, low birth weight) How finely can Cancer Sites be split? Minimum # cases? Explore Bayesian hierarchical modeling
Contact Info Li Zhu Surveillance Research Program Division of Cancer Control and Population Sciences National Cancer Institute Li.Zhu@nih.gov 301-594-6546 (phone)
Contact Info Li Zhu Surveillance Research Program Division of Cancer Control and Population Sciences National Cancer Institute Li.Zhu@nih.gov 301-594-6546 (phone) Merci beaucoup!