Multiple Mediation Analysis for General Models, with Application to Explore Racial Disparity in Breast Cancer Survival
Qingzhao Yu
Joint work with Ms. Ying Fan and Dr. Xiaocheng Wu
Louisiana Tumor Registry, LSUHSC
June 5th, 2012
NAACCR Annual Conference

Some facts about female breast cancer:
- The most common cancer and the second leading cause of cancer death among American women.
- Significant racial disparity in mortality between White and African American females.
http://apps.nccd.cdc.gov/uscs/cancersbyraceandethnicity.aspx

Question: How can the racial disparity in the female breast cancer survival rate be reduced efficiently?
Goal: Explore the racial disparity in breast cancer survival.

Racial disparity in breast cancer survival
Figure: race -> breast cancer survival rate

Racial disparity in breast cancer survival
Figure: race -> intermediate variables -> breast cancer survival rate, with intermediate variables:
- Age at diagnosis
- SES
- Insurance
- Marital status
- Stage at diagnosis
- Tumor grade
- Treatment
- ER/PR receptors
- ...

The concept of mediation
Definition: Mediation effect refers to the effect transmitted by an intervening variable to an observed relationship between a predictor and a dependent variable of interest.
Application in disciplines:
- Social Science
- Prevention Studies
- Behavior Research
- Epidemiological Studies
- Genetics Epidemiology

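Linear regression models
Figure: Mediation diagram (paths X -> M with coefficient $\alpha$, M -> Y with coefficient $\beta$, X -> Y with coefficient $c_1$ when M is included, and X -> Y with coefficient $c_2$ when M is omitted)
$M = \alpha X + \epsilon_1$;  (1)
$Y = \beta M + c_1 X + \epsilon_2$;  (2)
$Y = c_2 X + \epsilon_3$.  (3)
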
Coefficients difference and product methods
Coefficients difference method: $c_2 - c_1$
Limitations:
- Models (2) and (3) are assumed to be true simultaneously;
- For a binary response with logistic regression, the scales of the coefficients differ when different subsets of variables are used as predictors;
- Multiple mediators: the method cannot differentiate the mediation effects of individual mediators.
Coefficients product method: $\alpha \beta$
Limitation: Hard to interpret when the predictive model is not linear regression.
Property: When Y and M are continuous and linear regression models are fitted for the relationships, $c_2 - c_1 = \alpha \beta$.

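The property that the difference and product methods agree under linear regression can be checked numerically. The sketch below is only an illustration on simulated data: the sample size, the true coefficients and the helper `ols` are assumptions made for this example, and intercepts are added to all three models, which does not change the identity as long as the models are fitted on the same sample.

```python
import numpy as np

# Simulated data; the coefficients 0.8, 0.5 and 0.3 are arbitrary choices.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
m = 0.8 * x + rng.normal(size=n)
y = 0.5 * m + 0.3 * x + rng.normal(size=n)


def ols(columns, target):
    """Least-squares fit with an intercept; returns the slopes only."""
    design = np.column_stack([np.ones(len(target)), *columns])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef[1:]


(alpha,) = ols([x], m)        # model (1): M on X
beta, c1 = ols([m, x], y)     # model (2): Y on M and X
(c2,) = ols([x], y)           # model (3): Y on X alone

difference = c2 - c1          # coefficients difference method
product = alpha * beta        # coefficients product method
print(difference, product)
```

The two printed numbers agree up to floating-point error, because for least-squares fits on one sample the identity holds exactly, not just in expectation.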
Counterfactual framework
Donald B. Rubin (1974)
- $X(i)$: treatment for subject $i$, either control ($X(i) = 0$) or treatment ($X(i) = 1$).
- $Y_X(i)$: potential post-treatment outcome if subject $i$ is treated with $X$ (0 or 1). Usually, only one of the responses, $Y_1(i)$ and $Y_0(i)$, is observed.
- $Y_1(i) - Y_0(i)$: causal effect of treatment on the response variable for subject $i$.
- $E(Y_1) - E(Y_0)$: average causal effect.
- $M_X(i)$: potential $M$ when subject $i$ is exposed to treatment $X$. $M_1(i)$ or $M_0(i)$ is observed if subject $i$ is actually assigned to the treatment or control group.

Counterfactual framework
The potential outcome depends not only on the exposure variable but also on the mediator.
- $Y_{x,m}(i)$: potential outcome of subject $i$ for given $x$ and $m$.
- $Y_{0,m_0}(i)$, $Y_{0,m_1}(i)$, $Y_{1,m_0}(i)$, $Y_{1,m_1}(i)$
- $E(Y_{1,m_1} - Y_{0,m_0})$: Total Effect
- $E(Y_{1,m_0} - Y_{0,m_0})$: Natural Direct Effect (Pearl, 2001)
- $E(Y_{1,m_1} - Y_{0,m_1})$: alternative definition
- $E(Y_{1,m_1} - Y_{1,m_0})$ or $E(Y_{0,m_1} - Y_{0,m_0})$: Indirect Effect
Limitations:
- Assumption: $E(Y_{1,m_0} - Y_{0,m_0}) = E(Y_{1,m_1} - Y_{0,m_1})$
- Difficult to differentiate indirect effects from multiple mediators.
Common limitation: Only suitable for a binary exposure variable.

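As a reading aid for the definitions above, the total effect can be split into a direct and an indirect part in two ways by adding and subtracting one intermediate potential outcome. This is a standard telescoping identity written in the slide's notation, not a result specific to this talk.

```latex
\begin{align*}
E(Y_{1,m_1} - Y_{0,m_0})
  &= \underbrace{E(Y_{1,m_0} - Y_{0,m_0})}_{\text{natural direct effect}}
   + \underbrace{E(Y_{1,m_1} - Y_{1,m_0})}_{\text{indirect effect}} \\
  &= \underbrace{E(Y_{1,m_1} - Y_{0,m_1})}_{\text{alternative direct effect}}
   + \underbrace{E(Y_{0,m_1} - Y_{0,m_0})}_{\text{indirect effect}}
\end{align*}
```

When the two direct-effect definitions coincide, as the stated assumption requires, the two indirect-effect definitions coincide as well.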
Challenge
Recall the motivating example.
Challenges:
- Various types of mediators
- Differentiate the indirect effect from each mediator
- Compare the indirect effects conveyed by mediators that contribute to the racial disparity
- Potential nonlinear relationships and interactions among X, Ms, and Y

Notations and Definitions
Notations
Definitions:
- Total effect
- Direct effect not from $M_1$
- Indirect effect from $M_1$
Figure: Multiple Mediators Diagram (nodes X, $M_1$, $M_2$, ..., $M_p$, Y, Z)

Properties
- Under the linear regression setting, we get the same results as the product method.
- In logistic regression with a binary mediator, recall the natural direct effect: $\zeta(0) = E(Y_{1,m_0} - Y_{0,m_0})$ or $\zeta(1) = E(Y_{1,m_1} - Y_{0,m_1})$.
- The relationship between the direct effect and the natural direct effect: $DE = P(X = 0)\,\zeta(0) + P(X = 1)\,\zeta(1)$.

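The relationship between DE and the two natural direct effects is a weighted average, with weights given by the exposure distribution. A minimal arithmetic sketch follows; the function name and all three input numbers are hypothetical and chosen only to show the calculation.

```python
def overall_direct_effect(p_x1, zeta0, zeta1):
    """DE = P(X = 0) * zeta(0) + P(X = 1) * zeta(1)."""
    return (1.0 - p_x1) * zeta0 + p_x1 * zeta1


# Hypothetical inputs: 30% of subjects exposed, zeta(0) = 0.10, zeta(1) = 0.20.
de = overall_direct_effect(p_x1=0.3, zeta0=0.10, zeta1=0.20)
print(round(de, 4))  # 0.7 * 0.10 + 0.3 * 0.20
```

When $\zeta(0) = \zeta(1)$, the weights no longer matter and DE equals the common natural direct effect.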
To measure uncertainties
- Delta method
- Bootstrap
  1. Randomly draw a sample of n observations, with replacement, from the original data of size N;
  2. Estimate DE, IE, and TE;
  3. Repeat the last two steps B times to obtain a set of estimates for each quantity;
  4. Obtain the empirical variances of the mediation effects and the $(\alpha/2)$th and $(1 - \alpha/2)$th percentiles.

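The bootstrap steps above can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: the function names, the toy data and the toy statistic (a regression slope standing in for DE, IE or TE) are made up for the example, and each resample has the same size as the original data (n = N).

```python
import numpy as np


def percentile_bootstrap(data, estimator, n_boot=1000, alpha=0.05, seed=0):
    """Return (empirical variance, (lower, upper)) of a bootstrapped statistic."""
    rng = np.random.default_rng(seed)
    n_rows = data.shape[0]
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        # Step 1: resample rows with replacement (here n equals N).
        rows = rng.integers(0, n_rows, size=n_rows)
        # Step 2: re-estimate the quantity of interest on the resample.
        estimates[b] = estimator(data[rows])
    # Step 4: empirical variance and the alpha/2 and 1 - alpha/2 quantiles.
    lower, upper = np.quantile(estimates, [alpha / 2, 1 - alpha / 2])
    return estimates.var(ddof=1), (lower, upper)


def slope(data):
    """Toy statistic: least-squares slope of column 1 on column 0."""
    return np.polyfit(data[:, 0], data[:, 1], 1)[0]


# Toy data with a true slope of 2.0 (an arbitrary choice).
rng = np.random.default_rng(1)
x = rng.normal(size=300)
toy = np.column_stack([x, 2.0 * x + rng.normal(size=300)])

variance, (lower, upper) = percentile_bootstrap(toy, slope)
print(variance, lower, upper)
```

With `alpha=0.05` the interval uses the 0.025 and 0.975 quantiles of the bootstrap estimates.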
LA Breast Cancer Data
- 1473 non-Hispanic White or African American female patients diagnosed with malignant breast cancer in 2004 in Louisiana, collected by the Louisiana Tumor Registry.
- Followed up for five years.
- Excluded: lost to follow-up within three years (20, 1.4%); death due to causes other than breast cancer (157, 10.7%).
- 1293 patients were included.
- The odds of dying within three years are significantly higher for Blacks than for Whites (173 deaths, OR = 2.03, CI: [1.47, 2.81]).

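For readers unfamiliar with the quantities reported above, an odds ratio and a Wald-type confidence interval can be computed from a 2x2 table as sketched below. The slides do not say how the reported interval was obtained, and the cell counts used here are hypothetical: they are not the study's counts and are not meant to reproduce OR = 2.03.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald CI for a 2x2 table.

    a, b: deaths and survivors in group 1; c, d: deaths and survivors in group 2.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
odds_ratio, lower, upper = odds_ratio_wald_ci(a=30, b=70, c=20, d=80)
print(odds_ratio, lower, upper)
```

An interval that excludes 1, as the reported [1.47, 2.81] does, corresponds to a significant difference in odds at the chosen level.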
Variable Description (1)
Explanatory variable: racial indicator
Response variable: alive (0) or dead (1) at the end of the 3rd year after diagnosis
Third variables considered:
- Census tract SES variables - poverty (≥ 20% versus < 20% of persons with an income below the federal poverty level)
- education (≥ 25% versus < 25% of adults (25 years and older) with less than a high school education)
- residence area (grouped using Beale codes: 100% rural; urban-rural mix; 100% urban)
- workclass (≥ 66% of persons ages 16 and over who are unemployed versus < 66%)
- insurance (no insurance; Medicaid; Medicare and public; private insurance)
- marital status (single-never married; married; separated; widowed; divorced; unknown)
- age at diagnosis

Variable Description (2)
- stage (regional; distant; localized)
- grade (moderately differentiated; poorly differentiated/undifferentiated; well differentiated; unknown)
- tumor size (< 1 cm; 1.1-2 cm; 2.1-3 cm; > 3 cm; unknown)
- comorbidity (mild; moderate; severe; none; unknown)
- surgery (mastectomy; lumpectomy; no surgery)
- radiation (not administered; administered)
- chemotherapy (not administered; administered)
- hormonal therapy (not administered; administered)
- ER/PR receptor (either is positive; both are negative; unknown)
Eligibility for mediator:
- Significantly associated with race;
- Significantly related to vital status, controlling for race.

Data Results
Table: Indirect Effects (IE) and Relative Effects (RE)

Mediator           IE [1]    95% CI for IE [2]    RE (%)    95% CI for RE [2] (proportion)
Stage               0.276    (0.127, 0.488)        28.14    (0.122, 0.767)
Insurance           0.275    (0.074, 0.43)         28.05    (0.058, 0.708)
ER/PR               0.181    (0.068, 0.332)        18.46    (0.055, 0.56)
Grade               0.158    (0.027, 0.379)        16.09    (0.024, 0.476)
Surgery             0.145    (0.067, 0.333)        14.75    (0.063, 0.504)
Tumor Size          0.135    (0.025, 0.417)        13.77    (0.022, 0.561)
Hormonal Therapy    0.114    (0.02, 0.253)         11.57    (0.016, 0.402)
Age                -0.113    (-0.301, -0.034)     -11.5     (-0.432, -0.031)
Marital Status     -0.074    (-0.392, 0.156)       -7.51    (-0.739, 0.159)
Comorbidity        -0.002    (-0.15, 0.109)        -0.17    (-0.205, 0.136)

[1] After considering all indirect effects through mediators, the direct effect of race on mortality is -0.115 with 95% CI: [-0.713, 0.526].
[2] The 95% confidence interval is given by the 0.025 and 0.975 quantiles of the distribution of the statistic obtained by bootstrap with 1000 repetitions.

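The table can be sanity-checked with a few lines of arithmetic. The sketch below assumes, which the slides do not state explicitly, that the total effect is the direct effect plus the sum of the indirect effects and that each relative effect is the indirect effect divided by that total. Under this assumption the recomputed percentages match the reported ones up to the rounding of the IE column.

```python
# Indirect effects and reported relative effects (%) copied from the table.
indirect_effects = {
    "Stage": 0.276, "Insurance": 0.275, "ER/PR": 0.181, "Grade": 0.158,
    "Surgery": 0.145, "Tumor Size": 0.135, "Hormonal Therapy": 0.114,
    "Age": -0.113, "Marital Status": -0.074, "Comorbidity": -0.002,
}
reported_re = {
    "Stage": 28.14, "Insurance": 28.05, "ER/PR": 18.46, "Grade": 16.09,
    "Surgery": 14.75, "Tumor Size": 13.77, "Hormonal Therapy": 11.57,
    "Age": -11.5, "Marital Status": -7.51, "Comorbidity": -0.17,
}
direct_effect = -0.115  # footnote [1]

# Assumed decomposition: total effect = direct effect + sum of indirect effects.
total_effect = direct_effect + sum(indirect_effects.values())

# Assumed definition: relative effect (%) = 100 * IE / total effect.
recomputed_re = {name: 100 * ie / total_effect
                 for name, ie in indirect_effects.items()}

for name, value in recomputed_re.items():
    print(f"{name:17s} recomputed {value:7.2f}   reported {reported_re[name]:7.2f}")
```

The implied total effect is about 0.98 on the scale of the reported effects.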
Racial disparity in breast cancer survival
Figure: race -> mediator (relative effect) -> breast cancer survival rate
- Age at diagnosis (-11.5%)
- Insurance (28.05%)
- Stage at diagnosis (28.14%)
- Tumor size (13.77%)
- Tumor grade (16.09%)
- Surgery (14.75%)
- Hormonal therapy (11.57%)
- ER/PR receptors (18.46%)
Direct path from race to breast cancer survival rate: -0.116 (not significant)

- The proposed method can deal with continuous, binary, or categorical variables.
- We can separate the mediation effects of individual mediators.
- With the proposed method, we can use predictive models other than linear regression models.
- We will extend the method to survival analysis and multilevel analysis.

Acknowledgment
Louisiana Tumor Registry
- Dr. Xiaocheng Wu
- Ms. Ying Fan
- Dr. Vivien Chen
Data used for this study were from the CDC-NPCR funded Breast and Prostate Cancer Data Quality and Patterns of Care Study (grant number: 1 U01 DP000253-01).

Imai Kosuke (2010a). A general approach to causal mediation analysis. Psychological Methods. Vol. 15, No. 4, 309-334.
Imai Kosuke (2010b). Identification, inference and sensitivity analysis for causal mediation effects. Statistical Science. Vol. 25, No. 1, 51-71.
Baron RM, Kenny DA (1986). The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. J. Pers Soc Psychol. 51(6): 1173-1182.
Alwin, D. F., Hauser, R. M. (1975). The decomposition of effects in path analysis. American Sociological Review. 40: 37-47.
Judd, C. M., Kenny, D. A. (1981). Process analysis: Estimating mediation in treatment evaluations. Evaluation Review. 5(5), 602-619.
David P. MacKinnon, J. Dwyer (1995).
David P. MacKinnon, James H. Dwyer (1993). Estimating mediated effects in prevention studies. Evaluation Review. Vol. 17, No. 2, 144-158.
Donald B. Rubin (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology. Vol. 66, No. 5, 688-701.
Paul W. Holland (1986). Statistics and causal inference. J. of the American Statistical Association. Vol. 81, No. 396, pp. 945-960.
Pearl J (2001). Direct and indirect effects. In: Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence. San Francisco, CA: Morgan Kaufmann; 2001: 411-420.
Maya L. Petersen, Sandra E. Sinisi, and Mark J. van der Laan (2006). Estimation of direct causal effects. Epidemiology. Vol. 17, No. 3.
David MacKinnon, Jennifer Krull (2000). Equivalence of the mediation, confounding and suppression effect. Prevention Science. Vol. 1, No. 4.

Question