Confounding and effect modification
Epidemiology 511
W. A. Kukull
November 23, 2004

Confounding
  A function of the complex interrelationships between various exposures and disease.
  Occurs when the disease-exposure association under study is mixed with the effect of another factor.

Example (after Rothman, 1998)
  Is frequent beer consumption associated with rectal cancer?
  Beer consumption is associated with consumption of pizza.
  Is pizza consumption a confounder?
  Is pizza, by itself, causally associated with rectal Ca?
    If yes, then it is a confounder; otherwise not.

Beer and Rectal Ca
               Rectal Ca   Control
  Beer            630        770
  No Beer         770        630
  OR = 0.67 (0.58-0.78)

Beer and Rectal Ca, stratified by pizza consumption
  Pizza stratum 1
               Rectal Ca   Control
  Beer            350        700
  No Beer          70        280
  Pizza stratum 2
               Rectal Ca   Control
  Beer            280         70
  No Beer         700        350

Confounding (after Rothman, 1998)
  Confounding factor must be a risk factor for disease (causally associated).
  Confounding factor must be associated with exposure in the source (study) population.
  Confounding factor must not be affected by exposure or disease:
    it cannot be the result of exposure
    it cannot be an intermediate step in the causal path
Confounding (after Koepsell & Weiss, 2003)
  A factor that occurs only as a consequence of the exposure cannot distort (confound) the disease-exposure association. To be a confounder, the factor would have to give rise to the exposure or be associated with something that did.
  No matter how strongly a variable is related to exposure status, if it is not also related to the occurrence of the disease in question, it cannot be a confounder.

Confounder, Exposure, Disease: some finer distinctions (Koepsell & Weiss, 2003)
  A confounder can be an actual cause of disease.
  A confounder can be associated with a cause of disease that, in the context of the study, cannot be measured (e.g., genotype).
  A variable can be a confounder if it is related to the recognition of the disease, even if it has no relationship to the actual occurrence of disease (e.g., frequency of screening tests for disease).

Confounding
  [Diagram: Exposure, Confounder and Disease joined by arrows; the legend distinguishes non-causal from causal links]
  [Diagram: Country (exp), Age Distribution (conf), Mortality]
  [Diagram: Sexual Activity (exp), General Health (conf), Mortality]
  [Diagram: Ca Channel Blockers (exp), Other Meds (conf), GI Bleeding]
  [Diagram: Vitamin C Intake (exp), Diet / SES / Lifestyle (conf), Colon Cancer]
  [Diagram: Low Fat Diet (exp), Cholesterol (conf?), Heart disease]
    Consequence of exposure
  [Diagram: Smoking, Weight Loss (conf?), Lung Ca]
    Consequence of disease
  [Diagram: Quetelet Index, Abdominal skinfold (conf?), Type II Diabetes]
    Skinfold is a surrogate measure of body mass
  [Diagram: Red Meat diet, Tax Id Number (conf?), Colon Ca]
    No plausible association with disease

Confounder or consequence?
  Studying decreased risk of MI due to moderate alcohol consumption
  Higher HDL cholesterol is independently associated with lower risk of MI
  HDL increases as a result of moderate alcohol use
  Is HDL a confounder?
Controlling confounding in the design of a study
  Randomization: on average, distributes known and unknown confounders evenly across study groups
  Restriction: limit subjects to one category of a confounder (e.g., if sex confounds, use only men)
  Matching: equalize groups on the confounder (must be followed by a matched analysis)

Evaluating confounder, disease and exposure
  Construct tables for
    confounder and disease
    confounder and exposure
  Examine odds ratios (or other effect estimates)
    Are the associations strong?
    Are they likely to be causal?

Stratification in analysis: adjusting for confounding
  Compute the crude OR from a 2x2 table
  Stratification breaks the crude table into separate 2x2 tables for each level of the confounding factor
    analogous to standardization
    many factors and many levels can cause tables with empty cells

Is there confounding?
  Do stratum-specific RR estimates differ from the crude estimate?
  Does the adjusted RR estimate differ from the crude estimate?
    Mantel-Haenszel
    Multivariate modeling
  Differences of >10% in RR when the factor is included in the model indicate that confounding is present

Confounding in stratified analyses
  Stratify by the potential confounder
  Compute stratum-specific OR estimates
  If uniform but different from the crude OR, then confounding is probably present: calculate the adjusted OR (e.g., use Mantel-Haenszel)
  If NOT uniform across strata, then effect modification (interaction) may be present: report stratum-specific estimates; do not adjust

Is toluene exposure associated with diabetes?
                         Diabetes   CTRL
  Exposed to Toluene        30       18
  Not Exposed               70       82
  Crude OR = 1.95 (1.0-3.8)
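The crude odds ratio in the toluene table above can be reproduced with a short calculation. The following is a minimal sketch, not the method used to prepare the slides: the function name `odds_ratio_ci` is hypothetical, and the Woolf (log-based) confidence interval is an assumption, since the slides do not say how their intervals were computed.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-based) confidence interval for a 2x2 table.

    a = exposed cases,   b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    The Woolf interval is an assumed choice; other methods give
    slightly different limits.
    """
    or_ = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se_log)
    upper = math.exp(math.log(or_) + z * se_log)
    return or_, lower, upper


# Counts from the toluene and diabetes crude table.
crude_or, crude_lower, crude_upper = odds_ratio_ci(30, 18, 70, 82)
print(f"Crude OR = {crude_or:.2f} ({crude_lower:.1f}-{crude_upper:.1f})")
```

Under these assumptions the result agrees with the crude OR and interval quoted on the slide to the precision shown there.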
Does age confound the diabetes-toluene association?
  Age <40
               diabetes   ctrl
    Tolu.          5        8
    Not Tolu.     45       72
    OR(1) = 1.0 (0.3-3.1)
  Age >40
               diabetes   ctrl
    Tolu.         25       10
    Not Tolu.     25       10
    OR(2) = 1.0 (0.4-2.8)

Why?
  Age confounds because it is associated with diabetes, regardless of toluene exposure
  Toluene exposed
               Diab   Ctrl
    >40         25     10
    <40          5      8
    OR = 4.0 (1.1-14.7)
  Not toluene exposed
               Diab   Ctrl
    >40         25     10
    <40         45     72
    OR = 4.0 (1.8-9.0)

Stratification example 1
  Crude OR = 1.95
  OR in each age group is 1.0
  When the strata ORs are roughly equal, but different from the crude OR, it indicates confounding
  Age is a confounder
  We should adjust for age in the analysis
    Mantel-Haenszel adjusted OR (you will not need to memorize the formula)

ETOH and MI
               MI    No MI
  Alcohol      71     52
  No ETOH      29     48
  OR = 2.26 (1.26-4.04)

Stratify by smoking
  Non-smokers
               MI    Ctrl
    Alcohol     8     16
    No ETOH    22     44
    OR = 1.0 (0.38-2.65)
  Smokers
               MI    Ctrl
    Alcohol    63     36
    No ETOH     7      4
    OR = 1.0 (0.29-3.45)

Physical Activity (P.A.) and Stroke
               Stroke   No Stroke
  High          190       266
  Low           176       157
  OR = 0.64 (0.48-0.85)
Stratify by gender
  Men
               Stroke   Ctrl
    P.A. Hi      141     208
    P.A. Lo      144     112
    OR = 0.53 (0.38-0.73)
  Women
               Stroke   Ctrl
    P.A. Hi       49      58
    P.A. Lo       32      45
    OR = 1.19 (0.65-2.16)

Controlling Confounding in the Analysis: Adjusted odds ratio
  Stratified analysis (examine strata ORs)
  Mantel-Haenszel adjusted OR: a weighted average of stratum-specific ORs
    $OR_{MH} = \sum_i (a_i d_i / N_i) \,/\, \sum_i (b_i c_i / N_i)$
    where $N_i$ = total subjects in sub-table $i$, $a_i$ and $d_i$ are its diagonal cells (exposed cases, unexposed controls), and $b_i$ and $c_i$ are its off-diagonal cells

Mantel-Haenszel Adjusted OR
  $\widehat{OR}_{MH} = \dfrac{a_1 d_1 / N_1 + a_2 d_2 / N_2 + \dots}{b_1 c_1 / N_1 + b_2 c_2 / N_2 + \dots}$

Trisomy 21 and spermicide use: Case-Control Study
               Down's   Ctrl
  Sp +            4      109
  Sp -           12     1145
  N = 1270
  OR = (4)(1145) / [(109)(12)] = 3.5

Stratify by Maternal Age
  Age <35
               Down's   Ctrl
    Sp +          3      104
    Sp -          9     1059
    N = 1175
    OR = (3)(1059) / [(104)(9)] = 3.4
  Age 35+
               Down's   Ctrl
    Sp +          1        5
    Sp -          3       86
    N = 95
    OR = (1)(86) / [(5)(3)] = 5.7

Mantel-Haenszel Adjusted OR
  $\widehat{OR}_{MH} = \dfrac{(3)(1059)/1175 + (1)(86)/95}{(9)(104)/1175 + (3)(5)/95} = 3.8$
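The Mantel-Haenszel calculation above can be written as a short function. This is a minimal sketch under stated assumptions: the function name `mantel_haenszel_or` and the tuple layout (exposed cases, exposed controls, unexposed cases, unexposed controls) are illustrative choices, not part of the slides.

```python
def mantel_haenszel_or(strata):
    """Mantel-Haenszel adjusted odds ratio.

    strata: iterable of (a, b, c, d) tuples, one per stratum, where
      a = exposed cases,   b = exposed controls,
      c = unexposed cases, d = unexposed controls.
    Returns sum(a*d/N) / sum(b*c/N), with N the stratum total.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# Spermicide use and trisomy 21, stratified by maternal age (<35, 35+).
trisomy_strata = [
    (3, 104, 9, 1059),  # maternal age <35, N = 1175
    (1, 5, 3, 86),      # maternal age 35+, N = 95
]
or_mh = mantel_haenszel_or(trisomy_strata)
print(f"Mantel-Haenszel adjusted OR = {or_mh:.1f}")
```

Each stratum contributes its cross-products divided by its own total, so the large <35 stratum dominates the adjusted estimate.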
Multivariate Statistics
  Linear:
    $y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_k x_k$
  Logistic: exp(b) gives you the adjusted OR
    $\log(\text{odds}) = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_k x_k$
    for $x_1$ coded as a [0,1] variable, $OR_{x_1} = e^{b_1}$ (adjusted for all other $x_i$)
  Cox: exp(b) gives you the adjusted RR
    $\log(\text{haz}) = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_k x_k$

Logistic Regression: Coding Variables
  Continuous x causes b to be interpreted as the increase in log odds per unit change in x
  Interaction of two variables is represented by a single product term, $x_1 x_2$ (with only one b)
  Interpretation of models which include interaction and continuous terms can be tricky
  Consult a friendly Biostatistician

Recognizing Confounding in logistic regression models
  Logistic Regression: $\ln[y/(1-y)] = a + b_1 X_1 + b_2 X_2 + \dots + b_n X_n$
  $e^{b_i} = OR_{x_i}$ (per unit change in $X_i$)
  Does $b_{x_i}$ change when $X_k$ factor(s) are added?
  Does the crude OR differ from the adjusted OR?
  Does the model log-likelihood change (score test)?

Logistic coefficients and ORs
  Variable (x)               Coefficient (b)   Odds Ratio
  intercept                      -4.56            ----
  gender (1=M, 0=F)               1.31            3.71
  smoking (1=yes, 0=no)           0.70            2.01
  HTN (1=yes, 0=no)               0.51            1.67
  $e^{b} = OR$

Interaction (Effect Modification)
  Statistical, biological and social semantic meanings differ.
  Does the RR estimate differ at each level of a third variable?
    Homogeneity of RR
  Biological reasoning: is there something about the third factor that changes the way the exposure-disease association works?

Stratification Example: Crude table
  Hepatitis C Virus infection and hepatocellular carcinoma
               Case   Control
  HepC+         63      102
  HepC-         24      357
  Crude OR = 9.2 (5.5-15.4)
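The relationship between logistic coefficients and odds ratios in the coefficient table above can be checked by exponentiating each b. This is a minimal sketch: the dictionary and variable names are illustrative, and the combined odds ratio at the end assumes a model with no interaction term between gender and smoking.

```python
import math

# Coefficients copied from the logistic coefficient table
# (the intercept is omitted because it has no odds ratio).
coefficients = {
    "gender (1=M, 0=F)": 1.31,
    "smoking (1=yes, 0=no)": 0.70,
    "HTN (1=yes, 0=no)": 0.51,
}

# Adjusted OR for each [0,1] variable is exp(b).
odds_ratios = {name: math.exp(b) for name, b in coefficients.items()}
for name, value in odds_ratios.items():
    print(f"{name}: OR = {value:.2f}")

# Assumption: no interaction term, so coefficients add on the log-odds
# scale and odds ratios multiply. OR for a male smoker relative to a
# female non-smoker with the same HTN status:
combined_or = math.exp(coefficients["gender (1=M, 0=F)"]
                       + coefficients["smoking (1=yes, 0=no)"])
print(f"male smoker vs female non-smoker: OR = {combined_or:.2f}")
```

If the model did include a product term for two variables, the combined odds ratio would also need that term's coefficient, which is one reason the slides call interpretation of such models tricky.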
Stratify by HBV infection
  HBV+
               Case   Ctrl
    HepC+       37     40
    HepC-        1     28
    OR(1) = 25.9 (4.2 - *)
  HBV-
               Case   Ctrl
    HepC+       26     62
    HepC-       23    329
    OR(2) = 6.0 (3.2-11.1)
  Are the stratum-specific odds ratios statistically different?
  ORs are not statistically different: should we adjust or report strata ORs?
  M-H adjusted odds ratio: OR = 8.1

Stratification Example 2: HBV, HepC and Liver Ca
  The ORs in the HBV strata look quite different
  Does this indicate effect modification?
  Effect modification is a finding in the data that needs to be elaborated; it is a natural phenomenon that exists independently
  Confounding is a nuisance that needs to be eliminated (by adjusting, matching, restriction, etc.)

Effect Modification (also known as interaction)
  When the measure of effect differs between strata
  Can apply to RR or risk difference (AR) measures
  Presumed additive or multiplicative effect model depends on biology of disease and factor
  Synergy: when effect exceeds that expected under the chosen model
    Additive: $RR_{A+B} \gg RR_A + RR_B$
    Multiplicative: $RR_{A \times B} \gg RR_A \times RR_B$

Schematic of additive model for case-control data (Szklo & Nieto, 2000)
  Additive model effects: Expected = OR(A) + OR(Z) - 1.0
  [Figure: bar chart on an OR axis marked 1.0 (baseline, BL), 2.0, 3.0, 4.0 and 7.0, with bars built from BL, A and Z segments; the Observed joint bar exceeds the Expected bar by the "excess joint increase"]

RR estimates in strata: guidelines for heterogeneity (Szklo & Nieto, 2000)
  Suspected E-M factor absent   Suspected E-M factor present   Adjust or report strata RRs?
            2.3                            2.6                  Adjust
            2.0                           20.0                  Report
            0.5                            3.0                  Report (qualitative diff)
            3.0                            4.5                  Maybe both

Stratified analysis flow chart
  Is there an association between risk factor (X) and disease (Y)?
    YES: Is it affected by bias?
  Are STRATUM RRs different from the crude RR (confounding by strata factor)?
    YES: Estimate magnitude and direction of effect on RR
      Stratum RRs are similar to each other: confounding; adjust for the stratum factor
      Stratum RRs are statistically different from each other: interaction/effect modification; report strata RRs, don't adjust
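The expected joint effect under each model in the effect modification slides above can be computed directly. This is a minimal sketch: the function name `expected_joint_or` is hypothetical, and the inputs OR(A) = 2.0, OR(Z) = 3.0 and observed joint OR = 7.0 are illustrative values assumed to correspond to the axis values in the schematic, not data reported in the slides.

```python
def expected_joint_or(or_a, or_z):
    """Expected joint OR for two factors under each effect model.

    Additive model:       OR(A) + OR(Z) - 1.0
    Multiplicative model: OR(A) * OR(Z)
    """
    additive = or_a + or_z - 1.0
    multiplicative = or_a * or_z
    return additive, multiplicative


# Illustrative values (assumed mapping to the schematic's axis values).
or_a, or_z, observed_joint = 2.0, 3.0, 7.0
additive, multiplicative = expected_joint_or(or_a, or_z)

print(f"expected (additive)       = {additive:.1f}")
print(f"expected (multiplicative) = {multiplicative:.1f}")
print(f"excess over additive      = {observed_joint - additive:.1f}")
print(f"exceeds multiplicative?     {observed_joint > multiplicative}")
```

With these inputs the observed joint effect exceeds the expectation under both models, so whether it counts as synergy does not depend on the model chosen; with other values the two models can disagree.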
Considerations
  Collect data on potential confounders: if you don't get it, you can't control for it
  Try to reason what the potential effect of confounding might be
    Magnitude and direction (as with bias)
    Coffee drinking and MI: smoking may be a positive confounder because smokers are at increased risk of MI (provided smoking is also associated with coffee drinking)

Generally speaking...
  A strong association is less likely to be explained by confounding than a weak one
  For an observed association to be the sole result of confounding by another factor, the factor must have a stronger association with disease than the one observed
    If RR = 10.0 for smoking and lung Ca, then a confounder would need RR > 10.0

Logistic Regression
  Allows simultaneous adjustment for several confounders (also allows interactions)
  Uses multiple variables to predict disease status (dichotomous outcome)
  Odds ratios can be obtained directly from the regression coefficients
  Standard method seen in most case-control study analyses (matched and unmatched analyses)

Conclusion
  What is confounding?
    How do we recognize, evaluate and control it?
  What is effect modification? [also known as interaction, effect measure modification, etc.]
    How do we recognize and evaluate it?
    Why is it important?