Main objective of Epidemiology. Statistical Inference. Statistical Inference: Example.

Main objective of Epidemiology

Inference to a population.

Example: Treatment of hypertension
Research question (hypothesis): Is treatment A better than treatment B for patients with hypertension?
Study design: Clinical study. Select a sample of 143 female nurses (30-50 years of age, with hypertension) and randomly assign them to treatment A or treatment B.
Statistical analysis of the data.
Inference to the population.

Statistical Inference

Study sample: 143 female nurses (30-50 years of age, with hypertension).
Target population: female nurses (30-50 years of age, with hypertension).
External population: males and females, all age groups, with hypertension.

Diagram: external population, target population, study sample.

Statistical Inference: Example

How do we evaluate the performance of a drug before its release? Based on the new drug's performance among a sample of patients with the disease, a conclusion is drawn regarding the drug's performance were it to be used among a population of patients with the disease.

Concerns in Epidemiologic studies

I. Internal validity: Was the study carefully designed and analyzed?
II. External validity (Generalizability): Are the results applicable to the external population?
III. Reproducibility of the results (precision): Are the results reproducible?

Internal validity

The extent to which the analytic inference derived from the study sample is correct for the target population. It is threatened by:
1. Sample selection bias
2. Information bias
3. Confounding bias
4. Reverse causality

Sample selection bias

The sample should be representative of the target population. If the relation between the exposure and the outcome differs between those who participate and those who are eligible but do not participate, we have selection bias.

I. Referral bias: the sample is drawn from referrals from a specific community.

Example: oral contraceptive (OC) use and thromboembolism (Thrombo.), including a hospital-based sample.

               OC    No OC   Total
Thrombo.       20     80      100
No Thrombo.    15     85      100
OR = 1.4

               OC    No OC   Total
Thrombo.       50     50      100
No Thrombo.    15     85      100
OR = 5.7

II. Self-selection bias: subjects select themselves into a study.
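The two odds ratios in the oral contraceptive example above can be checked with the cross-product formula. This is a minimal sketch; the function name `odds_ratio` and the variable names are illustrative, and the counts are read from the two tables (rows: thromboembolism yes/no, columns: OC yes/no).

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for a 2x2 table.

    a = cases exposed,     b = cases unexposed,
    c = non-cases exposed, d = non-cases unexposed.
    """
    return (a * d) / (b * c)


# Counts from the two tables: (Thrombo. & OC, Thrombo. & no OC,
#                              no Thrombo. & OC, no Thrombo. & no OC).
or_first = odds_ratio(20, 80, 15, 85)
or_second = odds_ratio(50, 50, 15, 85)

print(f"OR, first table:  {or_first:.2f}")
print(f"OR, second table: {or_second:.2f}")
```

The comparison group (15 and 85) is identical in both tables, so the whole change in the odds ratio comes from the larger share of OC users among the cases in the second sample.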

Sample selection bias

III. Healthy worker effect (HWE): healthy workers remain in the study (the unemployed and the retired are less healthy).

Example: lifting heavy weights and back pain (BP). Forestry workers and the general population were each followed up for BP versus no BP. The results showed NO association. Why?

IV. Loss to follow-up (in cohort studies).

Example: dust exposure and chronic lung disease (CLD). Full cohort, with the percentage lost to follow-up in each cell in parentheses:

            CLD            No CLD        Total
Dust        1,000 (10%)    9,000 (1%)    10,000
No dust       500 (2%)     9,500 (0%)    10,000
RR = 2

After loss to follow-up:

            CLD     No CLD    Total
Dust        900     8,910     9,810
No dust     490     9,500     9,990
RR = 1.87

Minimizing Selection Bias

To minimize selection bias: Select a representative sample of the target population.

Information Bias (Misclassification Bias)

Errors in classifying the exposure or the outcome.
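The effect of loss to follow-up in the dust and CLD example above can be reproduced in a few lines. This is a sketch under one assumption: the percentages in the first table are read as the share of each cell lost to follow-up, which matches the counts in the second table. The helper names `relative_risk` and `remaining` are illustrative.

```python
def relative_risk(a, b, c, d):
    """RR = risk in exposed / risk in unexposed.

    a, b = exposed with / without disease; c, d = unexposed with / without disease.
    """
    return (a / (a + b)) / (c / (c + d))


def remaining(count, loss_pct):
    """Subjects still under observation after loss_pct percent are lost."""
    return count - count * loss_pct // 100


# Cell order: dust & CLD, dust & no CLD, no dust & CLD, no dust & no CLD.
full = (1000, 9000, 500, 9500)
loss_pct = (10, 1, 2, 0)  # assumed meaning: percent lost in each cell

observed = tuple(remaining(n, p) for n, p in zip(full, loss_pct))
rr_full = relative_risk(*full)
rr_observed = relative_risk(*observed)

print(observed)  # (900, 8910, 490, 9500)
print(round(rr_full, 2), round(rr_observed, 2))
```

Because the loss is heaviest among exposed subjects who developed the disease, the observed RR falls below the RR in the full cohort.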

Misclassification of Exposure

Examples: recall bias, interviewer bias, error during data entry.
If misclassification of exposure is related to the outcome: differential misclassification.
If misclassification of exposure is not related to the outcome: non-differential misclassification.

Misclassification of Outcome

Example: diagnostic bias.
If misclassification of outcome is related to the exposure: differential misclassification.
If misclassification of outcome is not related to the exposure: non-differential misclassification.

Minimize Information Bias

To minimize information bias: standardize detection procedures; blind subjects and observers.

Confounding Bias: Example 1

Does gambling cause cancer? A random sample from Nevada (gambling is legal) and a random sample from Utah (gambling is illegal), matched on age, sex, residence and family income, were each followed up for 10 years. If we eliminate gambling, do we prevent 86,000 cancer deaths a year?

Confounding Bias: Example 2

TIME Magazine 1978: Sonic Doom - Can jet noise kill? The study was conducted in Los Angeles. It concluded that people living in areas where jet plane noise exceeded 90 decibels had a significantly increased death rate. Why?

Confounding Bias: Definition

Confounding is present when the association between an exposure and an outcome is distorted by an extraneous third variable (referred to as a confounding variable). For a variable to be a confounder, it must be:
1. A risk factor for the disease
2. Associated with the exposure being studied

Confounding Bias: Example 3

Study the association between coffee drinking and lung cancer (LC).

              LC yes   LC no
Coffee yes      80       15
Coffee no       20       85
OR = (80 x 85) / (15 x 20) = 22.7

Is smoking a confounding variable? Smoking is associated with both coffee drinking and lung cancer, so smoking is a confounding variable. What would you conclude?

Confounding Bias: Minimize bias

Research design: use of a randomized clinical trial; restriction; matching.
Data analysis: stratification; multivariate statistical techniques.

Example 1: Association between place of residence and chronic bronchitis

             Bronchitis yes   Bronchitis no
Urban             194             2219
Rural              69             1208
RR = 1.49

Example 1: Stratified by smoking

             Smokers                Non-smokers
             Bronchitis             Bronchitis
Residence    Yes      No            Yes      No
Urban        167      1094          27       1125
Rural        53       417           16       791
             RR = 1.17              RR = 1.18

But 1.17 ≈ 1.18 ≠ 1.49 (crude). Smoking is a confounding variable!

Example 2: Study the association between alcohol consumption and myocardial infarction (MI)

               MI yes   MI no
Alcohol yes      71       52
Alcohol no       29       48
OR = 2.26

Example 2: Stratified by smoking

             Smokers              Non-smokers
Alcohol      MI       No MI       MI       No MI
Yes          8        16          63       36
No           22       44          7        4
             OR = 1               OR = 1

But 1 = 1 ≠ 2.26 (crude). Smoking is a confounding variable!

Example 3: Study the association of Treatment 1 and Treatment 2 with survival

         Alive   Dead   Total
T1        40      60     100
T2        60      40     100
Total    100     100     200
RR (crude) = 0.67
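For the residence and chronic bronchitis example above, the comparison of crude and stratum-specific relative risks can be sketched as follows. The function and variable names are illustrative; the counts are those of the stratified table, and the crude table is rebuilt by adding the two strata cell by cell.

```python
def relative_risk(a, b, c, d):
    """RR for a 2x2 table: a, b = exposed with / without disease;
    c, d = unexposed with / without disease."""
    return (a / (a + b)) / (c / (c + d))


# Cell order: urban & bronchitis, urban & none, rural & bronchitis, rural & none.
strata = {
    "smokers": (167, 1094, 53, 417),
    "non-smokers": (27, 1125, 16, 791),
}

# Collapsing the strata cell by cell gives back the crude table.
crude = tuple(sum(cells) for cells in zip(*strata.values()))

rr_crude = relative_risk(*crude)
rr_by_stratum = {name: relative_risk(*cells) for name, cells in strata.items()}

print("crude table:", crude)
print(f"crude RR: {rr_crude:.2f}")
for name, rr in rr_by_stratum.items():
    print(f"{name} RR: {rr:.2f}")
```

The two stratum-specific values are close to each other and both lie well below the crude value, which is the pattern the slides attribute to confounding.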

Example 3: Stratified by gender

         Females            Males
         Alive    Dead      Alive    Dead
T1       24       3         16       57
T2       58       30        2        10
         RR = 1.35          RR = 1.32

But 1.35 ≈ 1.32 ≠ 0.67 (crude). Gender is a confounding variable!

Mantel-Haenszel: Calculating an adjusted RR

i-th stratum:

          Outcome   No outcome   Total
E         a_i       b_i
No E      c_i       d_i
Total                            N_i

$$ RR_{MH} = \frac{\sum_i a_i (c_i + d_i) / N_i}{\sum_i c_i (a_i + b_i) / N_i} $$

Mantel-Haenszel: Example

         Females              Males
         Success   Failure    Success   Failure
T1       24        3          16        57
T2       58        30         2         10
         RR = 1.35            RR = 1.32

$$ RR_{adjusted} = RR_{MH} = \frac{24(58 + 30)/115 + 16(2 + 10)/85}{58(24 + 3)/115 + 2(16 + 57)/85} = 1.34 $$

External validity

Also known as generalizability. The extent to which the analytic inference derived from the study sample is correct for the external population.
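The Mantel-Haenszel adjusted RR from the treatment example above can be checked numerically. This is a minimal sketch of the formula as given on the slide; the function name `mantel_haenszel_rr` and the tuple layout (a, b, c, d) per stratum are illustrative choices.

```python
def mantel_haenszel_rr(strata):
    """Mantel-Haenszel adjusted relative risk.

    strata: iterable of (a, b, c, d) per stratum, where
    a, b = exposed with / without outcome and
    c, d = unexposed with / without outcome.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d  # stratum total N_i
        numerator += a * (c + d) / n
        denominator += c * (a + b) / n
    return numerator / denominator


# T1 is treated as the exposed group and T2 as the unexposed group.
strata = [
    (24, 3, 58, 30),   # females: N = 115
    (16, 57, 2, 10),   # males:   N = 85
]

rr_mh = mantel_haenszel_rr(strata)
print(f"Mantel-Haenszel RR: {rr_mh:.2f}")
```

With a single stratum the expression reduces to the ordinary relative risk, which is a quick sanity check on any implementation.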

External validity

External validity is dependent on internal validity. Maximize external validity by selecting study subjects from a target population as similar as possible to the external population.

Reproducibility

Subjects in a study are always a sample of a population. Repeated sampling results in a range of estimates across different samples: sampling variation. The smaller the sample, the less reproducible the sample estimate will be. To decrease sampling error, increase the sample size.

Diagram: target population with Sample 1, Sample 2, Sample 3 and the study sample.

Example 1: Association between severe injury and death

                     Death   No death
Severe injury          45       55
No severe injury        6       94
RR = 0.45 / 0.06 = 7.5

Example 1: Stratified by age

                     < 65 years             > 65 years
                     Death    No death      Death    No death
Severe injury        15       35            30       20
No severe injury     2        28            4        66
                     RR = 4.5               RR = 10.5
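The crude and age-specific relative risks in the severe injury example above can be recomputed exactly. The sketch below uses Python's `fractions.Fraction` so that no intermediate rounding of the risks enters the result; the function and variable names are illustrative.

```python
from fractions import Fraction


def relative_risk(a, b, c, d):
    """Exact RR for a 2x2 table: a, b = exposed with / without outcome;
    c, d = unexposed with / without outcome."""
    return Fraction(a, a + b) / Fraction(c, c + d)


# Cell order: severe & death, severe & no death, not severe & death, not severe & no death.
strata = {
    "< 65 years": (15, 35, 2, 28),
    "> 65 years": (30, 20, 4, 66),
}

# Adding the strata cell by cell gives the crude table (45, 55, 6, 94).
crude = tuple(sum(cells) for cells in zip(*strata.values()))

print("crude:", float(relative_risk(*crude)))        # 7.5
for age_group, cells in strata.items():
    print(age_group, float(relative_risk(*cells)))   # 4.5, then 10.5
```

Working with exact fractions matters here: rounding the unexposed risk 2/30 to 0.067 before dividing gives 4.48 rather than 4.5 in the younger stratum.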

Example 1: 4.5 ≠ 10.5 ≠ 7.5 (crude). Age is an effect modifier!

When the exposure-outcome relationship differs across the levels of a third variable, we have interaction (effect modification). The crude RR hides important effects.
Solution: Report the RR separately for each level of the effect modifier.

Example 2:

          D      No D    Total
E         200    1800    2000
No E      400    3600    4000
Total     600    5400    6000
RR = (200/2000) / (400/4000) = 1

Example 2: Stratified by gender

          Females                   Males
          D      No D    Total      D      No D    Total
E         110    390     500        90     1410    1500
No E      380    2620    3000       20     980     1000
Total     490    3010    3500       110    2390    2500
          RR = 1.74                 RR = 3

1.74 ≠ 3 ≠ 1 (crude). Gender is an effect modifier!

A confounder or an effect modifier?

Male sex and hip fracture: crude RR = 2. Stratified by age: young RR = 0.7; old RR = 3.

Obesity and breast cancer: crude RR = 1.6. Stratified by menopausal status: pre-menopausal RR = 1.1; post-menopausal RR = 2.

A confounder or an effect modifier?

Smoking and low birth weight (LBW): crude RR = 3. Stratified by age: young RR = 2; old RR = 4.

Example 2: A cohort study was conducted to evaluate the association between air pollution and a specific lung disease. From the information below, would you conclude that gender is a confounder or an effect modifier?

          D      No D    Total
E         200    1800    2000
No E      400    3600    4000
Total     600    5400    6000

3500 individuals were females.
500 females were exposed.
490 females had the disease.
110 females had the disease and were exposed.
1410 males were exposed and did not have the disease.
20 males were not exposed and had the disease.

T (True) or F (False): To ensure that the study is internally valid we need to check for the 4 main biases: selection bias, information bias, confounding and effect modification.

T (True) or F (False): Generalizability is the extent to which the inference derived from the study sample is correct for the target population.

External validity depends on ______.

If you increase the sample size you increase ______.
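For the air pollution exercise above, the gender-specific tables can be derived from the listed facts by subtraction before any RR is computed. The sketch below is one way to do the bookkeeping; the dictionary keys and the helper name `relative_risk` are illustrative, and only the counts come from the exercise.

```python
def relative_risk(t):
    """RR from a table given as a dict with keys E_D, E_noD, noE_D, noE_noD."""
    risk_exposed = t["E_D"] / (t["E_D"] + t["E_noD"])
    risk_unexposed = t["noE_D"] / (t["noE_D"] + t["noE_noD"])
    return risk_exposed / risk_unexposed


# Crude table from the exercise.
crude = {"E_D": 200, "E_noD": 1800, "noE_D": 400, "noE_noD": 3600}

# Facts stated for females.
f_total, f_exposed, f_diseased, f_exposed_diseased = 3500, 500, 490, 110

females = {
    "E_D": f_exposed_diseased,
    "E_noD": f_exposed - f_exposed_diseased,
    "noE_D": f_diseased - f_exposed_diseased,
}
# The last cell is whatever remains of the 3500 females.
females["noE_noD"] = f_total - sum(females.values())

# Males are the crude table minus the female table, cell by cell.
males = {cell: crude[cell] - females[cell] for cell in crude}

for label, table in (("crude", crude), ("females", females), ("males", males)):
    print(label, table, round(relative_risk(table), 2))
```

The derived male table can be checked against the two facts given for males (1410 exposed without the disease, 20 unexposed with the disease) before interpreting the three relative risks.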

T (True) or F (False): Blinding the interviewer minimizes observation bias.

T (True) or F (False): Recall bias is common in case-control studies while loss to follow-up is common in randomized clinical trials.

A cohort study is planned to investigate the association between maternal alcohol consumption during pregnancy and fetal alcohol syndrome (a disease that is difficult to diagnose) in newborn children. What are the possible biases?

In a closed cohort study, most of those who were exposed and developed the disease died before the end of the study. What will happen to the RR?

What are the characteristics of a confounder?

T (True) or F (False): The crude relative risk for smoking and myocardial infarction was 2.8. When stratified by gender, the relative risk was 5.6 for males and 1.5 for females. The study investigators concluded that the results are biased by gender, which is an effect modifier, and hence they adjusted for gender in the analysis.