Bias. A systematic error (caused by the investigator or the subjects) that causes an incorrect (overor under-) estimate of an association.

Similar documents
Challenges of Observational and Retrospective Studies

Bias. Sam Bracebridge

Biases in clinical research. Seungho Ryu, MD, PhD Kanguk Samsung Hospital, Sungkyunkwan University

Main objective of Epidemiology. Statistical Inference. Statistical Inference: Example. Statistical Inference: Example

Bias. Zuber D. Mulla

Confounding and Bias

Biostatistics and Epidemiology Step 1 Sample Questions Set 1

Welcome to this series focused on sources of bias in epidemiologic studies. In this first module, I will provide a general overview of bias.

In the 1700s patients in charity hospitals sometimes slept two or more to a bed, regardless of diagnosis.

Module 2: Fundamentals of Epidemiology Issues of Interpretation. Slide 1: Introduction. Slide 2: Acknowledgements. Slide 3: Presentation Objectives

Epidemiology: Overview of Key Concepts and Study Design. Polly Marchbanks

Purpose. Study Designs. Objectives. Observational Studies. Analytic Studies

Biases in clinical research. Seungho Ryu, MD, PhD Kanguk Samsung Hospital, Sungkyunkwan University

Is There An Association?

Controlling Bias & Confounding

DECLARATION OF CONFLICT OF INTEREST

Epidemiology 2200b Lecture 3 (continued again) Review of sources of bias in Case-control, cohort and ecologic(al) studies

An Introduction to Epidemiology

Case-Control Studies

Rapid appraisal of the literature: Identifying study biases

Confounding and Interaction

University of Wollongong. Research Online. Australian Health Services Research Institute

Observational Study Designs. Review. Today. Measures of disease occurrence. Cohort Studies

Screening for Disease

Clinical Research Design and Conduction

Epidemiological study design. Paul Pharoah Department of Public Health and Primary Care

Biases in clinical research. Seungho Ryu, MD, PhD Kanguk Samsung Hospital, Sungkyunkwan University

Understanding Statistics for Research Staff!

Confounding Bias: Stratification

EPIDEMIOLOGY-BIOSTATISTICS EXAM Midterm 2004 PRINT YOUR LEGAL NAME:

Bias and confounding. Mads Kamper-Jørgensen, associate professor, Section of Social Medicine

Trial Designs. Professor Peter Cameron

Preventive Services Explained

Strategies for Data Analysis: Cohort and Case-control Studies

1 Case Control Studies

How do we know that smoking causes lung cancer?

Observational study II

Incorporating Clinical Information into the Label

Glossary of Practical Epidemiology Concepts

5 Bias and confounding

INTERPRETATION OF STUDY FINDINGS: PART I. Julie E. Buring, ScD Harvard School of Public Health Boston, MA

Chapter 2. Epidemiological and Toxicological Studies

Experimental Design. Terminology. Chusak Okascharoen, MD, PhD September 19 th, Experimental study Clinical trial Randomized controlled trial

Types of Biomedical Research

12/26/2013. Types of Biomedical Research. Clinical Research. 7Steps to do research INTRODUCTION & MEASUREMENT IN CLINICAL RESEARCH S T A T I S T I C

INTERNAL VALIDITY, BIAS AND CONFOUNDING

Case-control studies. Hans Wolff. Service d épidémiologie clinique Département de médecine communautaire. WHO- Postgraduate course 2007 CC studies

Data that can be classified as belonging to a distinct number of categories >>result in categorical responses. And this includes:

The Muscatine Study Heart Health Survey

Strategies for data analysis: case-control studies

Gender Analysis. Week 3

Blue Precision HMO Annual Health Assessment Form - Adult

12 CANCER Epidemiology Methodological considerations

Study Design STUDY DESIGN CASE SERIES AND CROSS-SECTIONAL STUDY DESIGN

<INSERT COUNTRY/SITE NAME> All Stroke Events

Causal Association : Cause To Effect. Dr. Akhilesh Bhargava MD, DHA, PGDHRM Prof. Community Medicine & Director-SIHFW, Jaipur

ADENIYI MOFOLUWAKE MPH APPLIED EPIDEMIOLOGY WEEK 5 CASE STUDY ASSIGNMENT APRIL

Collecting Data Example: Does aspirin prevent heart attacks?

RESEARCH. Katrina Wilcox Hagberg, 1 Hozefa A Divan, 2 Rebecca Persson, 1 J Curtis Nickel, 3 Susan S Jick 1. open access

Online Supplementary Material

Prognosis. VCU School of Medicine M1 Population Medicine Class

Supplementary Online Content

IN-NETWORK MEMBER PAYS. Out-of-Pocket Maximum (Includes a combination of deductible, copayments and coinsurance for health and pharmacy services)

Using Epidemiology to Identify the Cause of Disease: Cohort study

Two-sample Categorical data: Measuring association

What is a case control study? Tarani Chandola Social Statistics University of Manchester

Evidence-Based Medicine Journal Club. A Primer in Statistics, Study Design, and Epidemiology. August, 2013

Sampling. (James Madison University) January 9, / 13

General Biostatistics Concepts

Appendix G: Methodology checklist: the QUADAS tool for studies of diagnostic test accuracy 1

7/6/2012. University Pharmacy 5254 Anthony Wayne Drive Detroit, MI (313)

Chapter 1 - Sampling and Experimental Design

Designing Clinical Trials in Perioperative Sleep Medicine

Gathering. Useful Data. Chapter 3. Copyright 2004 Brooks/Cole, a division of Thomson Learning, Inc.

Higher Psychology RESEARCH REVISION

Tool to Assess Risk of Bias in Cohort Studies

Gender Identity and Expression. Week 8

Please evaluate this material by clicking here:

Medical management of abdominal aortic aneurysms

Descriptive Research Methods. Depending on the type of question researchers want answered, will depend on the way they investigate it

Handout 1: Introduction to the Research Process and Study Design STAT 335 Fall 2016

Welcome to this third module in a three-part series focused on epidemiologic measures of association and impact.

Highlights. Attitudes and Behaviors Regarding Weight and Tobacco. A scientific random sample telephone survey of 956 citizens in. Athens-Clarke County

Part 1: Epidemiological terminology. Part 2: Epidemiological concepts. Participant s Names:

Understanding Preventive Care

Anssi Auvinen University of Tampere STUK Radiation and Nuclear Safety Authority International Agency for Research on Cancer

UNIT I SAMPLING AND EXPERIMENTATION: PLANNING AND CONDUCTING A STUDY (Chapter 4)

Health Studies 315. Clinical Epidemiology: Evidence of Risk and Harm

CLINICAL PRACTICE EVALUATION I: MEDICAL RECORD REVIEW (Adult Patient Population)

Abdominal aortic aneurysm screening: A decision for men aged 65 or over

3. Factors such as race, age, sex, and a person s physiological state are all considered determinants of disease. a. True

Observational Medical Studies. HRP 261 1/13/ am

SCENARIO 1: ICD-10-CM

Measures of Association

2. Studies of Cancer in Humans

Research Methodology Workshop. Study Type

A situation analysis of health services. Rachel Jewkes, MRC Gender & Health Research Unit, Pretoria, South Africa

Epidemiologic Methods and Counting Infections: The Basics of Surveillance

exposure/intervention

You can t fix by analysis what you bungled by design. Fancy analysis can t fix a poorly designed study.

Transcription:

Bias A systematic error (caused by the investigator or the subjects) that causes an incorrect (overor under-) estimate of an association. Here, random error is small, but systematic errors have led to an inaccurate estimate Also biased Precise, but biased True Effect 0 1.0 10 Risk Ratio, Rate Ratio, or Odds Ratio

Suppose we conducted a study multiple times in an identical way. Random Error Null True value Random Error and Bias Little random or systematic error Little random error, but biased

True Relationship Unbiased Sample Diseased Not Diseased Not 20,000 200,000 10,000 500,000 200 2000 100 5000 Diseased Not 175 2000 125 5000 Biased Estimates These samples are misleading about the true relationship in the population. Diseased Not 100 2100 100 5000

There are 4 mechanisms that can produce this distortion when samples are taken. D+ D- E+ E- Selection Bias When there is over- or undersampling that distorts the picture. Confounding When other factors affecting outcome are unevenly distributed between the comparison groups. Information Bias When there are errors in classification of exposure or outcome. Random Error

Selection Bias Occurs when selection, enrollment, or continued participation in a study is somehow dependent on the likelihood of having the exposure of interest or the outcome of interest. D+ D- E+ E- 0.3 1.0 2 3 Selection Bias Selection bias can cause an overestimate or underestimate of the association.

Selection bias can occur in several ways: 1. Selection of a comparison group ("controls") that is not representative of the population that produced the cases in a case-control study. (Control selection bias) 2. Differential loss to follow up in a cohort study, such that the likelihood of being lost to follow up is related to outcome status and exposure status. (Loss to follow-up bias) 3. Refusal, non-response, or agreement to participate that is related to the exposure and disease (Self-selection bias)

Selection Bias in a Case-Control Study Selection bias can occur in a case-control study if controls are more (or less) likely to be selected if they have the exposure. Do women of low SES have higher risk of cervical cancer? 200 Controls: Door-to-door survey of neighborhood around the hospital during work day. MGH 100 Hospital Cases

200 Controls: Door-to-door survey of neighborhood around the hospital during work day. Problems: 1. SE status of people living around the hospital may generally be different from that of the population that produced the cases. 2. The door-to-door method of selecting controls may tend to select people of lower SE status. Low SES Case Control High SES

Selection Bias in a Case-Control Study (More likely to select a control with low SES.) Exp. Y N Dis. Y N 75 100 25 100 Exp. Y N Dis. Y N 75 120 25 80 True OR=3.0 Control Selection Bias OR=2.0

Selection bias can occur in a case-control study when controls are more (or less) likely to be selected if they have the exposure. Selection bias is not caused by differences in other potential risk factors (confounding). It is caused by selecting controls who are more (or less) likely to have the exposure of interest.

Control Selection Bias - The Would Criterion Are the controls a representative sample of the population that produced the cases? If a control had developed cervical cancer, would she have been included in the case group? ( Would criterion) If the answer is not necessarily, then there is likely to be a problem with selection bias.

2,000,000 women > age 20 in MA, & about 200 cases of cervical cancer per year. If low SES were associated with cervical cancer with OR=3.0, MA would look like this. Entire Population Low SES (<median) High SES (>median) Cases are referred to MGH from all over, so their SES distribution is same as the state s, i.e. 3 to 1. Sample Cancer Cases Low SES 75 High SES 25 Normal 120 80 Cancer Normal Cases 150 1,000,000 50 1,000,000 But, controls selected from area around MGH may have lower SES than MA. OR = (75/25) = 2.0 (120/80) (Biased)

Are mothers of children with hemi-facial microsomia more often diabetic? Cases are referred, but what if controls are selected from the general pediatrics ward at MGH? Referred Cases The referral mechanism of controls might be very different from that of the cases with microsomia. Could mothers of controls be more or less likely to be diabetic than the cases (regardless of any association between diabetes and microsomia)? How would you select controls for this study?

Self- Selection Bias in a Case-Control Study Selection bias can be introduced into case-control studies with low response or participation rates if the likelihood of responding or participating is related to both the exposure and outcome. Example: A case-control study explored an association between family history of heart disease (exposure) and the presence of heart disease in subjects. Volunteers are recruited from an HMO. Subjects with heart disease may be more likely to participate if they have a family history of disease. Fam. Hx+ Case Control Fam. Hx-

Self-Selection Bias in a Case-Control Study Exp. Y N Diseased Y N 300 200 200 300 Exp. Y N Diseased Y N 240 120 (80%) (60%) 120 180 (60%) (60%) True OR=2.25 Self-Selection Bias OR=3.0 The best solution is to work toward high participation (>80%) in all groups.

Selection Bias in a Retrospective Cohort Study In a retrospective cohort study selection bias occurs if selection of exposed & non-exposed subjects is somehow related to the outcome. What will be the result if the investigators are more likely to select an exposed person if they have the outcome of interest?

Selection Bias in a Retrospective Cohort Study Example: Investigating occupational exposure (an organic solvent) occurring 15-20 yrs. ago in a factory. Exposed & unexposed subjects are enrolled based on employment records, but some records were lost. Suppose there was a greater likelihood of retaining records of those who were exposed & got disease.

Selection Bias in a Retrospective Cohort Study Differential referral or diagnosis of subjects Exp. Y N Diseased Diseased Y N Y N 100 900 Y 99 720 50 950 Exp. N 40 760 True RR=2.0 20% of employee health records were lost or discarded, except in solvent workers who reported illness (1% loss). RR=2.42 Workers in the exposed group were more likely to be included if they had the outcome of interest.

The Healthy Worker Effect Can be considered a form of selection bias because the general population controls have a higher probability of getting the outcome (death). General Population vs. Rubber Workers The general population is often used in occupational studies of mortality, since data is readily available, and they are mostly unexposed. Mortality Rates? The main disadvantage is bias by the healthy worker effect. The employed work force (mostly healthy) generally has lower rates of mortality and disease than the general population (with healthy & ill people).

Comparing Mortality In: Additional risk in unemployed True RR=1.3 Apparent RR=1.1 More likely to have the outcome than the employed work force that produced the cases. Exposed Work Force Unexposed Work Force Overall Population

Differential Retention (Loss to Follow Up) in Prospective Cohort Studies Enrollment into a prospective cohort study will not be biased by the outcome, because the outcome has not occurred at enrollment. However, prospective cohort studies can have selection bias if the exposure groups have differential retention of subjects with the outcomes of interest. This can cause either an over- or under- estimate of association 0.3 1.0 2 3

OC Y N Selection Bias in a Prospective Cohort Study TE Y N 20 9980 10 9990 OC Y N More events lost in one exposure group TE Y N 8 5980 8 5990 True RR=2.0 Loss to Follow Up Bias RR=1.0

Differential loss to follow up in a prospective cohort study on oral contraceptives (OC) & thromboembolism (TE). If OC were associated with TE with RR=2.0 (TRUTH), the 2x2 for all subjects would look like this: Without TE Normal Losses OC+ 20 9,980 OC- 10 9,990 There is 40% loss to follow up overall, but a greater tendency to loose OC users with TE results in a de facto selection. Final Sample TE Normal OC+ 8 5,980 OC- 8 5,990 If OC users with TE are more likely to be lost than non-oc-users with TE (Biased) RR = (8/5988) = 1.0 (8/5998)

Observation Bias (Information Bias) Biased measure of association due to incorrect categorization. Diseased Not Diseased Exposed The Correct Classification Not Exposed

Misclassification Bias Subjects are misclassified with respect to their risk factor status or their outcome, i.e., errors in classification. Non-differential Misclassification (random): If errors are about the same in both groups, it tends to minimize any true difference between the groups (bias toward the null). Errors = Errors Differential Misclassification (non-random): If information is better in one group than another, the association maybe over- or underestimated. Errors Errors

Non-Differential Misclassification Errors = When errors in exposure or outcome status occur with approximately equal frequency in groups being compared. Errors 1. Equally inaccurate memory of exposures in both groups. Example: Case-control study of heart disease and past activity: difficulty remembering your specific exercise frequency, duration, intensity over many years 2. Recording and coding errors in records and databases. Example: ICD-9 codes in hospital discharge summaries. 3. Using surrogate measures of exposure: Example: Using prescriptions for anti-hypertensive medications as an indication of treatment 4. Non-specific or broad definitions of exposure or outcome. Example: Do you smoke? to define exposure to tobacco smoke (vs. How much, how often, how long).

Non-Differential Misclassification Errors = Errors Recording or Coding Errors: Example: When patients are discharged, the MD dictates a summary which is transcribed. Diagnoses and procedures noted on the summary are encoded (ICD-9 codes) and sent to the MA Health Data Consortium. 1. MDs don t list all relevant diagnoses. 2. Coders assign incorrect codes (they aren t MDs). Errors occur in 25-30% of records.

Non-Differential Misclassification Errors = Errors Example: A case-control study comparing CAD cases & controls for history of diabetes. Only half of the diabetics are correctly recorded as such in cases and controls. True Relationship CAD Controls Diabetes 40 10 No diabetes 60 90 OR= 40x90 = 6.0 10x60 With Nondifferential Misclassification CAD Controls Diabetes 20 5 No diabetes 80 95 OR= 20x95 = 4.75 5x80

Non-Differential Misclassification Errors = Errors Effect: With a dichotomous exposure (e.g., smoking vs. non-smoking), non-differential misclassification minimizes differences & causes an underestimate of effect, i.e. bias toward the null. Null means no difference 0.3 0.5 1.0 2 3 Relative Risk

Validation When data from the Nurses Health Study was used to examine the association between obesity and heart disease, information on exposure (BMI) was obtained from self-reported weights on a questionnaire. Could they have under-reported? Self-reported weights were validated in a subsample of 184 NHS participants living in the Boston, MA area and were highly correlated with actual measured weights (r = 0.96). Cho E, Manson JE, et al.: A Prospective Study of Obesity and Risk of Coronary Heart Disease Among Diabetic Women. Diabetes Care 25:1142 1148, 2002.

Differential Misclassification When errors in classification of exposure or outcome are more frequent in one group. Errors Errors Differences in accurately remembering exposures (unequal) Example: Mothers of children with birth defects will remember the drugs they took during pregnancy better than mothers of normal children (maternal recall bias). Interviewer or recorder bias. Example: Interview has subconscious belief about the hypothesis. More accurate information in one of the groups. Example: Case-control study with cases from one facility and controls from another with differences in record keeping.

Errors Errors Recall Bias (Differential) Note: If the groups have the same % of errors based on faulty memory, that s non-differential misclassification. People with disease may remember exposures differently (more or less accurately) than those without disease. To Minimize: Use a control group that has a different disease (unrelated to the disease under study). Use questionnaires that are constructed to maximize accuracy and completeness. Ask specific questions. More accuracy means fewer differences. For socially sensitive questions, such as alcohol and drug use or sexual behaviors, use a self-administered questionnaire instead of an interviewer. If possible, assess past exposures from biomarkers or from pre-existing records.

(Differential) Errors Errors Interviewer Bias (& recorder bias in record reviews) Systematic differences in soliciting, recording, or interpreting information. Are you sure? Think harder! Minimized by: Blinding the interviewers if possible. Using standardized questionnaires consisting of closed-end, easy to understand questions with appropriate response options. Training all interviewers to adhere to the question and answer format strictly, with the same degree of questioning for both cases and controls. Obtaining data or verifying data by examining preexisting records (e.g., medical records or employment records) or assessing biomarkers. Interviewer

Effects of Bias Non-Differential Misclassification Errors Errors Errors Errors 0.3 1.0 2 3 0.3 1.0 2 3 Differential Misclassification Interviewer bias Recall Bias Bias to Null These are differential and can bias toward or away from null.

Misclassification of Outcome Can Also Introduce Bias but it usually has much less of an impact than misclassification of exposure, because: 1. Most of the problems with misclassification occur with respect to exposure status, not outcome. 2. There are a number of mechanisms by which misclassification of exposure can be introduced, but most outcomes are more definitive and there are few mechanisms that introduce errors in outcome. 3. Most outcomes are relatively uncommon. 4. Misclassification of outcome will generally bias toward the null, so if an association is demonstrated, if anything the true effect might be slightly greater.

A study is conducted to see if serum cholesterol screening reduces the rate of heart attacks. 1,500 members of an HMO are offered the opportunity to participate in the screening program, & 600 volunteer to be screened. Their rates of MI are compared to those of randomly selected members who were not invited to be screened. After 3 years of follow-up rates of MI are found to be significantly less in the screened group. Any concerns? 1. Nope 2. Differential misclassification 3. Interviewer bias 4. Recall bias 5. Selection bias

Background Information on Abdominal Aortic Aneurysms Abdominal Aortic Aneurysm (AAA)

Diagnosis of Abdominal Aortic Aneurysms Usually asymptomatic (surgery if > 5 cm.) Discovered during routine abdominal exam by palpation, or Seen on x-ray or ultrasound of abdomen (done for other reasons). Known risk factors: Age Male gender Smoking Hypertension

Costa & Robbs: Br. J. Surg. 1986 A vascular surgery (referral) service in So. Africa reviewed records of elective peripheral vascular surgery. Black White AAA Other 60 1,242 1,302 260 620 880 320 1,862 Other : a variety of readily apparent conditions. OR = 0.12 (0.09 0.15) Conclusion: AAA uncommon in Blacks and more often due to infections.

Costa & Robbs: Br. J. Surg. 1986 A vascular surgery (referral) service in So. Africa reviewed records of elective peripheral vascular surgery. Black AAA Other 60 1,242 1,302 Other : a variety of readily apparent conditions. White 260 620 880 320 1,862 Could there have been selection bias? 1. Yes 2. No

All black patients were screened for TB and for syphilis. Blacks Whites Atherosclerotic 34% 99% Inflammatory or Infectious 47% 0.5% Uncertain etiology 19% 0. 0% AAA in blacks are more often due to infectious causes. A possibility of misclassification? 1. No 2. Yes, random. 3. Yes, differential.

More Details About the Study White Black Male:Female 2:1 1:1 Mean age 49.4 67.1 Admitted for Uncontrolled HBP 0% 17% Smoking 76% 48% (Known risk factors ) Age Male gender Smoking Hypertension

Environmental tobacco smoke and tobacco related mortality in a prospective study of Californians, 1960-98. James E. Enstrom, Geoffrey C. Kabat. BMJ 2003;326:1057 118,094 adults enrolled in an ACS cancer study in 1959 were followed until 1998. For never smokers married to ever smokers compared with never smokers married to never smokers : RR in Males RR in Females Heart disease 0.94 (0.85-1.05) 1.01 (0.94-1.08) Lung cancer 0.75 (0.42-1.35) 0.99 (0.72-1.37) Chr. Pulm. Dis. 1.27 (0.78-2.08) 1.13 (0.80-1.58) Conclusions: The results do not support a causal relation between environmental tobacco smoke and tobacco related mortality, although they do not rule out a small effect.

Any potential selection bias in the ETS study? 1. I don t think so. 2. Yes, there was a potential for it.

Any potential information bias in the ETS study? 1. I don t think so. 2. Non-differential misclassification. 3. Differential misclassification. 4. Interviewer bias. 5. Recall bias.

Are Analgesic Drugs Associated with Increased Risk of Renal Failure? Case-Control study by Perneger et al. Cases found in the renal dialysis registry for all of Maryland, Virginia, West Virginia, & D.C. Controls: random digit dial from the same geographic area. Data: Non-blinded phone interview asking about lifetime analgesic use.

Case-Control Study: Analgesic Use & Renal Failure Conclusion: Acetaminophen & NSAIDS increase risk of renal failure, but not aspirin. Could any biases have influenced the conclusion? OR 95% CI Acetaminophen 0-999 1.0-1000-4999 2.0 1.3-3.2 >5000 2.4 1.2-4.8 Aspirin 0-999 1.0-1000-4999 0.5 0.4-0.7 >5000 1.0 0.6-1.8 NSAIDs 0-999 1.0-1000-4999 0.6 0.3-1.1 >5000 8.8 1.1-71.8

This was not a terrible paper. However, is it possible that there were any potentially important sources of bias? Think in a structured way. Think about how they collected the data. Consider 0% each of the following: 0% Could there have been selection bias? Was loss to follow-up a problem? Could interviewer bias have affected results? Could recall bias have affected results? Is it possible that the conclusions were influenced by non-differential misclassification?

Avoiding Bias Once it s in the study, you can t fix it. Select subjects by similar mechanism. Maintain follow-up in prospective studies. Blind interviewers. For case-control studies, get subjects with equal tendency to remember. Use clear, homogeneous definitions of disease & exposure. Get accurate data collected in a similar way. Confirm data; use error trapping during data entry.