Epidemiologic Methods for Evaluating Screening Programs. Rosa M. Crum, MD, MHS Johns Hopkins University

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this site. Copyright 2008, The Johns Hopkins University and Rosa M. Crum All rights reserved. Use of these materials permitted only in accordance with license rights granted. Materials provided AS IS ; no representations or warranties provided. User assumes all responsibility for use, and all liability related thereto, and must independently review all materials for accuracy and efficacy. May contain materials owned by others. User is responsible for obtaining permissions for use from third parties as needed.

Learning Objectives
- Discuss general principles guiding the introduction of screening programs
- Describe general principles for evaluating screening programs
- Define biases that may affect the apparent outcome of screening programs
- Explain the principles of early detection

Section A Why Is Screening So Controversial?

Why Is Screening So Controversial?
- The public expects screening to work
- Screening costs may be high; follow-up may involve risks
- The evidence for many screening modalities is mixed
- Expert panels often give differing views

Early Lung Cancer Action Project
- Enrolled 1,000 volunteers
  - Symptom free
  - Aged 60 or older
  - At least 10 pack-years of cigarette smoking
  - No previous cancer
- Purpose: to evaluate baseline and annual low-dose CT screening in people at high risk for lung cancer
Source: Henschke et al. (1999). Lancet, 354:99–105.

Frequency of Detection of NCNs on Baseline Chest Radiography and Low-dose CT
Number of participants (n = 1,000)
Result on chest   Result on low-dose CT
radiograph        1 NCN   2–3 NCN   4–6 NCN   Negative   Total
1 NCN                17         9         4         28      58
2–3 NCN               1         2         0          7      10
4–6 NCN               0         0         0          0       0
Negative            141        48        11        732     932
Total               159        59        15        767   1,000
NCN = non-calcified nodules
Source: Henschke et al. (1999). Lancet, 354:99–105.
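The marginal totals in this cross-tabulation can be re-derived from the cell counts. The short Python sketch below does only that arithmetic, using the counts shown on the slide; the variable names are illustrative and not part of the lecture.

```python
# Cross-tabulation from the slide: rows = chest radiograph result,
# columns = low-dose CT result (1 NCN, 2-3 NCN, 4-6 NCN, negative).
table = {
    "1 NCN":    [17, 9, 4, 28],
    "2-3 NCN":  [1, 2, 0, 7],
    "4-6 NCN":  [0, 0, 0, 0],
    "Negative": [141, 48, 11, 732],
}

total = sum(sum(row) for row in table.values())

# Positive on radiograph = every row except "Negative".
radiograph_positive = sum(
    sum(row) for name, row in table.items() if name != "Negative"
)

# Positive on CT = the first three columns, summed over all rows.
ct_positive = sum(sum(row[:3]) for row in table.values())

# Positive on CT but negative on radiograph.
ct_only = sum(table["Negative"][:3])

print(f"Radiograph positive: {radiograph_positive}/{total}")
print(f"CT positive: {ct_positive}/{total}")
print(f"CT positive, radiograph negative: {ct_only}")
```

On these counts, low-dose CT flagged non-calcified nodules in 233 participants against 68 on chest radiograph, and 200 of the CT-positive participants had a negative radiograph.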

Distribution of Tumors by Size and Stage
Stage   NCN 2–5 mm   NCN 6–10 mm   NCN 11–20 mm   NCN 21–45 mm   Total
IA      1            12 (1)        8 (2)          1 (1)          22 (4)
IB      0            0             0              1              1
IIA     0            1 (1)         0              0              1 (1)
IIB     0            0             0              0              0
IIIA    0            0             0              2 (2)          2 (2)
IIIB    0            1             0              0              1
Total   1 (0)        14 (2)        8 (2)          4 (3)          27 (7)
NCN = non-calcified nodules
* Size on low-dose CT (numbers in parentheses are those by chest radiography)
Source: Henschke et al. (1999). Lancet, 354:99–105.

CT Screening for Lung Cancer
Source: Swensen et al. (2005). Radiology, 235:259–265.

CT Screening for Lung Cancer: False-Positive Rates for Lung Cancer
                 Presence of prevalence cancers          Presence of incidence cancers
Nodule type      Yes   No    False-positive rate (%)     Yes   No    False-positive rate (%)
All nodules      31    749   96.0                        32    773   96.0
Nodules > 4 mm   31    404   92.9                        31    378   92.4
Source: Swensen et al. (2005). Radiology, 235:259–265.
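The percentages in this table can be reproduced by dividing the "No" count by the sum of the "Yes" and "No" counts. In other words, the slide's false-positive rate appears to be the share of nodule-positive participants who did not have cancer (one minus the positive predictive value), rather than a share of all cancer-free participants. The Python sketch below checks that reading; the function name is illustrative.

```python
def false_positive_proportion(with_cancer: int, without_cancer: int) -> float:
    """Percent of nodule-positive participants who did not have cancer."""
    return 100 * without_cancer / (with_cancer + without_cancer)


# (label, cancers among nodule-positive, non-cancers among nodule-positive)
rows = [
    ("All nodules, prevalence cancers", 31, 749),
    ("All nodules, incidence cancers", 32, 773),
    ("Nodules > 4 mm, prevalence cancers", 31, 404),
    ("Nodules > 4 mm, incidence cancers", 31, 378),
]

for label, yes, no in rows:
    print(f"{label}: {false_positive_proportion(yes, no):.1f}%")
```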

No Support for Lung Cancer Screening
"The United States Preventive Services Task Force last year concluded that current data do not support screening for lung cancer with any method"
Source: J. Brody. (August 16, 2005). What an Extra Eye on Cancer Can Do for You. The New York Times.

No Support for Lung Cancer Screening
Source: S. Woloshin, L. Schwartz and H. G. Welch. (August 22, 2005). Warned, But Worse Off. The New York Times.

No Support for Lung Cancer Screening
Conclusions: Screening for lung cancer may not meaningfully reduce the risk of advanced lung cancer or death from lung cancer.
Source: Bach et al. (2007). JAMA, 297:953–61.

When Should a Screening Program Be Introduced?
- Test characteristics adequate
  - Validity
  - Reliability
  - Predictive value
- Costs sufficiently low
- Evidence for benefit
- Mechanisms available for testing and follow-up

Screening Programs and Operational Measures
Assessing the effectiveness of screening programs using operational measures
1. Number of people screened
2. Proportion of target population screened and number of times screened
3. Detected prevalence of preclinical disease
4. Total costs of the program
5. Costs per case found
6. Costs per previously unknown case found
7. Proportion of positive screenees brought to final diagnosis and treatment
8. Predictive value of a positive test in population screened
Source: adapted from Hulka. (1998).
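The last operational measure, the predictive value of a positive test, depends on how common the disease is in the population screened as well as on the test itself. The Python sketch below illustrates this with the standard formula; the sensitivity, specificity, and prevalence values are hypothetical and chosen only for illustration, not taken from the lecture.

```python
def positive_predictive_value(
    sensitivity: float, specificity: float, prevalence: float
) -> float:
    """Probability of disease given a positive test."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical test characteristics (assumed, not from the lecture).
SENSITIVITY = 0.90
SPECIFICITY = 0.95

# The same test applied to populations with different disease prevalence.
for prevalence in (0.001, 0.01, 0.10):
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.3f}: PPV = {ppv:.3f}")
```

With the test held fixed, the predictive value of a positive result falls as prevalence falls, which is one reason the target population matters when a program is introduced.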

Screening Programs and Outcome Measures
Assessing the effectiveness of screening programs using outcome measures
1. Reduction of mortality in the population screened
2. Reduction of case-fatality in screened individuals
3. Increase in percent of cases detected at earlier stages
4. Reduction in complications
5. Prevention of/reduction in recurrences or metastases
6. Improvement of quality of life in screened individuals

Problem of Establishing Sensitivity and Specificity
Problem of establishing sensitivity and specificity because of limited follow-up of those who test negative
           Disease +   Disease −
Test +     a           b           -> Further testing
Test −     c           d
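The difficulty can be made concrete with the cell labels a, b, c, d. The Python sketch below assumes the conventional layout (a = test positive with disease, b = test positive without disease, c = test negative with disease, d = test negative without disease); the counts are hypothetical and the function names are illustrative.

```python
def sensitivity(a: int, c: int) -> float:
    """Share of people with disease who test positive: a / (a + c)."""
    return a / (a + c)


def specificity(b: int, d: int) -> float:
    """Share of people without disease who test negative: d / (b + d)."""
    return d / (b + d)


def ppv_from_counts(a: int, b: int) -> float:
    """Share of test positives who have disease: a / (a + b)."""
    return a / (a + b)


# Hypothetical counts, for illustration only (not from the lecture).
a, b, c, d = 80, 100, 20, 800

# If only test positives receive further testing, a and b are known,
# so the predictive value of a positive test can be computed.
print(f"PPV = {ppv_from_counts(a, b):.3f}")

# Sensitivity and specificity also need c and d, which are observed
# only if people who test negative are followed up as well.
print(f"Sensitivity = {sensitivity(a, c):.3f}")
print(f"Specificity = {specificity(b, d):.3f}")
```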

Problem of Establishing Sensitivity and Specificity
Problem of establishing sensitivity and specificity because of limited follow-up of those who test negative
Example: ELISA and HIV
           HIV +   HIV −
ELISA +    a       b       -> Western blot
ELISA −    c       d

Problem of Establishing Sensitivity and Specificity
Problem of establishing sensitivity and specificity because of limited follow-up of those who test negative
Example: PSA and prostate cancer
         Prostate cancer +   Prostate cancer −
PSA +    a                   b                   -> TRU
PSA −    c                   d

Section B Natural History of Disease

Natural History of Disease
[Series of figure slides; diagrams not reproduced]

Critical Point
- A critical point is a point in the natural history of the disease before which therapy may be less difficult and/or more effective, or a lesser therapy may be needed
- If the disease is potentially curable, cure may be possible before this point but not after
- Or, there may not be a critical point

Multiple Critical Points in the Natural History of Disease

Possible Outcomes of a Screening Program

Trading Off: Benefits versus Risks
- Benefits need to be assessed in the context of the potential risks of screening
- Risks include:
  - False-negatives
    - Potential harm of delayed diagnosis
  - False-positives
    - Risks and costs of work-up
    - Potential harm of false-positive diagnosis

Section C Problems that Complicate Assessing Survival Improvement as a Result of Early Detection

Problems that Complicate Assessing Survival Improvement
1. Selection bias
   a. Referral (volunteer) bias

BCDDP Screening
Age-Specific Incidence Rates and Relative Risks for White Females in Years 2 and 3 Following the Initial BCDDP Screening
Age in years   BCDDP rate   SEER rate   Relative risk
20–24               2.7         1.3         2.10
25–29              16.8         8.0         2.10
30–34              60.3        28.8         2.10
35–39             114.6        54.7         2.10
40–44             203.7       109.2         1.87
45–49             280.8       173.3         1.62
50–54             320.9       198.8         1.61
55–59             293.8       221.5         1.33
60–64             369.4       278.3         1.33
65–69             356.1       315.3         1.13
70–74             307.8       331.3         0.93
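The relative risk column is the BCDDP rate divided by the SEER rate for the same age group. The Python sketch below recomputes it from the rates on the slide; the function and variable names are illustrative.

```python
# (age group, BCDDP rate, SEER rate) as listed on the slide.
rates = [
    ("20-24", 2.7, 1.3),
    ("25-29", 16.8, 8.0),
    ("30-34", 60.3, 28.8),
    ("35-39", 114.6, 54.7),
    ("40-44", 203.7, 109.2),
    ("45-49", 280.8, 173.3),
    ("50-54", 320.9, 198.8),
    ("55-59", 293.8, 221.5),
    ("60-64", 369.4, 278.3),
    ("65-69", 356.1, 315.3),
    ("70-74", 307.8, 331.3),
]


def relative_risk(rate_in_volunteers: float, rate_in_reference: float) -> float:
    """Ratio of the rate among screening volunteers to the reference rate."""
    return rate_in_volunteers / rate_in_reference


relative_risks = {age: relative_risk(bcddp, seer) for age, bcddp, seer in rates}

for age, rr in relative_risks.items():
    print(f"{age}: {rr:.2f}")
```

The recomputed ratios for the four youngest groups fall between about 2.08 and 2.10, close to the single value of 2.10 shown for them on the slide. A relative risk above 1 in most age groups is consistent with the referral (volunteer) bias named on the preceding slide: women who came forward for screening were not at the same baseline risk as the general population.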

Problems that Complicate Assessing Survival Improvement
1. Selection bias
   a. Referral (volunteer) bias
   b. Length-biased sampling

Natural History of Disease
[Series of figure slides; diagrams not reproduced]
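Length-biased sampling can be illustrated with a small calculation. The Python sketch below assumes a steady state in which the number of people in the detectable preclinical phase at any moment is proportional to incidence times preclinical duration. The two disease types, their incidence, and their durations are hypothetical values chosen for illustration.

```python
def screen_detected_share(disease_types: dict) -> dict:
    """Share of cases found at a single screen, by disease type.

    disease_types maps a name to (incidence, preclinical_duration).
    Under the steady-state assumption, the pool of detectable
    preclinical cases is proportional to incidence * duration.
    """
    weights = {
        name: incidence * duration
        for name, (incidence, duration) in disease_types.items()
    }
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical: equal incidence, but the slowly progressing type spends
# 4 years in the detectable preclinical phase and the fast type 1 year.
types = {"slow": (50, 4.0), "fast": (50, 1.0)}

shares = screen_detected_share(types)
print(shares)
```

Although both types arise equally often in this example, four of every five screen-detected cases are the slowly progressing type. Cases found by screening can therefore show better survival than cases found clinically even if early detection changes nothing.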

Lead Time
Lead time is the interval by which the time of diagnosis can be advanced by the screening procedure, as compared with the normal methods for detection and diagnosis

Problems that Complicate Assessing Survival Improvement
1. Selection bias
   a. Referral (volunteer) bias
   b. Length-biased sampling
2. Lead-time bias

Five-Year Survival and Lead-Time Bias
[Series of figure slides; diagrams not reproduced]

Five-Year Survival: Diagnosis Made Without Screening

Shift of Five-Year Period by Screening and Early Detection

Lead-Time Bias
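Lead-time bias can be shown with one hypothetical patient whose age at death is the same with or without screening. The ages in the Python sketch below are assumptions made up for illustration; only the definition of lead time comes from the lecture.

```python
def years_survived_after_diagnosis(age_at_diagnosis: float, age_at_death: float) -> float:
    """Survival time measured from diagnosis."""
    return age_at_death - age_at_diagnosis


# Hypothetical patient (assumed ages, not from the lecture).
AGE_AT_DEATH = 66             # unchanged by screening in this example
AGE_AT_USUAL_DIAGNOSIS = 62   # diagnosed once symptoms appear
AGE_AT_SCREEN_DIAGNOSIS = 59  # diagnosed earlier by screening

# Lead time: how far screening advances the time of diagnosis.
lead_time = AGE_AT_USUAL_DIAGNOSIS - AGE_AT_SCREEN_DIAGNOSIS

without_screening = years_survived_after_diagnosis(AGE_AT_USUAL_DIAGNOSIS, AGE_AT_DEATH)
with_screening = years_survived_after_diagnosis(AGE_AT_SCREEN_DIAGNOSIS, AGE_AT_DEATH)

print(f"Lead time: {lead_time} years")
print(f"Survival without screening: {without_screening} years "
      f"(five-year survivor: {without_screening >= 5})")
print(f"Survival with screening: {with_screening} years "
      f"(five-year survivor: {with_screening >= 5})")
```

The patient counts as a five-year survivor only in the screened scenario, yet dies at the same age in both. The apparent gain in survival equals the lead time.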

Problems that Complicate Assessing Survival Improvement
1. Selection bias
   a. Referral (volunteer) bias
   b. Length-biased sampling
2. Lead-time bias
3. Over-diagnosis bias

Relationship of Improved Outcome to Early Detection
Two assumptions underlying a relationship of improved outcome to early detection of disease
1. All or most clinical cases of a disease first go through a detectable preclinical phase
2. In the absence of intervention, all or most cases in a preclinical phase progress to a clinical phase

Natural History of Cervical Cancer
[Figure slides; diagrams not reproduced]

Section D Study Designs to Evaluate Screening

How Do We Evaluate Screening?
- Randomized clinical trials
  - Randomization to new screening modality or to usual care
- Observational approaches
  - Cohort study design
  - Case-control studies
  - Stage shift
  - Trend in mortality

Design of a Non-Randomized Comparison (Cohort) Study: The Benefits of Screening
- Screened -> die from disease / don't die from disease
- Not screened -> die from disease / don't die from disease

Design of a Randomized Trial: The Benefits of Screening
Eligible population -> randomized
- Screened -> die from disease / don't die from disease
- Not screened -> die from disease / don't die from disease

HIP Study of Screening
- First breast cancer screening trial
- Women aged 40–64 enrolled in Health Insurance Plan (HIP) of Greater New York
- Started 1963
- Followed participants for 18 years
Sam Shapiro, professor emeritus, Health Policy and Management

HIP RCT on Periodic Screening for Breast Cancer: The Benefits of Screening
62,000 women aged 40–64 years and enrolled in HIP -> randomized
- 31,000 screened -> 39 die from breast cancer / 30,961 don't die from breast cancer
- 31,000 not screened -> 63 die from breast cancer / 30,937 don't die from breast cancer

HIP RCT on Periodic Screening for Breast Cancer
Health Insurance Plan Data: Group Sizes (Rounded), Deaths in Five Years of Follow-Up, and Death Rates per 1,000 Women Randomized
                          Breast cancer     All other
Study       Group size    Number   Rate     Number   Rate
Screened    20,200        23       1.1      428      21
Refused     10,800        16       1.5      409      38
Total       31,000        39       1.3      837      27
Control     31,000        63       2.0      879      28
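The rates in this table follow from the death counts and the rounded group sizes. The Python sketch below recomputes them per 1,000 women. It reads the Total row as the whole arm offered screening (screened plus refused), which is an interpretation of the table rather than something the slide states, and the names are illustrative.

```python
# group: (group size, breast cancer deaths, deaths from all other causes)
groups = {
    "Screened": (20_200, 23, 428),
    "Refused": (10_800, 16, 409),
    "Total": (31_000, 39, 837),   # read here as screened + refused
    "Control": (31_000, 63, 879),
}


def rate_per_1000(deaths: int, group_size: int) -> float:
    """Deaths per 1,000 women in the group."""
    return 1000 * deaths / group_size


for name, (size, breast_cancer, other) in groups.items():
    print(
        f"{name}: breast cancer {rate_per_1000(breast_cancer, size):.1f}, "
        f"all other {rate_per_1000(other, size):.0f} per 1,000"
    )
```

Women who refused screening had a much higher death rate from other causes than women who accepted (about 38 versus 21 per 1,000), a sign that those who accept screening differ from those who refuse. Comparing the whole offered arm with the control arm (27 versus 28 per 1,000 for other causes) shows the groups as randomized were similar.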

HIP RCT on Periodic Screening for Breast Cancer
                             Die from breast cancer
Screened for breast cancer   +        −
+                            39       30,961
−                            63       30,937
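From this two-by-two table, the usual effect measures for a randomized trial can be computed directly. The Python sketch below uses the counts on the slide; the variable names are illustrative, and the derived quantities are simple arithmetic rather than figures quoted in the lecture.

```python
# Counts from the slide's two-by-two table.
screened_deaths, screened_n = 39, 31_000
control_deaths, control_n = 63, 31_000

risk_screened = screened_deaths / screened_n
risk_control = control_deaths / control_n

# Relative risk of breast cancer death, screened arm vs. control arm.
relative_risk = risk_screened / risk_control

# Absolute difference in breast cancer deaths per 1,000 women.
risk_difference_per_1000 = 1000 * (risk_control - risk_screened)

print(f"Relative risk: {relative_risk:.2f}")
print(f"Relative reduction: {100 * (1 - relative_risk):.0f}%")
print(f"Risk difference: {risk_difference_per_1000:.2f} per 1,000 women")
```

The relative reduction is roughly 38 percent, while the absolute difference is under one breast cancer death per 1,000 women over the follow-up shown. Both views of the same result are relevant when weighing benefits against risks.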

Deaths Due to Breast Cancer: Five Years of Follow-Up after Entry to Study

Five-Year Case Fatality Rates among Breast Cancer Patients

Mortality Rates from All Causes Except Breast Cancer per 10,000 Person-Years, HIP

Controversies in Breast Cancer Screening
- Does it work?
  - The HIP and other screening trials
- Whom to screen?
  - Women under 50 years of age?
  - The NIH Consensus Conference and beyond
- Does it work all over again?
  - The Nordic systematic review
- So, what do we know?

Outrage

Mammography: RCTs for Women Under 50 as of 1996 (Science, 1997)
Study        Effect   95% C.I.
HIP          0.77     0.53–1.11
Malmö        0.66     0.32–1.27
Two County   0.91     0.59–1.39
Edinburgh    0.73     0.43–1.25
Canada       1.10     0.78–1.54
Stockholm    1.05     0.53–2.09
Gothenburg   0.62     0.36–1.08
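One way to read this table is to check which confidence intervals include the null value. The Python sketch below assumes the "Effect" column is a ratio measure (such as a relative risk of breast cancer death), so that 1.0 means no effect; the slide does not state the measure, so this is an assumption. The estimates and intervals are those listed on the slide.

```python
# (study, effect estimate, lower 95% limit, upper 95% limit)
trials = [
    ("HIP", 0.77, 0.53, 1.11),
    ("Malmo", 0.66, 0.32, 1.27),
    ("Two County", 0.91, 0.59, 1.39),
    ("Edinburgh", 0.73, 0.43, 1.25),
    ("Canada", 1.10, 0.78, 1.54),
    ("Stockholm", 1.05, 0.53, 2.09),
    ("Gothenburg", 0.62, 0.36, 1.08),
]

NULL_VALUE = 1.0  # assumed: "Effect" is a ratio, so 1.0 means no effect

# Trials whose interval contains the null value.
includes_null = [
    name for name, _, lower, upper in trials if lower <= NULL_VALUE <= upper
]

# Trials whose point estimate favors screening (ratio below 1).
favors_screening = [name for name, effect, _, _ in trials if effect < NULL_VALUE]

print(f"Intervals including {NULL_VALUE}: {len(includes_null)} of {len(trials)}")
print(f"Point estimates below {NULL_VALUE}: {len(favors_screening)} of {len(trials)}")
```

Under that assumption, five of the seven point estimates are below 1, but every interval includes 1, which helps explain why screening women under 50 remained contested.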

2000 Nordic Study: Meta-Analysis of Cochrane Library
- Results: screening for breast cancer with mammography is unjustifiable
- Data show that for every 1,000 women screened biennially throughout 12 years:
  - One breast cancer death is avoided
  - Total number of deaths increases by six
Source: P. Gøtzsche and O. Olsen. (2000). Lancet, 355:129–134.

Dutch Study: Mammography and Breast Cancer Mortality
- 2003: assess effect of mammography screening program on breast cancer mortality
- Results: routine mammography screening can reduce breast cancer mortality rates in women aged 55–74 years
Source: Otto et al. (2003). Lancet, 361:1411–1417.

Dutch Study: Mortality Rates

2003 Study: Two Swedish Counties
- 2003: assess long-term effect of screening on death from breast cancer in two Swedish counties
- Results: mammography screening is contributing to substantial reductions in breast cancer mortality in these two counties
Source: Tabar et al. (2003). Lancet, 361:1405–1410.

Swedish Study: Cumulative Mortality
Cumulative 20-Year Mortality from Incident Tumors Diagnosed in Women Aged 40–69 Years

Size of Tumors at Diagnosis, 1987–2001
Mean and Median Size of Invasive Breast Cancer Tumors from 1987–2001 for Age Groups in Rhode Island

SEER Breast Cancer Incidence, 1973–2002

SEER Breast Cancer Mortality, 1969–2002

Case-Control Study: Screening Sigmoidoscopy
Screening sigmoidoscopy and mortality from colorectal cancer
- Purpose: estimate the efficacy of screening sigmoidoscopy
- Cases: 261 Kaiser Permanente members who died of colorectal cancer from 1971–88
- Controls: 868 subjects matched for age and sex
- Compared use of screening during the 10 years before diagnosis for cases with use of screening in controls
Source: (March 5, 1992). N Engl J Med, 326(10):653–7.

Design of a Case-Control Study of the Benefits of Screening
- Cases (advanced disease): screened in the past / never screened
- "Controls": screened in the past / never screened

Case-Control Study for Estimating Efficacy of Screening Sigmoidoscopy
Cases: 261 dead from colorectal cancer, 1971–1988
- 8.8% had undergone screening
- 91.2% had not undergone screening
"Controls": 868 matched controls
- 24.2% had undergone screening
- 75.8% had not undergone screening

Case-Control Study for Estimating Efficacy of Screening Sigmoidoscopy
Underwent screening   Cases (%)   Controls (%)
+                     8.8         24.2
−                     91.2        75.8
OR = (8.8)(75.8) / [(91.2)(24.2)] = 0.30
Adjusted for confounders: OR = 0.41
Source: (March 5, 1992). N Engl J Med, 326(10):653–7.
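The crude odds ratio can be checked with the cross-product formula. The Python sketch below uses the percentages from the slide, taking 75.8% (that is, 100 minus 24.2) as the share of controls not screened, which is the value used in the slide's own OR calculation. The function name is illustrative, and the adjusted odds ratio of 0.41 cannot be reproduced from these four numbers alone.

```python
def odds_ratio(
    cases_exposed: float,
    cases_unexposed: float,
    controls_exposed: float,
    controls_unexposed: float,
) -> float:
    """Cross-product odds ratio for a two-by-two case-control table."""
    return (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)


# Percent screened and not screened among cases and controls.
crude_or = odds_ratio(
    cases_exposed=8.8,
    cases_unexposed=91.2,
    controls_exposed=24.2,
    controls_unexposed=75.8,
)

print(f"Crude OR = {crude_or:.2f}")
```

Because the odds ratio is a ratio of cross-products, using percentages within cases and within controls gives the same result as using the underlying counts.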

Ethical Issues in Screening for Disease