Biostatistics Lecture
April 28, 2001
Nate Ritchey, Ph.D.
Chair, Department of Mathematics and Statistics
Youngstown State University

1. Some Questions
a. If I flip a fair coin, what is the probability of getting heads?
b. If you test positive for HIV, what is the probability of having HIV?
c. If you have a positive mammogram, what is the probability that you have breast cancer?
d. Given that you have breast cancer, what is the probability that you will live?
e. Should the United States require widespread screening examinations for diseases like HIV, Breast Cancer, TB, Hepatitis, etc.?
f. What is the cost of screening? How effective is it?

2. What is conditional probability?

Definition:

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \quad P(B) > 0$$
Why is this the definition?

From the book Medical Statistics, page 30, and the 2x2 table from the data of Weiner et al. (1979):

Exercise Tolerance and Coronary Artery Disease

                 Present (D+)   Absent (D-)   Total
Positive (T+)         815           115         930
Negative (T-)         208           327         535
Total                1023           442        1465

Prevalence in these patients: $P(D^+) = 1023/1465 = .70$

The probability of having the disease given that a person has a positive test is

$$P(D^+ \mid T^+) = \frac{P(D^+ \cap T^+)}{P(T^+)} = \frac{815/1465}{930/1465} = \frac{815}{930} = .88$$

This is sometimes referred to as the Predictive Value of a Positive Test.

Homework: You calculate the Predictive Value of a Negative Test.

Things to notice:

$$P(T^+) = P(T^+ \cap D^+) + P(T^+ \cap D^-) = \frac{815}{1465} + \frac{115}{1465} = \frac{930}{1465} \approx .63$$
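Likewise,

$$P(D^+) = P(D^+ \cap T^+) + P(D^+ \cap T^-) = \frac{815}{1465} + \frac{208}{1465} = \frac{1023}{1465} \approx .70$$

You calculate the following: $P(D^-)$ and $P(T^-)$.

Also calculate $P(D^+ \cap T^+)$ and $P(D^+)\,P(T^+)$.

Sensitivity and Specificity

Sensitivity:

$$P(T^+ \mid D^+) = \frac{P(T^+ \cap D^+)}{P(D^+)} = \frac{815/1465}{1023/1465} = \frac{815}{1023} \approx .80$$

Specificity:

$$P(T^- \mid D^-) = \frac{P(T^- \cap D^-)}{P(D^-)} = \frac{327/1465}{442/1465} = \frac{327}{442} \approx .74$$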
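The quantities computed so far can all be read directly off the 2x2 table. A minimal sketch in Python, using the counts from the Weiner et al. table; the variable names are illustrative and not part of the lecture:

```python
# Counts from the 2x2 table (Weiner et al., 1979).
# Variable names are illustrative, not from the lecture.
tp, fp = 815, 115   # positive test: disease present, disease absent
fn, tn = 208, 327   # negative test: disease present, disease absent
total = tp + fp + fn + tn

prevalence = (tp + fn) / total    # P(D+)
p_test_pos = (tp + fp) / total    # P(T+)
ppv = tp / (tp + fp)              # P(D+ | T+), predictive value of a positive test
sensitivity = tp / (tp + fn)      # P(T+ | D+)
specificity = tn / (tn + fp)      # P(T- | D-)

print(f"total       = {total}")
print(f"prevalence  = {prevalence:.2f}")
print(f"P(T+)       = {p_test_pos:.2f}")
print(f"PV+         = {ppv:.2f}")
print(f"sensitivity = {sensitivity:.2f}")
print(f"specificity = {specificity:.2f}")
```

The common factor 1/1465 cancels in every conditional probability, which is why each one reduces to a ratio of two cell or margin counts.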
Notice that sensitivity is not affected by disease prevalence. For example, if the number of people with true coronary artery disease is tripled from 1023 to 3069, the new prevalence is 3069/(442 + 3069) = .87, and we should expect three times as many diseased patients to have a positive test. Thus 3 x 815 = 2445 should have a positive result. The sensitivity would be 2445/3069 = .80.

Two very useful terms

false negative rate = 1 - sensitivity

Note:

$$P(T^- \mid D^+) = \frac{P(T^- \cap D^+)}{P(D^+)} = \frac{208/1465}{1023/1465} = \frac{208}{1023} \approx .20$$

false positive rate = 1 - specificity

$$P(T^+ \mid D^-) = \frac{P(T^+ \cap D^-)}{P(D^-)} = \frac{115/1465}{442/1465} = \frac{115}{442} \approx .26$$

For our problem, since the sensitivity is .8, the false negative rate is 1 - .8 = .2. Let's interpret this: 20% of the people who actually have the disease will get a test result saying that they do not.
Likewise, since specificity is .74, the false positive rate is 1 - .74 = .26. You interpret!

What about widespread or mandatory testing? What would the cost of this be?

How can we get the predictive value of a positive test from sensitivity? After all, at least for the patient, we want to know the probability that we have a disease given that we have just tested positive!

In general,

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} \quad \text{and} \quad P(B \mid A) = \frac{P(A \cap B)}{P(A)}$$

Solving the second equation for $P(A \cap B)$,

$$P(A \cap B) = P(B \mid A)\,P(A)$$

Substituting this into the first equation, we have Bayes' Theorem:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
Applying this to our formula, we have

$$P(D^+ \mid T^+) = \frac{P(T^+ \mid D^+)\,P(D^+)}{P(T^+)} = \frac{(.8)(1023/1465)}{930/1465} = .88$$

In words, the predictive value of a positive test is equal to (sensitivity) x (prevalence) divided by the proportion who test positive.

Let's illustrate Bayes' Theorem. Suppose that 84% of hypertensives and 23% of normotensives are classified as hypertensive by an automated blood-pressure machine. What is the predictive value of a positive test? What is the predictive value of a negative test? The calculation takes the prevalence of hypertension to be .2.

Sensitivity = .84
Specificity = 1 - .23 = .77
Pretty Good Test!

$$PV^+ = P(D^+ \mid T^+) = \frac{P(T^+ \mid D^+)\,P(D^+)}{P(T^+ \mid D^+)\,P(D^+) + P(T^+ \mid D^-)\,P(D^-)} = \frac{(.84)(.2)}{(.84)(.2) + (.23)(.8)} = .48$$

$$PV^- = .95$$
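The same calculation can be wrapped in a small function. This is a sketch only: the function name `predictive_values` is made up for illustration, and the prevalence of .2 is the value that appears in the calculation above.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PV+, PV-) from Bayes' theorem.

    Illustrative helper, not from the lecture.
    """
    # P(T+) = P(T+|D+)P(D+) + P(T+|D-)P(D-)
    p_pos = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
    # P(T-) = P(T-|D+)P(D+) + P(T-|D-)P(D-)
    p_neg = (1 - sensitivity) * prevalence + specificity * (1 - prevalence)
    pv_pos = sensitivity * prevalence / p_pos        # P(D+ | T+)
    pv_neg = specificity * (1 - prevalence) / p_neg  # P(D- | T-)
    return pv_pos, pv_neg


# Blood-pressure machine example: sensitivity .84, specificity .77, prevalence .2
pv_pos, pv_neg = predictive_values(0.84, 0.77, 0.20)
print(f"PV+ = {pv_pos:.2f}, PV- = {pv_neg:.2f}")  # PV+ = 0.48, PV- = 0.95
```

Even with what looks like a pretty good test, fewer than half of the positive results belong to people who are actually hypertensive, because the normotensive group is four times larger.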
AIDS Testing

Should we require widespread testing for AIDS?

Using data from the 2000 World Almanac, there are approximately 125,000 men who have AIDS in the United States. The population of men in the United States is approximately 125,000,000. (It is closer to 133,000,000, but we will use this number since it makes for easier calculation.) Thus, the prevalence of AIDS among men is

$$P(D^+) = \frac{125{,}000}{125{,}000{,}000} = 0.001$$

Suppose we could develop a test that detects AIDS with a sensitivity of 100% (if you have it, then the test will tell you that you have it!) and a specificity of 95%. What is the probability that a person has the disease, given a positive test result?

$$P(D^+ \mid T^+) = \frac{(\text{sensitivity})(\text{prevalence})}{P(T^+)} = \frac{(1)(.001)}{(1)(.001) + (.05)(.999)} = 0.02$$

Let me explain the bottom: if you took 1000 people who have been tested, 1 person would have the disease and the test would indicate that the person has the disease. By the specificity, 5% of the other 999 would have an incorrect positive test.
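The "bottom" of that fraction can also be read as expected head counts. A minimal sketch, assuming a hypothetical cohort of 1,000,000 tested men (a size that is not in the lecture, chosen only so the counts are easy to read):

```python
# Hypothetical cohort size, not from the lecture.
population = 1_000_000
prevalence = 0.001
sensitivity = 1.00
specificity = 0.95

diseased = population * prevalence        # men who have the disease
healthy = population - diseased           # men who do not
true_pos = diseased * sensitivity         # diseased men who test positive
false_pos = healthy * (1 - specificity)   # healthy men who test positive anyway

# Share of all positive results that belong to men who really have the disease
ppv = true_pos / (true_pos + false_pos)

print(f"true positives  = {true_pos:,.0f}")
print(f"false positives = {false_pos:,.0f}")
print(f"P(D+ | T+)      = {ppv:.4f}")
```

Roughly fifty positive results are false for every one that is true, which is why the predictive value is only about 2% despite the perfect sensitivity.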
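If a person tests positive, then they usually get a second test.

Probability tree for each test:
D+ (P = .001): T+ with probability 1, T- with probability 0
D- (P = .999): T+ with probability .05, T- with probability .95

Treating the two test results as independent given disease status, the probability of disease after two positive tests is

$$P(D^+ \mid T_1^+ \cap T_2^+) = \frac{(.001)(1)(1)}{(.001)(1)(1) + (.05)(.05)(.999)} = 0.2859$$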
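One way to see where 0.2859 comes from is to apply Bayes' theorem twice, using the probability after the first positive test as the prevalence for the second. A minimal sketch, assuming the two tests are independent given disease status (as the product (.05)(.05) above implies); the function name `update` is illustrative:

```python
def update(prior, sensitivity, specificity):
    """Probability of disease after one positive test, given the prior.

    Illustrative helper, not from the lecture.
    """
    true_pos = sensitivity * prior
    false_pos = (1 - specificity) * (1 - prior)
    return true_pos / (true_pos + false_pos)


after_first = update(0.001, 1.0, 0.95)
after_second = update(after_first, 1.0, 0.95)

print(f"after one positive test:  {after_first:.4f}")   # 0.0196
print(f"after two positive tests: {after_second:.4f}")  # 0.2859
```

Updating step by step gives the same number as the single formula, and it shows that even two positive results leave the probability of disease below 30% at this prevalence.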
Cost of Widespread Testing

Let's say we, the State of Ohio, decide that we would like to identify all AIDS patients, so we undertake widespread testing. The population of Ohio is about 11,000,000. If we tested each male in the state at a very conservative cost of $20 per person, the cost would be $220,000,000/2 = $110,000,000 for the first test. Subsequent testing would follow.

How many men with the disease would we expect? 5,500,000 x .001 = 5,500
How many men without the disease would we expect? 5,500,000 - 5,500 = 5,494,500
How many false positives? 1 - specificity = 1 - .95 = .05, so NFP = .05(5,494,500) = 274,725
How many false negatives? 0

Mass Hysteria!!
Cost of finding each AIDS case

C = Total Cost / Number of Actual Cases

Total Cost = $20(5,500,000) + $20(Second Tests)
           = $110,000,000 + $20(280,225)
           = $115,604,500

Here the 280,225 second tests are the 5,500 true positives plus the 274,725 false positives.

Cost/Case = $115,604,500/5,500 = $21,019

We have not tested the females either.

Group Activity: If the number of females in Ohio is 5,500,000 and the prevalence of AIDS among females is $P(D^+) = .0002$, calculate the cost per patient of testing all females in the state. Combine the results to get an estimate of total cost.
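The cost arithmetic for the male population can be sketched as a short function. This is an illustration under the lecture's assumptions (every positive gets exactly one second test, and only those two rounds are counted); the function name and parameter names are hypothetical.

```python
def screening_cost(n_people, prevalence, sensitivity, specificity, cost_per_test):
    """Return (total cost, cost per actual case) for two rounds of testing.

    Illustrative helper, not from the lecture. Round 1 tests everyone;
    round 2 retests everyone who was positive in round 1.
    """
    cases = n_people * prevalence
    true_pos = cases * sensitivity
    false_pos = (n_people - cases) * (1 - specificity)
    second_tests = true_pos + false_pos
    total_cost = cost_per_test * (n_people + second_tests)
    return total_cost, total_cost / cases


# Ohio males: 5,500,000 people, prevalence .001, sensitivity 1, specificity .95, $20 per test
total_cost, cost_per_case = screening_cost(5_500_000, 0.001, 1.0, 0.95, 20)
print(f"total cost    = ${total_cost:,.0f}")     # total cost    = $115,604,500
print(f"cost per case = ${cost_per_case:,.0f}")  # cost per case = $21,019
```

Calling the same function with the figures from the group activity gives the corresponding estimate for females.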
Breast Cancer

Some Sensitivities and Specificities

Self Exam

Tumor Vol (cubic cm)   Sensitivity   Specificity
0                      0             75%
.5                     26%
1                      50%
33.5                   80%
33.5                   80%

Note: a 33.5 cubic cm tumor is approximately 4 cm in diameter.

Professional Exam

Tumor Vol (cubic cm)   Sensitivity   Specificity
0                      0             80%
.5                     41%
1                      65%
33.5                   90%
33.5                   90%
Mammography

Tumor Vol (cubic cm)   Sensitivity (pre)   Sensitivity (post)   Specificity
0                      0                   0                    93%
.06                    50%                 68%
>1                     71%                 89%

Needle Biopsy (cancer specific): Sensitivity = 95%, Specificity = 95%
Open Biopsy (cancer specific): Sensitivity = 100%, Specificity = 100%

Breast Cancer Study Simulations

A virtual cohort of 10,000 women was followed from birth to death.

1. Assumptions of the Model
a. Each woman was 100% compliant with ACS Guidelines
b. Positive exams lead to the next higher level of exam
c. Open Biopsy is completely accurate
2. Results (Average Number Per Patient) (10,000 Patients)

           Self Ex   Pro. Ex.   Mam    Need.B.   Open B.
True +        .1        .1       .2      .2        .2
False +    177.3      40.4      6.1      .3        0
True -     500       169.5     64.7     5.8        .3
False -     43.8       3.5      3.2      0         0
Total      721.2     213.5     74.2     6.3        .5

Let's take a closer look at this. This is amazing!

3. Study Results

Stage       Number   Percent
Stage 0      222     16.6%
Stage 1      957     71.4%
Stage 2A     150     11.2%
Stage 2B       9      0%
Stage 3A       0      0%
Stage 3B       0      0%
Stage 4        0      0%

Median Age at Diagnosis: 65 yr
Primary Tumor Diameter: .58 cm
Life Expectancy: 80.51 yr
BCA Survival Node Negative: 98.4%
BCA Survival Node Positive: 82.8%
4. Look at the results under a different assumption of compliance. That is, assume that only 10% of the population is fully compliant, 10% is 75% compliant, 20% is 20% compliant, and 60% is non-compliant. What happens to the results?

(Average Number Per Patient) (10,000 Patients)

           Self Ex   Pro. Ex.   Mam    Need.B.   Open B.
True +        .1        .1       .1      .1        .1
False +     47.9      12.5      1.6      .1        0
True -     147.1      47.1     20.8     1.6        .1
False -     12.5       1.0       .9      0         0
Total      207.5      60.7     23.5     1.8        .2

This looks a lot different!

5. How about the results?

Stage       Number   Percent
Stage 0      102      8.3%
Stage 1      550     44.9%
Stage 2A     315     25.7%
Stage 2B     136     11.1%
Stage 3A      48      3.9%
Stage 3B      37      3.0%
Stage 4       36      2.9%
Median Age at Diagnosis: 66 yr
Primary Tumor Diameter: 1.23 cm
Life Expectancy: 79.55 yr
BCA Survival Node Negative: 95.99%
BCA Survival Node Positive: 55.25%

6. Recent Court Trial in Sharon, PA
a. Jury Award of 12.8 Million Dollars
b. Twice missed tumor by mammogram
c. Effect on Patient
d. Effect on Community
e. Effect on Doctor

7. Conclusions