BMI 541/699 Lecture 16


Allen Hopkins
6 years ago

Where we are:

1. Introduction and Experimental Design
2. Exploratory Data Analysis
3. Probability
4. T-based methods for continuous variables
5. Proportions & contingency tables
   - hypothesis test of a proportion
   - estimate and confidence interval for a proportion
   - $\chi^2$ (chi-square) goodness of fit test
   - contingency tables (testing two proportions)
   - McNemar's test for paired binary data
   - odds ratios and relative risk
   - sensitivity and specificity
   - positive predictive value and negative predictive value
   - ROC curves

Relative Risk & Odds Ratios

Example: An experiment is conducted to compare the proportion of low birth weight babies in mothers who smoke to those who don't.

- $n_1 = 105$ babies from smoking mothers are weighed at birth
- $n_2 = 87$ babies from non-smoking mothers are weighed at birth
- $X_1$ = number of low birth weight babies from mothers who smoke, $X_1 \sim \text{Binomial}(n_1, p_1)$
- $X_2$ = number of low birth weight babies from mothers who don't smoke, $X_2 \sim \text{Binomial}(n_2, p_2)$

We observe $x_1 = 21$ and $x_2 = 9$.

                             Mother
Outcome                smokes              doesn't smoke
low birth weight       x_1 = 21            x_2 = 9
normal birth weight    n_1 - x_1 = 84      n_2 - x_2 = 78
Total                  n_1 = 105           n_2 = 87

                             Mother
Outcome                smokes    doesn't smoke
low birth weight         21            9
normal birth weight      84           78
Total                   105           87

Note that here the column sums are fixed by the experimental design but the row totals are not.

$\hat{p}_1 = 21/105 = 0.2000$

$\hat{p}_2 = 9/87 = 0.1034$

We could estimate the difference between $p_1$ and $p_2$, but sometimes we are interested in estimating the ratio of proportions rather than the difference.

The Relative Risk (RR) is a ratio of two probabilities, both of the same event, but under different conditions.

The relative risk of low birth weight for mothers who smoke relative to those who don't is

$RR = \dfrac{\Pr(\text{low birth weight} \mid \text{mother smoked})}{\Pr(\text{low birth weight} \mid \text{mother didn't smoke})}$

$\widehat{RR} = \hat{p}_1/\hat{p}_2 = 0.2000/0.1034 = 1.93$

So the estimated risk of a low birth weight baby is 1.93 times as high for smoking mothers as for non-smoking mothers.
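The relative risk estimate for the birth weight example can be checked with a few lines of Python. This is only a sketch; the variable names are illustrative and not part of the lecture.

```python
# Counts from the birth weight example.
x1, n1 = 21, 105   # low birth weight babies / all babies, mothers who smoke
x2, n2 = 9, 87     # low birth weight babies / all babies, mothers who don't smoke

p1_hat = x1 / n1            # estimated risk for smokers
p2_hat = x2 / n2            # estimated risk for non-smokers
rr_hat = p1_hat / p2_hat    # estimated relative risk

print(f"p1_hat = {p1_hat:.4f}, p2_hat = {p2_hat:.4f}, RR_hat = {rr_hat:.2f}")
```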

The Odds Ratio

The odds ratio is another way to compare probabilities.

If $E_1$ is the event that a low birth weight baby is born to a mother who smokes and $\Pr(E_1) = p_1$, then

$\text{odds of } E_1 = \dfrac{\Pr(E_1)}{\Pr(\text{not } E_1)} = \dfrac{\Pr(E_1)}{1 - \Pr(E_1)} = \dfrac{p_1}{1 - p_1}$

If $E_2$ is the event that a low birth weight baby is born to a mother who doesn't smoke and $\Pr(E_2) = p_2$, then

$\text{odds of } E_2 = \dfrac{\Pr(E_2)}{\Pr(\text{not } E_2)} = \dfrac{p_2}{1 - p_2}$

The odds ratio of two events is the ratio of the odds for the events and is often denoted $\theta$.

The odds ratio for events with probabilities $p_1$ and $p_2$ is

$\theta = \dfrac{p_1/(1 - p_1)}{p_2/(1 - p_2)} = \dfrac{p_1(1 - p_2)}{p_2(1 - p_1)}$

For our example $\hat{p}_1 = 21/105 = 0.2000$ and $\hat{p}_2 = 9/87 = 0.1034$, and the estimate of the odds ratio for low birth weight babies born to smokers vs. non-smokers is:

$\hat{\theta} = \dfrac{\hat{p}_1(1 - \hat{p}_2)}{\hat{p}_2(1 - \hat{p}_1)} = \dfrac{0.2000(1 - 0.1034)}{0.1034(1 - 0.2000)} = 2.17$
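As a quick check of the odds ratio for the birth weight data, the sketch below computes it two ways: from the estimated odds and from the cross-product of the counts. The helper name `odds` is an illustrative choice, not something defined in the lecture.

```python
def odds(p):
    """Odds of an event with probability p."""
    return p / (1 - p)

# Birth weight example: 21 of 105 (smokers) and 9 of 87 (non-smokers).
x1, n1 = 21, 105
x2, n2 = 9, 87

# Ratio of the two estimated odds.
or_from_odds = odds(x1 / n1) / odds(x2 / n2)

# Equivalent cross-product form x1(n2 - x2) / (x2(n1 - x1)).
or_from_counts = (x1 * (n2 - x2)) / (x2 * (n1 - x1))

print(f"OR from odds = {or_from_odds:.3f}, OR from counts = {or_from_counts:.3f}")
```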

Comparing Relative Risk and Odds Ratios

Relative risk and odds ratios are not identical, but are similar to one another. The exact relationship is:

$\text{odds ratio} = \dfrac{p_1}{p_2} \times \dfrac{1 - p_2}{1 - p_1} = \text{relative risk} \times \dfrac{1 - p_2}{1 - p_1}$

The odds ratio will be very close to the relative risk when $p_1$ and $p_2$ are both small.
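To see how close the two measures are when the probabilities are small, here is a small sketch. The pair (0.02, 0.01) is a made-up illustration, while (0.2000, 0.1034) are the estimates from the birth weight example.

```python
def relative_risk(p1, p2):
    return p1 / p2

def odds_ratio(p1, p2):
    # Relative risk times the correction factor (1 - p2) / (1 - p1).
    return relative_risk(p1, p2) * (1 - p2) / (1 - p1)

# Hypothetical rare event: OR and RR nearly coincide.
rr_small, or_small = relative_risk(0.02, 0.01), odds_ratio(0.02, 0.01)

# Birth weight example: the event is not rare, so OR is noticeably larger than RR.
rr_bw, or_bw = relative_risk(0.2000, 0.1034), odds_ratio(0.2000, 0.1034)

print(f"rare event:   RR = {rr_small:.3f}, OR = {or_small:.3f}")
print(f"birth weight: RR = {rr_bw:.3f}, OR = {or_bw:.3f}")
```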

The relative risk $p_1/p_2$ is easier to interpret than the odds ratio. So why do we use the odds ratio?

1. The range of the RR depends on $p_2$: $0 \le RR \le 1/p_2$. The range of the OR is always 0 to $\infty$: $0 \le OR < \infty$.
2. There are situations (case-control studies) where an odds ratio can be calculated from the data but the relative risk cannot.
3. Logistic regression allows us to model the probability of an event as a function of multiple variables. The results can be summarized as odds ratios.

OR for case-control studies

Turn the birth weight experiment around. For the original experiment we sampled smoking mothers and non-smoking mothers. Imagine a second experiment where we sample underweight babies and a matching control group of normal weight babies (a case-control study).

- $n_3 = 30$ low birth weight babies
- $n_4 = 162$ matched normal birth weight babies
- $X_3$ = number of mothers who smoke in the low birth weight baby group, $X_3 \sim \text{Binomial}(n_3, p_3)$
- $X_4$ = number of mothers who smoke in the normal birth weight baby group, $X_4 \sim \text{Binomial}(n_4, p_4)$

We observe $x_3 = 21$ and $x_4 = 84$.

                             Mother
Group                  smokes        doesn't smoke    total
low birth weight       x_3 = 21            9          n_3 = 30
normal birth weight    x_4 = 84           78          n_4 = 162

$\hat{p}_3 = 21/30 = 0.70$

$\hat{p}_4 = 84/162 = 0.518$

The data from a case-control study do not give us any information about the values of

- $p_1$ = the probability of a low birth weight baby for mothers who smoke, or
- $p_2$ = the probability of a low birth weight baby for mothers who don't smoke.

We also can't estimate $p_1/p_2$ (the relative risk).

However, we can estimate the odds ratio from a case-control study, since the odds ratio for the rows is the same as the odds ratio for the columns.

The odds ratio for the rows (odds ratio of smoking mothers for the two groups of babies) is

$\hat{\theta} = \dfrac{\hat{p}_3(1 - \hat{p}_4)}{(1 - \hat{p}_3)\hat{p}_4} = \dfrac{0.70(1 - 0.518)}{(1 - 0.70)0.518} = 2.17$

Our estimate for the column odds ratio (odds ratio of low birth weight babies for the two groups of mothers) is

$\hat{\theta} = \dfrac{\hat{p}_1(1 - \hat{p}_2)}{\hat{p}_2(1 - \hat{p}_1)} = \dfrac{0.2000(1 - 0.1034)}{0.1034(1 - 0.2000)} = 2.17$

We get the same thing even though we can't estimate $p_1$ or $p_2$.
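The equality of the row and column odds ratios can be verified directly from the four cell counts. The sketch below assumes the 2 by 2 layout of the birth weight example, with illustrative names a, b, c, d for the cells.

```python
# Cell counts: rows are baby groups, columns are mother groups.
a, b = 21, 9     # low birth weight:    mother smokes, mother doesn't smoke
c, d = 84, 78    # normal birth weight: mother smokes, mother doesn't smoke

def odds(p):
    return p / (1 - p)

# Column odds ratio: odds of low birth weight, smokers vs. non-smokers.
p1_hat = a / (a + c)
p2_hat = b / (b + d)
column_or = odds(p1_hat) / odds(p2_hat)

# Row odds ratio: odds of a smoking mother, low vs. normal birth weight babies.
p3_hat = a / (a + b)
p4_hat = c / (c + d)
row_or = odds(p3_hat) / odds(p4_hat)

# Both reduce to the cross-product ratio ad / (bc).
cross_product = (a * d) / (b * c)

print(f"column OR = {column_or:.4f}, row OR = {row_or:.4f}, ad/bc = {cross_product:.4f}")
```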

The distribution of the Relative Risk

An estimate for RR is $\widehat{RR} = \hat{p}_1/\hat{p}_2$ where $\hat{p}_i = x_i/n_i$.

The distribution of $\log(\widehat{RR})$ is approximately normal with

$\text{mean} = \log(RR)$

$\text{variance} = \dfrac{1 - p_1}{n_1 p_1} + \dfrac{1 - p_2}{n_2 p_2}$

$\text{standard deviation} = \sqrt{\dfrac{1 - p_1}{n_1 p_1} + \dfrac{1 - p_2}{n_2 p_2}}$

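A confidence interval for the relative risk

A 95% CI for $\log(RR)$:

$\log(\widehat{RR}) \pm 1.96\sqrt{\dfrac{1 - \hat{p}_1}{n_1 \hat{p}_1} + \dfrac{1 - \hat{p}_2}{n_2 \hat{p}_2}}$

Transform back (exponentiate the endpoints) to get the corresponding CI for RR.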
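A minimal sketch of this interval in Python, applied to the birth weight counts (21 of 105 vs. 9 of 87). The function name and signature are assumptions made for illustration; the normal quantile comes from the standard library's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

def rr_confidence_interval(x1, n1, x2, n2, conf_level=0.95):
    """Estimate RR and a CI built on the log scale, then transformed back."""
    p1, p2 = x1 / n1, x2 / n2
    rr = p1 / p2
    # Standard deviation of log(RR_hat) with p1, p2 replaced by their estimates.
    se_log_rr = math.sqrt((1 - p1) / (n1 * p1) + (1 - p2) / (n2 * p2))
    # Two-sided normal quantile, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

rr, rr_lower, rr_upper = rr_confidence_interval(21, 105, 9, 87)
print(f"RR = {rr:.2f}, 95% CI = ({rr_lower:.2f}, {rr_upper:.2f})")
```

Working on the log scale keeps the interval inside the valid range for a ratio (it can never go below 0) and makes it asymmetric around the point estimate.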

Odds ratio: Recall the odds ratio is the ratio of the odds in the two groups.

$OR = \dfrac{p_1/(1 - p_1)}{p_2/(1 - p_2)}$

Note that

$\dfrac{\hat{p}_1}{1 - \hat{p}_1} = \dfrac{x_1/n_1}{1 - x_1/n_1} = \dfrac{x_1}{n_1 - x_1}$

The estimate for the OR is

$\widehat{OR} = \dfrac{\hat{p}_1/(1 - \hat{p}_1)}{\hat{p}_2/(1 - \hat{p}_2)} = \dfrac{x_1/(n_1 - x_1)}{x_2/(n_2 - x_2)} = \dfrac{x_1(n_2 - x_2)}{x_2(n_1 - x_1)}$

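Distribution of the Odds Ratio

$\log(\widehat{OR})$ is approximately normally distributed.

The expected value of $\log(\widehat{OR})$ is

$\text{mean}(\log(\widehat{OR})) = \log(OR)$

The variance of $\log(\widehat{OR})$ is approximately

$\text{Var}(\log(\widehat{OR})) = \text{Var}\left(\log\dfrac{\hat{p}_1}{1 - \hat{p}_1}\right) + \text{Var}\left(\log\dfrac{\hat{p}_2}{1 - \hat{p}_2}\right)$

$= \dfrac{1}{n_1 p_1(1 - p_1)} + \dfrac{1}{n_2 p_2(1 - p_2)}$

$= \dfrac{1}{n_1 p_1} + \dfrac{1}{n_1(1 - p_1)} + \dfrac{1}{n_2 p_2} + \dfrac{1}{n_2(1 - p_2)}$

and the standard deviation is the square root of this variance.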

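If we estimate $p_1$ and $p_2$ using $\hat{p}_1 = x_1/n_1$ and $\hat{p}_2 = x_2/n_2$ we get:

$\text{Var}(\log(\widehat{OR})) = \dfrac{1}{x_1} + \dfrac{1}{n_1 - x_1} + \dfrac{1}{x_2} + \dfrac{1}{n_2 - x_2}$

A 95% CI for $\log(OR)$ is

$\log(\widehat{OR}) \pm 1.96\sqrt{\dfrac{1}{x_1} + \dfrac{1}{n_1 - x_1} + \dfrac{1}{x_2} + \dfrac{1}{n_2 - x_2}}$

Apply the inverse log (exponentiate) to the endpoints of the CI to obtain the CI for OR.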
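The interval for the odds ratio can be sketched the same way, here for the birth weight counts (21 of 105 vs. 9 of 87). The function name is an illustrative assumption, and the variance is the sum of reciprocal cell counts.

```python
import math
from statistics import NormalDist

def or_confidence_interval(x1, n1, x2, n2, conf_level=0.95):
    """Estimate OR and a CI built on the log scale, then transformed back."""
    odds_ratio = (x1 * (n2 - x2)) / (x2 * (n1 - x1))
    # Estimated variance of log(OR_hat): sum of reciprocals of the four cells.
    var_log_or = 1 / x1 + 1 / (n1 - x1) + 1 / x2 + 1 / (n2 - x2)
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    half_width = z * math.sqrt(var_log_or)
    lower = math.exp(math.log(odds_ratio) - half_width)
    upper = math.exp(math.log(odds_ratio) + half_width)
    return odds_ratio, lower, upper

or_hat, or_lower, or_upper = or_confidence_interval(21, 105, 9, 87)
print(f"OR = {or_hat:.2f}, 95% CI = ({or_lower:.2f}, {or_upper:.2f})")
```

Note that the formula breaks down if any cell count is 0, since a reciprocal would be undefined; this sketch does not handle that case.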

Example: An experiment is conducted to compare the effectiveness of two anti-hypertensive medications.

- $n_1 = 105$ subjects are given drug 1
- $n_2 = 87$ subjects are given drug 2
- $X_1$ = number who improve on drug 1, $X_1 \sim \text{Binomial}(n_1, p_1)$
- $X_2$ = number who improve on drug 2, $X_2 \sim \text{Binomial}(n_2, p_2)$

We observe $x_1 = 71$ and $x_2 = 45$.

                      treatment group
outcome             drug 1     drug 2
improvement           71         45
no improvement        34         42
Total                105         87

$\hat{p}_1 = 71/(71 + 34) = 0.676$

$\hat{p}_2 = 45/(45 + 42) = 0.517$

                      treatment group
outcome             drug 1     drug 2
improvement         a = 71     c = 45
no improvement      b = 34     d = 42

$\widehat{OR} = \dfrac{71(42)}{45(34)} = 1.95$

The 99% CI for $\log(OR)$ is

$\log(1.95) \pm 2.576\sqrt{\dfrac{1}{71} + \dfrac{1}{34} + \dfrac{1}{45} + \dfrac{1}{42}} = (-0.103, 1.438)$

And the 99% CI for the odds ratio OR is

$(e^{-0.103}, e^{1.438}) = (0.90, 4.21)$

For the example the estimate for relative risk is

$\widehat{RR} = \hat{p}_1/\hat{p}_2 = 0.676/0.517 = 1.31$

and a 99% CI for $\log(RR)$ is

$\log(1.31) \pm 2.576\sqrt{\dfrac{0.324}{105(0.676)} + \dfrac{0.483}{87(0.517)}} = (-0.051, 0.587)$

99% CI for RR: $(e^{-0.051}, e^{0.587}) = (0.95, 1.80)$
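Both 99% intervals for the anti-hypertensive example (71 of 105 improve on drug 1, 45 of 87 on drug 2) can be reproduced with the short sketch below. The helper `log_scale_ci` is a hypothetical name introduced here, and the exact normal quantile (about 2.576) is taken from the standard library.

```python
import math
from statistics import NormalDist

def log_scale_ci(estimate, se_log, conf_level):
    """CI for a ratio: estimate on the log scale +/- z * se, then exponentiate."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    center = math.log(estimate)
    return math.exp(center - z * se_log), math.exp(center + z * se_log)

x1, n1 = 71, 105   # improved / treated, drug 1
x2, n2 = 45, 87    # improved / treated, drug 2
p1, p2 = x1 / n1, x2 / n2

# Odds ratio and the standard deviation of its log.
or_hat = (x1 * (n2 - x2)) / (x2 * (n1 - x1))
se_log_or = math.sqrt(1 / x1 + 1 / (n1 - x1) + 1 / x2 + 1 / (n2 - x2))
or_ci = log_scale_ci(or_hat, se_log_or, 0.99)

# Relative risk and the standard deviation of its log.
rr_hat = p1 / p2
se_log_rr = math.sqrt((1 - p1) / (n1 * p1) + (1 - p2) / (n2 * p2))
rr_ci = log_scale_ci(rr_hat, se_log_rr, 0.99)

print(f"OR = {or_hat:.2f}, 99% CI = ({or_ci[0]:.2f}, {or_ci[1]:.2f})")
print(f"RR = {rr_hat:.2f}, 99% CI = ({rr_ci[0]:.2f}, {rr_ci[1]:.2f})")
```

Both intervals contain 1, so at the 99% level neither measure shows a clear difference between the two drugs.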

Summary of relative risk & odds ratio

$\text{relative risk} = p_1/p_2$

$\text{odds ratio} = \dfrac{p_1/(1 - p_1)}{p_2/(1 - p_2)}$

RR is easier to interpret but OR is used more often because:

1. The range of the RR depends on $p_2$. The range of the OR is always 0 to $\infty$.
2. There are situations (case-control studies) where the odds ratio can be calculated from the data but the relative risk cannot.
3. Logistic regression produces odds ratios.

Our estimate for the odds ratio is

$\widehat{OR} = \dfrac{\hat{p}_1(1 - \hat{p}_2)}{(1 - \hat{p}_1)\hat{p}_2}$ where $\hat{p}_1 = x_1/n_1$ and $\hat{p}_2 = x_2/n_2$.

The 95% confidence interval for the log odds ratio is

$\log(\widehat{OR}) \pm 1.96\sqrt{\dfrac{1}{x_1} + \dfrac{1}{n_1 - x_1} + \dfrac{1}{x_2} + \dfrac{1}{n_2 - x_2}}$

transformed back to the OR scale using the inverse log.

Diagnostic Tests and 2 by 2 tables

Example: Angiograms (the gold standard test for diagnosing stroke) have a slight risk of mortality (< 1%). Some investigators have attempted to use a PET scanner to detect stroke non-invasively. A simple random sample of n = 64 was taken from the group of patients who were thought to have had a stroke.

Stroke (by angiogram)    PET         Frequency
negative                 negative       21
negative                 positive        8
positive                 negative        3
positive                 positive       32

The data in a 2 by 2 table:

                  Stroke (by angiogram)
PET test       positive    negative    Total
positive          32           8         40
negative           3          21         24
Total             35          29         64

There are a number of measures of the quality of the new diagnostic test.

The sensitivity, or true positive rate, of a diagnostic test is the probability of a positive test result when the patient has the disease. That is, how sensitive is the test in detecting the disease: how likely is the test to be positive if you have the disease?

$\text{Sensitivity} = \Pr(\text{positive diagnostic test} \mid \text{disease})$

Recall: For any events E and F, $\Pr(E \mid F) = \Pr(E \,\&\, F)/\Pr(F)$. So:

$\text{Sensitivity} = \Pr(\text{positive diagnostic test} \mid \text{disease}) = \dfrac{\Pr(\text{positive PET} \,\&\, \text{had stroke})}{\Pr(\text{had stroke})}$

Data again:

                  Stroke (by angiogram)
PET test       positive    negative    Total
positive          32           8         40
negative           3          21         24
Total             35          29         64

Our best estimate for Pr(positive PET & had stroke) is 32/64.
Our best estimate for Pr(had stroke) is 35/64.

$\text{Sensitivity} = \dfrac{32/64}{35/64} = 32/35 = 91\%$

Another measure of the quality of a diagnostic test is the specificity, or true negative rate.

Specificity is the probability of a negative test result when the patient doesn't have the disease. That is, how specific is the test in detecting the disease: how likely is the test to be negative if you don't have the disease?

$\text{Specificity} = \Pr(\text{negative diagnostic test} \mid \text{no disease})$

$\text{Specificity} = \Pr(\text{negative PET} \mid \text{didn't have stroke}) = \dfrac{\Pr(\text{negative PET} \,\&\, \text{no stroke})}{\Pr(\text{no stroke})}$

Our best estimate for Pr(negative PET & no stroke) is 21/64.
Our best estimate for Pr(no stroke) is 29/64.

So $\text{Specificity} = \dfrac{21/64}{29/64} = 21/29 = 72\%$

If we name the cells in the table:

                    have disease            don't have disease         total
test is positive    TP = true positives     FP = false positives       total positives
test is negative    FN = false negatives    TN = true negatives        total negatives
total               total with disease      total without disease

Our estimates are:

- Sensitivity = true positive rate = number of true positives / total with disease = TP/(TP+FN)
- Specificity = true negative rate = number of true negatives / total without disease = TN/(TN+FP)
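Using the cell names above, sensitivity and specificity for the PET stroke data (32 true positives, 8 false positives, 3 false negatives, 21 true negatives) can be computed as in this short sketch. The function names are illustrative only.

```python
def sensitivity(tp, fn):
    """True positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """True negative rate: TN / (TN + FP)."""
    return tn / (tn + fp)

# PET stroke example, with the angiogram as the gold standard.
TP, FP = 32, 8    # PET positive: stroke, no stroke
FN, TN = 3, 21    # PET negative: stroke, no stroke

sens = sensitivity(TP, FN)
spec = specificity(TN, FP)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```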

HIV test example:

                        HIV by blood test
                      positive    negative
Home test positive       72          50
Home test negative        2         157

Sensitivity = 72/(72+2) = 97%; Specificity = 157/(157+50) = 76%

This could be a good screening test. There is only a 3% chance of a false negative when HIV is present (1 - sensitivity).

Sensitivity/Specificity trade off

For many diagnostic tests there is a cutoff value. The test is

- positive if the measurement is above the cutoff
- negative if the measurement is below the cutoff

If the cutoff level is set too high there will be few false positives but many false negatives. Such a test would miss many individuals with the disease (low sensitivity).

If the cutoff level is set too low there will be few false negatives but many false positives. Such a test would indicate that many healthy individuals have the disease (low specificity).

A good diagnostic test will discriminate well between those with and without the disease, and a cutoff can be found that provides good sensitivity and good specificity.
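The trade off can be illustrated with a toy example. The measurements below are entirely made up for illustration (they do not come from the lecture), and the rule "positive if the measurement is above the cutoff" follows the description above.

```python
# Hypothetical measurements of some continuous marker.
diseased = [5.1, 6.3, 7.2, 8.0, 9.4]   # individuals with the disease
healthy = [2.0, 3.1, 4.4, 5.6, 6.8]    # individuals without the disease

def sens_spec(cutoff):
    """Sensitivity and specificity when 'positive' means measurement > cutoff."""
    true_pos = sum(value > cutoff for value in diseased)
    true_neg = sum(value <= cutoff for value in healthy)
    return true_pos / len(diseased), true_neg / len(healthy)

for cutoff in (3.0, 5.0, 7.0):
    sens, spec = sens_spec(cutoff)
    print(f"cutoff = {cutoff}: sensitivity = {sens:.1f}, specificity = {spec:.1f}")
```

Raising the cutoff from 3.0 to 7.0 lowers sensitivity and raises specificity in this toy data, which is the pattern described above.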

Use in the clinical setting

When putting a diagnostic test to use in a clinical setting, the predictive value measures can be more useful:

The positive predictive value (PPV) of a diagnostic test is the probability of the patient having the disease given that the diagnostic test is positive.

$PPV = \Pr(\text{disease} \mid \text{positive diagnostic test})$

The negative predictive value (NPV) of a diagnostic test is the probability of the patient not having the disease given that the diagnostic test is negative.

$NPV = \Pr(\text{no disease} \mid \text{negative diagnostic test})$

Example: PET stroke test

We can calculate PPV and NPV from the same 2 by 2 table:

                  Stroke (by angiogram)
PET test       positive    negative    Total
positive          32           8         40
negative           3          21         24
Total             35          29         64

Positive predictive value = Pr(had stroke | positive PET) = Pr(had stroke & positive PET)/Pr(positive PET)

$PPV = \dfrac{32/64}{40/64} = 32/40 = 80\%$

Negative predictive value = Pr(didn't have stroke | negative PET) = Pr(no stroke & negative PET)/Pr(negative PET)

$NPV = \dfrac{21/64}{24/64} = 21/24 = 87.5\%$

Summary of calculations

                    have disease            don't have disease         total
test is positive    TP = true positives     FP = false positives       total positives
test is negative    FN = false negatives    TN = true negatives        total negatives
total               total with disease      total without disease

Our estimates are:

- Sensitivity = true positive rate = number of true positives / total with disease = TP/(TP+FN)
- Specificity = true negative rate = number of true negatives / total without disease = TN/(TN+FP)
- Positive Predictive Value (PPV) = number of true positives / total positives = TP/(TP+FP)
- Negative Predictive Value (NPV) = number of true negatives / total negatives = TN/(TN+FN)
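The two predictive values can be sketched the same way as the rates conditioned on disease status. The example uses the PET stroke counts (TP = 32, FP = 8, FN = 3, TN = 21); the function names are illustrative assumptions.

```python
def positive_predictive_value(tp, fp):
    """PPV: TP / (TP + FP), the share of positive tests that are correct."""
    return tp / (tp + fp)

def negative_predictive_value(tn, fn):
    """NPV: TN / (TN + FN), the share of negative tests that are correct."""
    return tn / (tn + fn)

# PET stroke example.
TP, FP, FN, TN = 32, 8, 3, 21

ppv = positive_predictive_value(TP, FP)
npv = negative_predictive_value(TN, FN)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

Note that PPV and NPV divide by row totals (test results), while sensitivity and specificity divide by column totals (disease status).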

One sample vs. two

PPV & NPV change when the probability of disease changes. Sensitivity & specificity don't.

If I increase the total number of negative angiograms without changing the proportion of positive and negative PET tests among them, PPV will go down and NPV will go up.

                  Stroke (by angiogram)
PET test       positive    negative    Total
positive          32          80        112
negative           3         210        213
Total             35         290        325

PPV is now 32/112 = 29% (was 80%)
NPV is now 210/213 = 99% (was 87.5%)

Sensitivity and specificity do not change because they condition on disease status.

                  Stroke (by angiogram)
PET test       positive    negative    Total
positive          32          80        112
negative           3         210        213
Total             35         290        325

Sensitivity = 32/35 = 91% (same as before)
Specificity = 210/290 = 72% (same as before)

Sensitivity and specificity can be estimated from two independent samples, one of disease positive individuals and one of disease negative individuals.

PPV and NPV can only be estimated when the table is from a simple random sample of the population of interest (patients who may have had a stroke).
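The effect of changing the share of disease-negative patients can be checked numerically. The sketch below compares the original PET counts (32, 8, 3, 21) with a table in which the negative-angiogram column is multiplied by 10 (80 and 210); the function `diagnostic_measures` is a hypothetical helper introduced for this illustration.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV for a 2 by 2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Original simple random sample of 64 patients.
original = diagnostic_measures(tp=32, fp=8, fn=3, tn=21)

# Same disease-positive column, ten times as many disease-negative patients.
more_negatives = diagnostic_measures(tp=32, fp=80, fn=3, tn=210)

for name in ("sensitivity", "specificity", "ppv", "npv"):
    print(f"{name:12s} original = {original[name]:.3f}, "
          f"more negatives = {more_negatives[name]:.3f}")
```

Sensitivity and specificity are identical in both tables because each is computed within a single column, while PPV and NPV mix the two columns and so depend on how common the disease is in the sample.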

Comments on Diagnostic Testing

- A diagnostic test with low PPV may still be useful as an initial screening tool to be followed up with a test with higher PPV.
- PPV and NPV can only be estimated when the table is from a simple random sample of the population of interest.
- Sensitivity and specificity can be estimated from two independent samples, one of individuals with the disease and one of individuals without the disease.

Sensitivity and Specificity trade off:

For a diagnostic test based on a continuous variable,

- Decreasing the cutoff point will increase sensitivity and decrease specificity.
- Increasing the cutoff point will decrease sensitivity and increase specificity.

A high false positive rate (low specificity) can be alarming and costly. Example: false positive mammography findings lead to unnecessary biopsies.

Some diagnostic testing strategies aim to maximize sensitivity (minimize false negatives) at the cost of more false positives in order to rule out disease with the highest possible probability. Example: screening of donated blood for hepatitis.

Sensitivity, Specificity, PPV & NPV in R Commander

You will need the Rcmdr plugin EZR. To load EZR:

Install the package RcmdrPlugin.EZR into R if you have not already done so. You only need to do this once. See the instructions on the home page.

Load EZR into R (you must do this each time you need EZR):

- Choose the menu Tools > Load Rcmdr plug-in(s)...
- Choose EZR and say yes when you are asked if you want to restart R Commander.

Example:

                  Stroke (by angiogram)
PET test       positive    negative    Total
positive          32           8         40
negative           3          21         24
Total             35          29         64

R Commander (EZR) menu: Statistical Analysis > Accuracy of diagnostic test > Accuracy of qualitative test.

Enter the table counts and click OK.

The output:

> epi.tests(.table, conf.level = 0.95)

The output shows the 2 by 2 table (rows: Test positive, Test negative, Total; columns: Disease positive, Disease negative, Total), followed by point estimates and 95% CIs (columns: Estimation, Lower CI, Upper CI) for:

- Apparent prevalence
- True prevalence
- Sensitivity
- Specificity
- Positive predictive value
- Negative predictive value
- Diagnostic accuracy
- Likelihood ratio of a positive test
- Likelihood ratio of a negative test
