2x2 and Stratum Specific Likelihood Ratio Approaches to Interpreting Diagnostic Tests. How Different Are They?

Henry Glick and Seema Sonnad
University of Pennsylvania
Society for Medical Decision Making Short Course #8
Friday, October 21,

QUESTIONS TO HELP US CALIBRATE YOUR EXPERIENCE

1. Are all the concepts in this figure familiar to everyone?
   Worster A, et al. Diagnostic testing: an emergency medicine perspective. Canadian Journal of Emergency Medicine. 2002;4.
2. Are the concepts related to definition and calculation of positive and negative predictive value by use of stratum-specific likelihood ratios (SSLR) familiar to you?
3. What do we mean by a positive test? A negative test? Why do we use the cut-off we do?

OVERVIEW OF CLASS

• In what situations does the 2x2 approach to interpreting diagnostic tests yield treatment decisions equivalent to those derived from the stratum-specific likelihood ratio approach? In what situations do they differ?
  - Introduce 2 modes of decision making that provide insight into the answers to these questions
• When do the differences between the 2 approaches to interpreting tests matter?
• How does one choose between 2 tests when using a 2x2 approach to interpret tests? When using a stratum-specific approach?
• Mainly going to ignore issues of correlation among tests
  - Can be addressed, but complicates the story

OUTLINE

Module 1: Using Test Characteristics to Calculate Posterior Probabilities of Disease
Module 2: Two Modes of Decision Making
Module 3: "One (Test) and Done" Decision Making
Module 4: Continuous Updating of Probabilities (House) Decision Making
Module 5: Summary, 2x2 vs Stratum-specific Approaches
Module 6: Choice Between Tests

Readers who have followed the discussion [about likelihood ratios] to this point will understand the essentials of interpretation of diagnostic tests and can stop here. They should consider the next section, which deals with sensitivity and specificity, optional. We include it largely because clinicians will encounter studies that present their results in terms of sensitivity and specificity and may wish to understand this alternative framework for summarizing properties of diagnostic tests.

Jaeschke, et al. JAMA. 1994;271:

MODULE 1: USING TEST CHARACTERISTICS TO CALCULATE POSTERIOR PROBABILITIES OF DISEASE

• The diagnosis of disease is made on the basis of patient characteristics and the results of diagnostic tests
  - The diagnostic process is often sequential
• The underlying rationale for ordering diagnostic tests is the anticipated gain in certainty from the additional testing
  - Increased certainty is important when false positive or false negative decisions have adverse consequences for the patients' health
• The decrease in costs of these adverse consequences should be weighed against the increase in costs of additional testing, particularly when diagnostic tests are expensive or when they put a physical or emotional burden on patients
• When several tests must be interpreted, Bayes theorem can be applied sequentially by using the posterior probability of one test as the prior probability for the next test

COMMON APPROACHES

• Many clinical evaluations and descriptions of diagnostic testing start with a single sensitivity and specificity (or likelihood ratio positive and likelihood ratio negative) and report the resulting test accuracy
• Some may recognize that one can trade off sensitivity against specificity and use the ROC curve to depict this
  - Selecting sensitivity and specificity so as to maximize test accuracy still only includes information about test characteristics and prior probability of disease, and isn't enough for the best decision making to minimize costs
• An alternative approach is to report stratum-specific likelihood ratios rather than sensitivities and specificities from multiple 2x2 tables

GOAL

• Our goal over the next three hours is to provide you with an appreciation for the alternative methods of incorporating diagnostic test results into clinical decision making and show you where using different methods may lead to conclusions that change patient care decisions.

THE DATA

• Demonstrate these ideas by use of data on WBC for the diagnosis of bacteremia among children aged 3 to 36 months presenting with:
  - Rectal temperature >39°C
  - No obvious focal infection
  - No toxic clinical appearance necessitating immediate hospitalization
  - No specific viral infection, immune-deficiency condition, or chronic illness that would alter the standard approach to febrile illness

SIMULATED RESULTS, WBC FOR BACTEREMIA

[Figure: histograms of WBC (x1000/mm^3), with strata bounded at 10, 15, 20, and 25, shown separately for Disease and No Disease. N = 1 observation; # = 20 observations.]

While precise values may vary, the numbers of the 26 patients with disease and the 859 patients without disease falling within the 5 ranges match those in Jaffe DM, et al. Temperature and total white blood cell count as indicators of bacteremia. Pediatrics. 1991;87:

TEST RESULTS, WBC FOR BACTEREMIA

Cut-off        Children w/ Bacteremia    Children w/o Bacteremia
> 25                     6                         26
> 20, < 25               4                         43
> 15, < 20               7                        129
> 10, < 15               7                        292
> 0, < 10                2                        369
Total                   26                        859

• How many 2x2 tables can be developed from the N=5 strata?

MULTIPLE 2x2 TABLES

• How many 2x2 tables?
  - N + 1 total tables
  - N - 1 informative tables
• Sensitivities and specificities
  [Table: Cut-off, w/ Bact, w/o Bact, Sens, 1-Spec, Spec for each cumulative cut-off; cell values not preserved.]
• Multiple 2x2 tables summarized by use of the receiver operating characteristic (ROC) curve
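The cumulative 2x2 tables described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. It assumes the stratum counts are the a_i and b_i values that appear in the handout's Approach 2 expressions (for example, (6 x 859) / (26 x 26)); the function and variable names are hypothetical.

```python
from itertools import accumulate

# Stratum counts, ordered from the most abnormal WBC stratum to the least.
# Assumption: these are the counts used in the handout's Approach 2 expressions.
BACTEREMIA = [6, 4, 7, 7, 2]             # strata >25, 20-25, 15-20, 10-15, <10
NO_BACTEREMIA = [26, 43, 129, 292, 369]
CUTOFFS = [">25", ">20", ">15", ">10", ">0"]


def cumulative_2x2(diseased, nondiseased):
    """Return (sensitivity, 1 - specificity) for each cumulative cut-off."""
    total_d = sum(diseased)
    total_n = sum(nondiseased)
    return [
        (true_pos / total_d, false_pos / total_n)
        for true_pos, false_pos in zip(accumulate(diseased), accumulate(nondiseased))
    ]


operating_points = cumulative_2x2(BACTEREMIA, NO_BACTEREMIA)

for cutoff, (sens, fpr) in zip(CUTOFFS, operating_points):
    print(f"{cutoff:>4}  sens={sens:.3f}  1-spec={fpr:.3f}  spec={1 - fpr:.3f}")
```

Each cut-off simply pools every stratum at or above it into the "positive" row, which is why the last cut-off (>0) labels everyone positive and carries no information.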

CONSTRUCTION OF A ROC CURVE

[Figure: empty ROC plot, sensitivity versus 1 - specificity.]

• Meaning of horizontal axis
• Meaning of the vertical axis
• Best operating point?
• The 45° line
• Plotting the operating points
• Points below the 45° line

ROC CURVE, WBC FOR BACTEREMIA (Jaffe)

[Figure: ROC curve for WBC, sensitivity versus 1 - specificity, with operating points labeled by cut-off (>10, >15, >20, ...) and an accompanying table of C.O.*, SENS, and 1-SPEC; table values not preserved.]
* x1000/mm^3; cells >0 and >5 combined

STRATUM-SPECIFIC LIKELIHOOD RATIOS AND CONTINUOUS TESTS

• In the 2x2 approach to continuous data, we break the continuous data up into a series of cumulative 2x2 tables
• Rather than cumulating the data, the SSLR approach calculates likelihood ratios for ranges of test results (i.e., strata)
• Stratum-specific likelihood ratio: Fraction of diseased individuals with a test result (or a test result in a particular range [stratum]) divided by fraction of nondiseased individuals with that test result (or a test result in the same range)

  $\mathrm{SSLR}_i = \dfrac{\text{Proportion w/ test result } i \mid D+}{\text{Proportion w/ test result } i \mid D-}$

LIKELIHOOD RATIOS. GRAPHICAL INTERPRETATION

[Figure: percentages of the Disease and No Disease groups whose results fall in the High, Medium, and Low strata; most percentage labels not preserved.]

• LR High =
• LR Mid =
• LR Low =

STEPS FOR CALCULATING STRATUM-SPECIFIC LIKELIHOOD RATIOS

• Step 1: Establish strata and tabulate stratum-specific test results
• Approach 1:
  - Step 2: Compute among diseased and among nondiseased the proportions with test results falling within the strata
  - Step 3: Compute stratum-specific likelihood ratios based on these proportions
• Approach 2:
  - Using letters similar to the a, b, c, and d used for odds ratios ([a x d] / [b x c]):

    $\mathrm{SSLR}_i = (a_i \times f) / (b_i \times e)$

    where a_i = the number with disease with test result i; b_i = the number without disease with test result i; f = total number of nondiseased; e = total number of diseased
• Approach 3 (introduced because we use this notation in later proofs):

    $\mathrm{SSLR}_i = (sens_i - sens_{i-1}) / ([1 - spec_i] - [1 - spec_{i-1}])$

STEP 1: ESTABLISH STRATA AND TABULATE STRATUM-SPECIFIC TEST RESULTS

Cut-off        Bacteremia    No Bacteremia
> 25                6              26
> 20, < 25          4              43
> 15, < 20          7             129
> 10, < 15          7             292
< 10                2             369
Total              26             859

APPROACH 2. (a_i x f) / (b_i x e)

Cut-off        (A x F) / (B x E)          SSLR
> 25           (6 x 859) / (26 x 26)      7.617
> 20, < 25     (4 x 859) / (43 x 26)      3.070
> 15, < 20     (7 x 859) / (129 x 26)     1.792
> 10, < 15     (7 x 859) / (292 x 26)     0.792
< 10           (2 x 859) / (369 x 26)     0.179

95% CI calculated by use of "ROC Analyzer"; the interval values are not preserved.

APPROACH 3. (sens_i - sens_{i-1}) / ([1 - spec_i] - [1 - spec_{i-1}])

[Table: Cut-off, Se_i - Se_{i-1}, (1-Sp_i) - (1-Sp_{i-1}), and SSLR for the same five strata; cell values not preserved.]

POSTERIOR PROBABILITY OF DISEASE

• Common method for calculating posterior probability of disease uses the odds transformation
• One can simplify the calculation by use of SSLR and probabilities without use of odds:

  $\mathrm{PPD}_i = \dfrac{\mathrm{SSLR}_i \times prior}{(\mathrm{SSLR}_i \times prior) + (1 - prior)}$
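As a rough check that Approach 2 and Approach 3 describe the same quantity, the sketch below computes the SSLR both ways from the stratum counts. It is an illustration under stated assumptions: the function names are hypothetical, and results computed directly from counts differ slightly from the handout's rounded values (for example, about 7.62 rather than 7.617).

```python
BACTEREMIA = [6, 4, 7, 7, 2]             # a_i, strata >25, 20-25, 15-20, 10-15, <10
NO_BACTEREMIA = [26, 43, 129, 292, 369]  # b_i, same strata


def sslr_from_counts(a, b):
    """Approach 2: SSLR_i = (a_i * f) / (b_i * e)."""
    e, f = sum(a), sum(b)  # e = total diseased, f = total nondiseased
    return [(a_i * f) / (b_i * e) for a_i, b_i in zip(a, b)]


def sslr_from_roc(a, b):
    """Approach 3: change in sensitivity over change in (1 - specificity)."""
    e, f = sum(a), sum(b)
    ratios = []
    sens_prev = fpr_prev = 0.0
    cum_a = cum_b = 0
    for a_i, b_i in zip(a, b):
        cum_a += a_i
        cum_b += b_i
        sens, fpr = cum_a / e, cum_b / f
        ratios.append((sens - sens_prev) / (fpr - fpr_prev))
        sens_prev, fpr_prev = sens, fpr
    return ratios


by_counts = sslr_from_counts(BACTEREMIA, NO_BACTEREMIA)
by_roc = sslr_from_roc(BACTEREMIA, NO_BACTEREMIA)

for x, y in zip(by_counts, by_roc):
    print(f"approach 2: {x:.3f}   approach 3: {y:.3f}")
```

The agreement is not a coincidence: the increment in sensitivity between adjacent cut-offs is a_i / e and the increment in 1 - specificity is b_i / f, so their ratio is the Approach 2 expression.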

SAMPLE CALCULATIONS

• Assuming a prior probability of 2%

Test Result < 10:        (0.02 x 0.179) / [0.98 + (0.02 x 0.179)] = 0.0036
Test Result >10, <15:    (0.02 x 0.792) / [0.98 + (0.02 x 0.792)] = 0.0159
Test Result >15, <20:    (0.02 x 1.792) / [0.98 + (0.02 x 1.792)] = 0.0353
Test Result >20, <25:    (0.02 x 3.070) / [0.98 + (0.02 x 3.070)] = 0.0590
Test Result >25:         (0.02 x 7.617) / [0.98 + (0.02 x 7.617)] = 0.1345

SUMMARY OF MODULE 1

• The goal of diagnostic testing is to improve one's certainty about decisions to treat and withhold treatment
• Continuously scaled diagnostic tests generally are interpreted by use of either optimal 2x2 tables or stratum-specific likelihood ratios
• In the 2x2 approach, a given test result yields a posterior probability that is higher or lower than one's prior probability, depending upon whether the optimal 2x2 table categorizes that test result as positive or negative
• In the SSLR approach, a given test result yields a posterior that is either always higher or always lower than one's prior
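The sample calculations can be reproduced with a short sketch. It uses the SSLR values quoted in the handout and a 2% prior; the function names are hypothetical, and the odds-based version is included only to show that the probability shortcut gives the same answer.

```python
SSLR = {"<10": 0.179, "10-15": 0.792, "15-20": 1.792, "20-25": 3.070, ">25": 7.617}


def posterior(prior, lr):
    """Posterior probability of disease without converting to odds."""
    return (lr * prior) / ((lr * prior) + (1 - prior))


def posterior_via_odds(prior, lr):
    """Same quantity by the odds transformation: posterior odds = prior odds x LR."""
    post_odds = (prior / (1 - prior)) * lr
    return post_odds / (1 + post_odds)


PRIOR = 0.02
posteriors = {stratum: posterior(PRIOR, lr) for stratum, lr in SSLR.items()}

for stratum, ppd in posteriors.items():
    print(f"WBC {stratum:>5}: posterior = {ppd:.4f}")
```

A likelihood ratio of exactly 1.0 leaves the prior unchanged, which is the property the handout relies on later when it discusses strata with SSLR = 1.0.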

MODULE 2. TWO MODES OF DECISION MAKING

• What is the "receiver" that is referred to in "receiver operating characteristic curve"?

ROC CURVES

• ROC curves were developed in signal detection theory for determining optimal settings for the radar "receiver"
• During the Battle of Britain, if radar were set with too little sensitivity, the Luftwaffe would be overhead before the RAF fighters were scrambled
• If it were set with too high a sensitivity, the RAF would be scrambled, but they'd only find flocks of seagulls flying over the English Channel
• The ROC curve plotted trade-offs in sensitivity and specificity, and helped determine the optimal trade-off between the two

DECISION MODE 1

• We tell the story not so much for the history of signal detection theory, but instead to identify the nature of the decision problem:
  - Those who monitored the early warning system could make one grab at information and then had to make a decision to launch or not launch the fighters
  - What was important?
    * Was the expected value of launching the fighters greater than the expected value of not launching?
    * Restated: Was the cost of mistakes from launching greater or less than the cost of mistakes from not launching?
• We refer to decision problems like this one as "One (test) and Done" decision making
  - Characterized by one grab at information before making a treatment decision
  - Primary concern: Is the posterior probability such that one should treat or withhold treatment?

IDENTIFYING THE PROBABILITY ABOVE WHICH ONE SHOULD TREAT AND BELOW WHICH ONE SHOULD WITHHOLD TREATMENT

• Can identify a probability (p*) where the expected costs are equal
  - If after testing one is above p*, one should treat; if after testing one is below p*, one should withhold treatment

Definitions:
  O_TP = Value of outcome given true positive
  O_FN = Value of outcome given false negative
  O_TN = Value of outcome given true negative
  O_FP = Value of outcome given false positive
  O_TP - O_FN = C_FN = Cost of false negative
  O_TN - O_FP = C_FP = Cost of false positive

Expected outcome of treatment: $EO_{Treat} = p\,O_{TP} + (1-p)\,O_{FP}$
Expected outcome of no treatment: $EO_{NoTreat} = p\,O_{FN} + (1-p)\,O_{TN}$
Treatment threshold: p* is the value of p at which $EO_{Treat} = EO_{NoTreat}$

DERIVING THE TREATMENT THRESHOLD

• The treatment threshold is defined as the probability of disease where the expected outcome of treatment equals the expected outcome of withholding treatment (see Appendix for more complete proof):

  [1] $p\,O_{TP} + (1-p)\,O_{FP} = p\,O_{FN} + (1-p)\,O_{TN}$
  [2] $p\,O_{TP} + O_{FP} - p\,O_{FP} = p\,O_{FN} + O_{TN} - p\,O_{TN}$
  [3] $(p\,O_{TP} - p\,O_{FN}) + (p\,O_{TN} - p\,O_{FP}) = O_{TN} - O_{FP}$
  [4] $p\,[(O_{TP} - O_{FN}) + (O_{TN} - O_{FP})] = O_{TN} - O_{FP}$
  [5] $p\,(C_{FN} + C_{FP}) = C_{FP}$
  [6] Threshold: $p^* = C_{FP} / (C_{FN} + C_{FP})$

• As used here, C_FN and C_FP are being considered in a cost-benefit framework (i.e., each includes costs and the monetized value of benefits)
• Under a cost-effectiveness framework (in which C_FN equals a combination of c_FN and e_FN and C_FP equals a combination of c_FP and e_FP):

  $p^* = \dfrac{R_c\,e_{FP} + c_{FP}}{R_c\,(e_{FP} + e_{FN}) + (c_{FP} + c_{FN})}$
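A small numerical check of the threshold formula may help. The outcome values below are hypothetical and chosen only so that C_FN = 5 and C_FP = 2.5, the costs used in the handout's later examples; the function names are also assumptions.

```python
# Hypothetical outcome values on an arbitrary scale (not from the handout).
O_TP, O_FN = 8.0, 3.0    # C_FN = O_TP - O_FN = 5.0
O_TN, O_FP = 10.0, 7.5   # C_FP = O_TN - O_FP = 2.5


def treatment_threshold(c_fn, c_fp):
    """p* = C_FP / (C_FN + C_FP)."""
    return c_fp / (c_fn + c_fp)


def expected_outcomes(p):
    """Return (EO_Treat, EO_NoTreat) at probability of disease p."""
    eo_treat = p * O_TP + (1 - p) * O_FP
    eo_no_treat = p * O_FN + (1 - p) * O_TN
    return eo_treat, eo_no_treat


p_star = treatment_threshold(O_TP - O_FN, O_TN - O_FP)
eo_treat, eo_no_treat = expected_outcomes(p_star)
print(f"p* = {p_star:.3f}; EO_Treat = {eo_treat:.3f}; EO_NoTreat = {eo_no_treat:.3f}")

# Above p*, treating has the higher expected outcome; below p*, withholding does.
print(expected_outcomes(0.5), expected_outcomes(0.1))
```

Only the two differences C_FN and C_FP enter the threshold, so shifting all four outcome values by a constant leaves p* unchanged.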

DECISION MODE 2

• "House" (Continuous Updating) decision making
  - Patient has a problem that eventually needs to be treated (potentially urgently)
  - Have time to throw every conceivable expensive test at the patient
  - Although House never uses the term "posterior probability," one continuously updates one's probabilities after each round of testing
• At the end of the "House" testing process, one still uses the same decision criterion as One and Done decision making: (p ≤/≥ C_FP / [C_FN + C_FP])
• However, in "House" decision making, one is not simply concerned about >p* / <p*
  - Rather, the magnitude of the difference between one's posterior and p* is important
• Because distance from p* matters, one doesn't want to combine all "positive" test results and all "negative" test results, because doing so dilutes the value of the test's information

WHICH DECISION MODE, WHEN?

• When are medical decisions more like One and Done and when are they more like Continuous Updating?
  - One and Done
    * Life and death
    * One chance only to affect outcome (homeless, AIDS patients, etc.)
    * Concern about patient lack of follow-up
    * [High degree of correlation among tests]
  - Continuous Updating
    * Not life and death
    * Good continuity of care
    * Patient likely to follow through with moderately complex diagnostic/treatment plan

SUMMARY OF MODULE 2

• One can think of two types of decision making, which are characterized by whether one must make a treatment decision immediately or whether one has time to pursue a testing sequence
• In One and Done decision making, the principal concern is whether the posterior probability is above or below p*, the treatment threshold
• In Continuous Updating decision making, the distance from p* matters (because one continues to update one's posterior by use of other tests)

MODULE 2 APPENDIX: MORE COMPLETE DERIVATION OF TREATMENT THRESHOLD

• Notation
  - Costs among D+,Rx+: $C_{D+,Rx+}$
  - Costs among D+,Rx-: $C_{D+,Rx+} + C_{FN}$
  - Costs among D-,Rx-: $C_{D-,Rx-}$
  - Costs among D-,Rx+: $C_{D-,Rx-} + C_{FP}$
  - Probability of disease: p
• Costs of empiric therapy

  $p\,C_{D+,Rx+} + (1-p)(C_{D-,Rx-} + C_{FP}) = p\,C_{D+,Rx+} - p\,C_{D-,Rx-} - p\,C_{FP} + C_{D-,Rx-} + C_{FP}$

• Costs of withholding therapy

  $p\,(C_{D+,Rx+} + C_{FN}) + (1-p)\,C_{D-,Rx-} = p\,C_{D+,Rx+} + p\,C_{FN} - p\,C_{D-,Rx-} + C_{D-,Rx-}$

• Set the two equal:

  $p\,C_{D+,Rx+} - p\,C_{D-,Rx-} - p\,C_{FP} + C_{D-,Rx-} + C_{FP} = p\,C_{D+,Rx+} + p\,C_{FN} - p\,C_{D-,Rx-} + C_{D-,Rx-}$

• Cancel the terms common to both sides:

  $(1-p)\,C_{FP} = p\,C_{FN}$

• Rearrange:

  $C_{FP} - p\,C_{FP} = p\,C_{FN}$
  $C_{FP} = p\,C_{FP} + p\,C_{FN}$
  $C_{FP} = p\,(C_{FP} + C_{FN})$
  $p^* = C_{FP} / (C_{FP} + C_{FN})$

MODULE 3. ONE (TEST) AND DONE DECISION MAKING

• Review: This mode of decision making is characterized by one grab at information before making a treatment decision
  - Primary concern: Is the posterior probability such that one should treat or withhold treatment?
  - Doesn't matter how much above or below the treatment threshold one is

2x2 APPROACH

• Combines data on prior probability of disease (p), costs of false positive and false negative results, and the operating characteristics of the test to identify an optimal 2x2 table for patients presenting with specific sets of signs and symptoms (i.e., specific prior probabilities of disease)
• Either:
  - Combines probability of disease with the operating characteristics of the optimal 2x2 table to calculate a post-test (posterior) probability of disease
    * Treatment decision based on whether the posterior probability is below or above the treatment threshold
  - OR recommends treatment if test is positive and withholding treatment if test is negative

OPTIMAL 2X2 TABLE MINIMIZES THE EXPECTED COST OF MISTAKES

• Expected costs of mistakes determined by:
  - Frequency with which mistakes occur: p (1-se) + (1-p)(1-sp), where p = the prior or pretest probability of disease
  - Cost of mistakes when they occur: C_FN; C_FP

DEFINITION OF THE COSTS OF MISTAKES

• The difference in the net value of treating someone correctly (i.e., treating a person with disease or withholding treatment from a person without disease) and the net value of treating them incorrectly (i.e., withholding treatment from a person with disease or treating a person without disease)
• Can be estimated by use of a cost-benefit framework (monetizing both costs and outcomes)
• Also can be estimated by use of a cost-effectiveness (NMB) framework:
  - $C_{FN} = (R_c\,e_{FN}) + c_{FN}$
  - $C_{FP} = (R_c\,e_{FP}) + c_{FP}$

RATIO OF COSTS SUFFICIENT

• Many people are uncomfortable with identifying the costs of false positive and false negative decisions, particularly their absolute magnitudes
• Good news: The ratio of these costs is more important than their absolute magnitudes
  - e.g., when defining the treatment threshold, if one is able to say that the cost of false positives is 1/3 the cost of false negatives, we know that:

    $C_{FP} = \tfrac{1}{3} C_{FN} \;\Rightarrow\; p^* = \tfrac{1}{3} C_{FN} / (\tfrac{1}{3} C_{FN} + C_{FN}) = 0.25$

TREATMENT THRESHOLD AND COSTS

• Given the relationship between the treatment threshold, C_FP, and C_FN, if one has an idea of one's treatment threshold, one can infer one's relative valuation of C_FP and C_FN
• If one's p* = 0.25, then:

    $0.25 = C_{FP} / (C_{FN} + C_{FP})$
    $0.25\,C_{FN} + 0.25\,C_{FP} = C_{FP}$
    $0.25\,C_{FN} = 0.75\,C_{FP}$
    $\tfrac{1}{3} C_{FN} = C_{FP}$

• Implication: One is implicitly making assumptions about C_FP and C_FN whenever one makes a treatment decision

EXPECTED COSTS OF MISTAKES ASSOCIATED WITH THE NTH 2x2 TABLE

• $CM_n = p\,C_{FN}\,(1 - se_n) + (1-p)\,C_{FP}\,(1 - sp_n)$
• Thus, one means of identifying the optimal 2x2 table is to calculate CM for each possible table
  - If p = .2, C_FN = 5 and C_FP = 2.5 (i.e., p C_FN = 1, and (1-p) C_FP = 2), the WBC 2x2 table that minimizes the costs of mistakes is:

    [Table: Cutoff, 1-Sens, 1-Spec, and Cost of Mistakes = (1-Sens) x 1 + (1-Spec) x 2 for each of the 6 cut-offs; cell values not preserved.]

    i.e., given p, C_FP, and C_FN, the 2x2 table that minimizes the expected costs of mistakes uses a cut-off of > 20 as a positive test
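The search over candidate 2x2 tables can be sketched as follows. The sketch assumes the stratum counts from the handout's Approach 2 expressions and the stated values p = .2, C_FN = 5, C_FP = 2.5; labeling the two uninformative tables "treat none" and "treat all (>0)" is this sketch's own convention, and the names are hypothetical.

```python
P, C_FN, C_FP = 0.2, 5.0, 2.5
BACTEREMIA = [6, 4, 7, 7, 2]             # strata >25, 20-25, 15-20, 10-15, <10
NO_BACTEREMIA = [26, 43, 129, 292, 369]
CUTOFFS = ["treat none", ">25", ">20", ">15", ">10", "treat all (>0)"]


def cost_of_mistakes(p, c_fn, c_fp, sens, spec):
    """CM = p * C_FN * (1 - se) + (1 - p) * C_FP * (1 - sp)."""
    return p * c_fn * (1 - sens) + (1 - p) * c_fp * (1 - spec)


def candidate_tables(diseased, nondiseased):
    """Return (sens, spec) for the N + 1 cumulative tables from N strata."""
    total_d, total_n = sum(diseased), sum(nondiseased)
    points = [(0.0, 1.0)]  # nobody called positive
    true_pos = false_pos = 0
    for d, n in zip(diseased, nondiseased):
        true_pos += d
        false_pos += n
        points.append((true_pos / total_d, 1 - false_pos / total_n))
    return points


costs = {
    cutoff: cost_of_mistakes(P, C_FN, C_FP, se, sp)
    for cutoff, (se, sp) in zip(CUTOFFS, candidate_tables(BACTEREMIA, NO_BACTEREMIA))
}
best = min(costs, key=costs.get)

for cutoff, cm in costs.items():
    print(f"{cutoff:>15}: CM = {cm:.3f}")
print("lowest expected cost of mistakes:", best)
```

Costs computed from raw counts agree with the handout's quoted values (.775 at >20, .829 at >25, .808 at >15) to within rounding.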

OPTIMAL TRADE-OFF BETWEEN SENSITIVITY AND SPECIFICITY

• Don't usually select an optimal 2x2 table by calculating the expected costs of mistakes for all of the candidate tables (although for a small number of tables this is relatively easy)
• More commonly recommended strategy:
  1. Define the trade-off between sensitivity and specificity that maintains a constant expected cost of mistakes (referred to as the "optimal" operating slope)
  2. Develop a family of lines with this slope, each of which has a different expected cost of mistakes
  3. Identify the tangency between this family of lines and the test's ROC curve
  → This tangency defines the optimal 2x2 table, and represents the cut-off that has the lowest expected costs of mistakes

1. DEFINE THE TRADE-OFF BETWEEN SENSITIVITY AND SPECIFICITY THAT MAINTAINS A CONSTANT EXPECTED COSTS OF MISTAKES

• Rearrange the expected cost of mistakes formula:

  $CM_j = p\,C_{FN}\,(1 - se_{ij}) + (1-p)\,C_{FP}\,(1 - sp_{ij})$

• To obtain the following formula:

  $se_{ij} = \dfrac{C_{FP}\,(1-p)}{C_{FN}\,p}\,(1 - sp_{ij}) + b_j$, where $b_j = 1 - \dfrac{CM_j}{p\,C_{FN}}$

  → One defines a line with a fixed cost of mistakes (CM_j) when one trades off sensitivity and 1-specificity by use of the following "optimal operating slope" (OOS):

  $OOS = (1-p)\,C_{FP} / (p\,C_{FN})$

  - i.e., the OOS trades off sensitivity and specificity in proportion to 1) the size of the population among whom false positive and false negative mistakes can be made ([1-p]/p) and 2) the costs of these mistakes when they occur (C_FP / C_FN)
• e.g., if p = .2, C_FN = 5 and C_FP = 2.5 (i.e., (1-p) C_FP = 2 and p C_FN = 1), then the OOS = 2.0

2. DEVELOP A FAMILY OF LINES WITH OPTIMAL TRADE-OFF

[Figure: family of parallel lines with the optimal operating slope, plotted as sensitivity versus 1 - specificity; costs of mistakes decrease as the lines move toward higher sensitivity.]

• One generates a family of lines all with the optimal operating slope (2.0), each of which has a constant cost of mistakes
• The cost of mistakes at the intercept of each line equals the cost of mistakes at any point on the line: p C_FN (1 - Intercept)
  e.g., if intercept = .225: CM_.225 = .2 x 5 x (1 - .225) = 0.775
• Lines with larger intercepts have lower costs; those with smaller intercepts have higher costs
  e.g., if intercept = 0.6: CM_.6 = .2 x 5 x (1 - .6) = 0.400

3. IDENTIFY THE TANGENCY BETWEEN THIS FAMILY OF LINES AND THE TEST'S ROC CURVE

[Figure: the WBC ROC curve overlaid on the family of lines, sensitivity versus 1 - specificity.]

• Optimal 2x2 table defined by the tangency at the >20 operating point (where the costs of mistakes are .775)
• Other feasible operating points have higher costs
  - e.g., the line that intersects the ROC curve at the >25 operating point has an intercept of .171 (a cost of .829); the line that intersects the ROC curve at the >15 operating point has an intercept of .192 (a cost of .808)
• Would prefer to operate at lines 3 or 4, which have lower costs of mistakes, but these lines are beyond the test's capabilities
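The link between intercepts and costs can be checked numerically. The sketch below passes a line with the optimal operating slope through each WBC operating point and converts its intercept to a cost of mistakes. Operating points are computed from the stratum counts in the Approach 2 expressions, so results match the handout's rounded figures only approximately; all names are hypothetical.

```python
P, C_FN, C_FP = 0.2, 5.0, 2.5
OOS = ((1 - P) * C_FP) / (P * C_FN)  # optimal operating slope, 2.0 here

# (sensitivity, 1 - specificity) at each cut-off, from cumulative stratum counts.
OPERATING_POINTS = {
    ">25": (6 / 26, 26 / 859),
    ">20": (10 / 26, 69 / 859),
    ">15": (17 / 26, 198 / 859),
    ">10": (24 / 26, 490 / 859),
}


def intercept(sens, fpr, slope):
    """Intercept of the line with the given slope through (fpr, sens)."""
    return sens - slope * fpr


def cost_from_intercept(b, p, c_fn):
    """CM = p * C_FN * (1 - intercept)."""
    return p * c_fn * (1 - b)


intercepts = {c: intercept(se, fpr, OOS) for c, (se, fpr) in OPERATING_POINTS.items()}
costs = {c: cost_from_intercept(b, P, C_FN) for c, b in intercepts.items()}
tangency = max(intercepts, key=intercepts.get)

for cutoff in OPERATING_POINTS:
    print(f"{cutoff}: intercept = {intercepts[cutoff]:.3f}, CM = {costs[cutoff]:.3f}")
print("highest attainable intercept at", tangency)
```

Picking the highest attainable intercept is the same decision as picking the lowest cost of mistakes, because p and C_FN are fixed for the patient.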

INTUITION BEHIND WHY ONE WANTS A TANGENCY BETWEEN OOS AND ROC CURVE

• The goal is to identify the cut-off that minimizes the cost of mistakes
• Among the family of lines defined by the OOS, the one with the highest obtainable intercept has the lowest cost of mistakes (because p and C_FN are characteristics of the patient and do not change, and 1 - intercept (1-SE) is minimized)
• This highest obtainable intercept is defined by the tangency and determined by characteristics of the test

BONEHEAD METHOD FOR DETERMINING TANGENCY

1. Compute the optimal operating slope (e.g., 2)
2. Compute the slopes of the lines connecting each of the contiguous operating points on the ROC curve (i.e., the slopes of the lines connecting the dots representing the operating points)

   [Table: Cut-off, Se_i - Se_{i-1}, (1-Sp_i) - (1-Sp_{i-1}), and ROC Slope for the strata > 25; > 20, < 25; > 15, < 20; > 10, < 15; < 10; cell values not preserved.]

3. Identify a tangency
   - If the OOS is equal to the slope of a line between two contiguous operating points, use either of the operating points (e.g., if the slope equals 3.080, use either >20 or >25)
   - If the OOS is greater than one slope, but less than the contiguous slope, use the operating point that divides the 2 slopes to define a positive test
     * e.g., an OOS of 2 is greater than 1.792 but less than 3.080, so use the common cut-off that defines these two slopes: +, >20; -, <20
     * At this operating point, the optimal sensitivity and specificity are 0.385 and 0.920
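The three steps above can be sketched as a short routine. This is an illustration, not the authors' procedure verbatim: it assumes the ROC segment slopes decrease from the most abnormal stratum downward (a concave ROC curve), it treats a stratum whose slope exactly equals the OOS as negative (the handout notes either choice is acceptable in a tie), and the names are hypothetical.

```python
STRATA = [">25", "20-25", "15-20", "10-15", "<10"]
BACTEREMIA = [6, 4, 7, 7, 2]
NO_BACTEREMIA = [26, 43, 129, 292, 369]


def optimal_operating_slope(p, c_fn, c_fp):
    """OOS = (1 - p) * C_FP / (p * C_FN)."""
    return ((1 - p) * c_fp) / (p * c_fn)


def roc_segment_slopes(diseased, nondiseased):
    """Slope of each ROC segment: (a_i / e) / (b_i / f)."""
    e, f = sum(diseased), sum(nondiseased)
    return [(a / e) / (b / f) for a, b in zip(diseased, nondiseased)]


def positive_strata(strata, slopes, oos):
    """Walk down from the most abnormal stratum while the slope exceeds the OOS."""
    positive = []
    for label, slope in zip(strata, slopes):
        if slope <= oos:
            break
        positive.append(label)
    return positive


oos = optimal_operating_slope(0.2, 5.0, 2.5)
slopes = roc_segment_slopes(BACTEREMIA, NO_BACTEREMIA)
positives = positive_strata(STRATA, slopes, oos)
print(f"OOS = {oos:.2f}; positive strata: {positives}")
```

With an OOS of 2 the routine stops at the 15-20 stratum, so a positive test is a WBC above 20, matching the tangency identified from the ROC curve.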

INTERPRETING THE TEST RESULTS

• One can use the sensitivity (or LR+) and specificity (or LR-) of the optimal 2x2 table to calculate posterior probabilities and compare them with the treatment threshold (p* = .333 = 2.5 / [2.5 + 5])
  - If p = 0.2 & result >20:
    Se: (.2 x .385) / {(.2 x .385) + (.8 x .08)} = .546
    LR+: (.2 x LR+) / {(.2 x LR+) + .8} = .546, where LR+ = .385 / .08
  - If p = 0.2 & result <20:
    (1-Se): (.2 x .615) / {(.2 x .615) + (.8 x .92)} = .1432
    LR-: (.2 x LR-) / {(.2 x LR-) + .8} = .1432, where LR- = .615 / .92
• Alternatively:
  - Treat if test result is positive, because the posterior derived from a positive test from the optimal 2x2 table is always above the treatment threshold (demonstration later in notes)
  - Withhold treatment if the test result is negative, because the posterior from a negative test is always below the treatment threshold

SSLR APPROACH

• For "One and Done" decision making either:
  - Calculate posterior probability of disease based on test result, and make treatment decision by comparing posterior with treatment threshold (0.333)

    $\mathrm{PPD}_i = \dfrac{\mathrm{SSLR}_i \times prior}{(\mathrm{SSLR}_i \times prior) + (1 - prior)}$

    Test Result < 10:        (0.2 x 0.179) / [0.8 + (0.2 x 0.179)] = 0.043
    Test Result >10, <15:    (0.2 x 0.792) / [0.8 + (0.2 x 0.792)] = 0.165
    Test Result >15, <20:    (0.2 x 1.792) / [0.8 + (0.2 x 1.792)] = 0.309
    Test Result >20, <25:    (0.2 x 3.070) / [0.8 + (0.2 x 3.070)] = 0.434
    Test Result >25:         (0.2 x 7.617) / [0.8 + (0.2 x 7.617)] = 0.656

  - OR, as with the 2x2 approach, simply determine if test result is >20 / <20 (demonstration later in notes)

REVISITING SELECTION OF THE OPTIMAL 2X2

• You may have noticed that if one compares method 3 for defining SSLR and the method one uses to define the slopes of the ROC curve, the two methods are identical
  - i.e., the slopes of the ROC curve = the test's SSLR

[Figure: WBC ROC curve, sensitivity versus 1 - specificity, with the slope of each segment between contiguous operating points labeled.]

Cut-off        SSLR / Slope
> 25              7.617
> 20, < 25        3.070
> 15, < 20        1.792
> 10, < 15        0.792
< 10              0.179

IMPLICATION

• IF the SSLR represent the slopes of the lines between contiguous operating points on the curve, AND
• IF one identifies the cut-off for a positive test by comparing the optimal operating slope to these slopes, THEN:
  One need not construct an ROC curve to identify the cut-off for a positive test, but instead can calculate SSLR and compare the OOS to them
• If the optimal operating slope is 2, what cut-off should one use for a positive test?
  → Can identify "positive" SSLR; thus, when using SSLR for "One and Done" decision making, don't need to calculate posterior probability

RESULTS AN ACCIDENT?

• Was it an accident that:
  - Stratum-specific results that yielded posterior probabilities above the treatment threshold were classified as positive tests?
  - Stratum-specific results that yielded posterior probabilities below the treatment threshold were classified as negative tests?
• NO!
  - Can show that for the optimal 2x2 table that's what we mean by positive and negative tests

RESULTS AN ACCIDENT? (II)

• Step 1. If a stratum-specific result yields a posterior probability above the treatment threshold, we know:

  $\dfrac{sslr_i\,p}{(sslr_i\,p) + (1-p)} > \dfrac{C_{FP}}{C_{FP} + C_{FN}}$

  - Multiplying through by the denominators, canceling, and rearranging yields:

  $sslr_i > \dfrac{(1-p)\,C_{FP}}{p\,C_{FN}}$

• Step 2. If a stratum-specific result is classified as a negative test, we know:

  $\dfrac{se_i - se_{i-1}}{(1 - sp_i) - (1 - sp_{i-1})} = sslr_i < \dfrac{(1-p)\,C_{FP}}{p\,C_{FN}}$

• Q.E.D. Proof by contradiction; the SSLR_i cannot be both greater than and less than the optimal operating slope
• One can develop an analogous proof to show that if a stratum-specific result yields a posterior probability below the treatment threshold, it must be classified as a negative test

DEMONSTRATION OF RELATIONSHIPS

• Posterior probabilities given 3 priors and the 5 WBC SSLR

[Figure: probability of disease for each WBC stratum (<10 through >25) at 3 priors, one of which is 0.30; the region below the threshold is labeled Negative Test and the region above it Positive Test. ^ marks strata yielding posteriors above the threshold. Cfp = Cfn; (Cfp / (Cfp + Cfn)) = 0.5 = Threshold. Likelihood ratios defined for WBC for bacteremia.]

[Table: OOS and Tangency, with columns Prior, Strata, OOS, Tangency; cell values not preserved.]

SUMMARY, "ONE AND DONE" DECISION MAKING

• In the optimal 2x2 table, stratum-specific results whose posterior probabilities of disease are above the treatment threshold are classified as positive tests
• Stratum-specific results whose posterior probabilities of disease are below the treatment threshold are classified as negative tests
• Thus, in One and Done decision making, in which one's treatment decision is based on whether one's posterior is >p*/<p*, use of optimal 2x2 tables and SSLR yield identical treatment decisions
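The equivalence summarized above can be spot-checked numerically. The sketch below classifies each WBC stratum two ways, by comparing its posterior with p* and by comparing its SSLR with the OOS, and confirms that the two classifications agree. It assumes C_FP = C_FN as in the demonstration slide; the grid of priors is hypothetical (only 0.30 survives in the handout), as are the function names.

```python
STRATA = [">25", "20-25", "15-20", "10-15", "<10"]
BACTEREMIA = [6, 4, 7, 7, 2]
NO_BACTEREMIA = [26, 43, 129, 292, 369]
C_FN = C_FP = 1.0  # equal costs, so the treatment threshold is 0.5

E, F = sum(BACTEREMIA), sum(NO_BACTEREMIA)
SSLR = [(a * F) / (b * E) for a, b in zip(BACTEREMIA, NO_BACTEREMIA)]


def posterior(prior, lr):
    return (lr * prior) / ((lr * prior) + (1 - prior))


def positive_by_posterior(prior):
    """Strata whose posterior exceeds p* = C_FP / (C_FN + C_FP)."""
    p_star = C_FP / (C_FN + C_FP)
    return [s for s, lr in zip(STRATA, SSLR) if posterior(prior, lr) > p_star]


def positive_by_slope(prior):
    """Strata whose SSLR exceeds the optimal operating slope."""
    oos = ((1 - prior) * C_FP) / (prior * C_FN)
    return [s for s, lr in zip(STRATA, SSLR) if lr > oos]


PRIORS = (0.05, 0.10, 0.20, 0.30, 0.50)  # hypothetical grid of priors
for prior in PRIORS:
    print(prior, positive_by_posterior(prior), positive_by_slope(prior))
```

The two columns printed for each prior are the same list, which is the numerical counterpart of the proof by contradiction.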

MODULE 4. CONTINUOUS UPDATING OF PROBABILITIES (HOUSE) DECISION MAKING

• Review: This mode of decision making is characterized by multiple testing and continuous updating of one's probabilities after each round of testing
  - At the end of the testing process, one still uses the same decision criterion as One and Done (p ≤/≥ C_FP / [C_FN + C_FP])
  - However, one isn't simply concerned about >p* / <p* after the first round of testing
    * Instead concerned about how much the posterior is above or below p*

IL-6: A SECOND TEST FOR BACTEREMIA

[Table: Sensitivity and specificity by IL-6 level, with columns IL-6 Level, w/ Bact, w/o Bact, Sens, Spec; cell values not preserved.]

[Table: SSLR by IL-6 stratum, with a Total row; cell values not preserved.]

Strait RJ, Kelly KJ, Kurup RP. Tumor necrosis factor-α, interleukin-1β, and interleukin-6 levels in febrile, young children with and without occult bacteremia. Pediatrics. 1999;104:

ROC CURVE, IL-6 FOR BACTEREMIA (Strait)

[Figure: ROC curve for IL-6, sensitivity versus 1 - specificity, with operating points labeled by cut-off and an accompanying table of C.O., SENS., and 1-SPEC.; table values not preserved.]

2x2 AND SSLR APPROACHES AND POSTERIOR PROBABILITIES, IL-6 EXAMPLE #1

• The 2x2 and SSLR approaches yield different posterior probabilities of disease, particularly because of the combining of strata in the 2x2 approach
• Suppose the optimal 2x2 table combines the 10^2-10^3 and >10^3 strata into a positive test (e.g., an OOS of 2.5)

  SSLR, >10^3:
  LR+, >10^2: (17 x 46) / (11 x 22) = 3.231
  SSLR, 10^2-10^3:

  - The posterior probability from the 2x2 approach shifts in the correct direction for both test results
  - However, for 10^2-10^3, the 2x2 approach leads to too great an increase in posterior probability, while for >10^3, it leads to too little an increase

2x2 AND SSLR APPROACHES AND POSTERIOR PROBABILITIES, IL-6 EXAMPLE #2

• Suppose the optimal 2x2 table combines the 10^2-10^3 and <10^2 strata into a negative test (e.g., an OOS of 3.5)

  SSLR, 10^2-10^3:
  LR-, <10^3: (18 x 46) / (44 x 22) = 0.855
  SSLR, <10^2:

  - For test results <10^2, the posterior probability from the 2x2 approach shifts in the correct direction, although not nearly enough
  - For test results of 10^2-10^3, however, the 2x2 approach shifts the posterior probability in the wrong direction

SUMMARY, "HOUSE" DECISION MAKING

• If one is ordering multiple tests and continuously updating one's probabilities, the optimal 2x2 and SSLR yield different probabilities, and potentially different treatment decisions
• In the SSLR approach, some test results may yield posterior probabilities equal to priors; in the optimal 2x2 approach, either all test results yield posteriors that differ from one's prior, or all posteriors equal one's priors

SSLR = 1.0

• Using SSLR, a test can be useful even if some test results yield posterior probabilities equal to one's priors (i.e., if one stratum has an SSLR of 1.0)
• In the 2x2 approach, if a positive or negative test result for a particular table yields posteriors equal to one's priors, 1) the other test result does so as well, 2) the ROC curve falls on the 45° line, and 3) the test provides no information
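The two IL-6 examples can be made concrete with a sketch. Important assumption: the stratum counts below are not given in the handout. They are inferred by differencing the counts inside the LR+ expression (17 x 46) / (11 x 22) and the LR- expression (18 x 46) / (44 x 22), on the further assumption that IL-6 is split into exactly three strata. The 0.2 prior and all names are hypothetical.

```python
# Inferred counts (assumption, see lead-in): strata >10^3, 10^2-10^3, <10^2.
STRATA = [">10^3", "10^2-10^3", "<10^2"]
BACTEREMIA = [4, 13, 5]      # 17 above 10^2, 18 below 10^3, 22 in total
NO_BACTEREMIA = [2, 9, 35]   # 11 above 10^2, 44 below 10^3, 46 in total
TOTAL_D, TOTAL_N = sum(BACTEREMIA), sum(NO_BACTEREMIA)


def likelihood_ratio(diseased, nondiseased):
    """LR for a (possibly pooled) group of strata."""
    return (diseased / TOTAL_D) / (nondiseased / TOTAL_N)


def posterior(prior, lr):
    return (lr * prior) / ((lr * prior) + (1 - prior))


sslr = {s: likelihood_ratio(d, n) for s, d, n in zip(STRATA, BACTEREMIA, NO_BACTEREMIA)}
lr_pos_above_100 = likelihood_ratio(4 + 13, 2 + 9)     # Example #1: positive = >10^2
lr_neg_below_1000 = likelihood_ratio(13 + 5, 9 + 35)   # Example #2: negative = <10^3

PRIOR = 0.2  # hypothetical prior, for illustration only
for stratum, lr in sslr.items():
    print(f"SSLR {stratum:>9}: {lr:.3f} -> posterior {posterior(PRIOR, lr):.3f}")
print(f"LR+ (>10^2): {lr_pos_above_100:.3f} -> posterior {posterior(PRIOR, lr_pos_above_100):.3f}")
print(f"LR- (<10^3): {lr_neg_below_1000:.3f} -> posterior {posterior(PRIOR, lr_neg_below_1000):.3f}")
```

Under these inferred counts the pooled LR+ sits between the two SSLR it combines, and the pooled LR- falls below 1 even though the 10^2-10^3 stratum on its own has an SSLR above 1, which is the "wrong direction" shift described in Example #2.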

MODULE 5. SUMMARY. 2x2 VS STRATUM-SPECIFIC APPROACHES (I)

2x2 Approach
• Combines proportions of the populations having test results in different strata to develop likelihood ratios for positive and negative tests
  → For some patients, a test result will yield posterior probabilities that are higher than one's prior, while for other patients the same test result will yield posterior probabilities that are lower than one's prior

Stratum-Specific Approach
• Does not average among strata (but does average within a stratum)
  → A given test result yields a posterior that is either always higher or always lower than one's prior

MODULE 5. SUMMARY. 2x2 VS STRATUM-SPECIFIC APPROACHES (II)

2x2 Approach
• Cost of mistakes and thresholds built into the definition of a positive test
• Can use Bayes theorem or likelihood ratio approach to adjust prior probabilities
• Withhold testing for therapeutic decisions if no stratum-specific result can shift the posterior and prior to opposite sides of the underlying treatment threshold
• All strata whose results leave one above the treatment threshold will be classified as positive and all strata that leave one below the threshold will be classified as negative
  → If the only decision remaining is to treat or withhold treatment, the two approaches yield the same result; if other choices are available, the results can differ
• So long as there is a mapping function between the optimal operating point on the maximum likelihood estimate of the ROC curve and the criteria for a positive test, the 2x2 approach does not have to prespecify particular cut-points for the diagnostic test results
• Retains concept of positive test

Stratum-Specific Approach
• Cost of mistakes and thresholds built into the definition of a positive stratum
• Can use Bayes theorem or likelihood ratio approach to adjust prior probabilities
• Withhold testing for therapeutic decisions if no stratum-specific result can shift the posterior and prior to opposite sides of the underlying treatment threshold
• Strata always have the same likelihood ratio (can determine which strata are positive and which strata are negative by comparison to OOS)
  → If the only decision remaining is to treat or withhold treatment, the two approaches yield the same result; if other choices are available, the results can differ
• Ideally, the likelihood ratios are based on the proportions of tests among diseased and nondiseased among infinitesimally small strata (i.e., on the relationships between the two density functions)
• Retains concept of positive test result if one compares the SSLR to the OOS

DO THE DIFFERENCES MAKE A DIFFERENCE?

! For continuous tests, the likelihood ratio approach provides a more accurate estimate of the posterior probability of disease than does the 2x2 approach
! However, if no other test is available or if every SSLR moves you outside the threshold for additional testing
  - The two methods yield the same treatment decision, because they both leave you on the same side of the underlying treatment threshold
  - This conclusion is true even if the likelihood ratios from the 2x2 approach and the SSLR are on opposite sides of 1.0

DO NOTHING/TEST & TEST/TREAT THRESHOLDS

! If some stratum-specific results leave you within the testing range and others move you outside it, the 2x2 approach will yield more mistakes
  - SSLR moves you outside the testing range, but dilution from the 2x2 combination of test results leaves you within the testing range
    * More likely when there are extreme results yielding high SSLR and these SSLR are averaged with other smaller SSLR in the 2x2 approach

[Figure: probability scale divided into "Watchfully wait", "Test", and "Treat empirically" ranges, marking the prior and the posteriors obtained from the SSLR for IL-6 < 2 and from the LR- for IL-6 < 3]

! Alternatively, SSLR may leave you within the testing range (e.g., for IL-6 between 2 and 3), but overstatement of the effect of the test result from the LR+ (3.231 for IL-6 > 2) can move you outside this range
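The two failure modes described above can be illustrated with a small Python sketch. Only the LR+ of 3.231 comes from the slides; the prior, the two thresholds, and the other likelihood ratios are hypothetical values chosen to reproduce the pattern, not the course's data.

```python
def posterior(prior, lr):
    """Posterior probability of disease after a result with likelihood ratio lr."""
    return prior * lr / (prior * lr + (1 - prior))


def action(prob, test_threshold, treat_threshold):
    """Map a probability of disease onto the three decision ranges."""
    if prob < test_threshold:
        return "watchfully wait"
    if prob > treat_threshold:
        return "treat empirically"
    return "test"


PRIOR = 0.2              # hypothetical prior
TEST_THRESHOLD = 0.05    # hypothetical do nothing/test threshold
TREAT_THRESHOLD = 0.40   # hypothetical test/treat threshold

cases = {
    "SSLR, extreme low stratum (hypothetical 0.1)": 0.1,
    "2x2 LR-, diluted (hypothetical 0.4)": 0.4,
    "SSLR, middle stratum (hypothetical 1.5)": 1.5,
    "2x2 LR+ (3.231, from the slides)": 3.231,
}

for name, lr in cases.items():
    prob = posterior(PRIOR, lr)
    print(f"{name}: posterior = {prob:.3f} -> "
          f"{action(prob, TEST_THRESHOLD, TREAT_THRESHOLD)}")
```

With these assumed numbers, the extreme SSLR moves the patient below the testing range while the diluted LR- does not, and the LR+ pushes a middle-stratum patient above the testing range although the SSLR for that stratum would not.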

MODULE 6: CHOICE BETWEEN TESTS

! If one wants to perform one and only one of two tests, how should one choose between the two?
! Methods reasonably well worked out for 2x2 approach; less developed for SSLR approach

SSLR AND CHOICE

! Suppose you are comparing 2 tests
  - Each has 3 strata
  - The 3 SSLR for test 1 are identical to the 3 SSLR for test 2
  - The cost of the tests themselves (i.e., technician, reagents, etc.), the cost of delay in treatment due to the tests, and the risk from tests themselves (e.g., adverse reactions) are identical
! Can one test be better than the other, and if so, do the SSLR alone provide the information one needs to choose between the tests?

SSLR ALONE NOT SUFFICIENT FOR CHOICE

[Figure: ROC curves (sensitivity vs. specificity) for an "inner" dotted test and an "outer" dashed test]

! The 3 SSLR for the "inner" dotted test are identical to the 3 SSLR for the "outer" dashed test
! Thus, YES, one test can be better than the other
  - The fact that two tests share SSLRs doesn't necessarily mean they have the same sensitivities and specificities
! NO, the SSLR alone provide insufficient information to choose between the tests
  - Because SSLR are identical, but the tests differ in the clinical information they provide
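A small numeric sketch shows how two three-stratum tests can share all three SSLR and still differ in sensitivity and specificity. The stratum probabilities below are invented for illustration (they are not the tests plotted in the slides), and the names `TEST_1`, `TEST_2`, `sslrs`, and `operating_points` are hypothetical.

```python
from fractions import Fraction as F

# Hypothetical stratum probabilities, strata ordered from lowest to highest result.
# Each list sums to 1 within the diseased and the nondiseased.
TEST_1 = {"diseased": [F(10, 100), F(50, 100), F(40, 100)],
          "nondiseased": [F(40, 100), F(50, 100), F(10, 100)]}
TEST_2 = {"diseased": [F(18, 100), F(10, 100), F(72, 100)],
          "nondiseased": [F(72, 100), F(10, 100), F(18, 100)]}


def sslrs(test):
    """Stratum-specific likelihood ratios: P(stratum | disease) / P(stratum | no disease)."""
    return [d / n for d, n in zip(test["diseased"], test["nondiseased"])]


def operating_points(test):
    """(sensitivity, specificity) for each cut between adjacent strata."""
    points = []
    for cut in range(1, len(test["diseased"])):
        sens = sum(test["diseased"][cut:])       # strata at or above the cut are positive
        spec = sum(test["nondiseased"][:cut])    # strata below the cut are negative
        points.append((sens, spec))
    return points


print("SSLR test 1:", [str(x) for x in sslrs(TEST_1)])
print("SSLR test 2:", [str(x) for x in sslrs(TEST_2)])
print("Operating points test 1:", [(float(s), float(c)) for s, c in operating_points(TEST_1)])
print("Operating points test 2:", [(float(s), float(c)) for s, c in operating_points(TEST_2)])
```

In this invented example the second test places far fewer results in the uninformative middle stratum (SSLR = 1), so it delivers more clinical information even though its likelihood ratios are the same.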

CHOICE CRITERIA

! Obtain maximum clinical information at the lowest cost
! Elements of decision
  - Value of clinical information (posterior probabilities that either appropriately or inappropriately are above or below the prior probability of disease)
  - Cost of the test itself (i.e., technician, reagents, etc.)
  - Cost of delay in treatment due to the test
  - Risk of test
! Discuss the first element, value of clinical information, because it varies within a single test based on the test characteristics one selects
  - The remaining three elements are test specific, but are constant for different test characteristics

CHOICE FOR WHOM?

! Is the choice being made for an individual patient?
  - In the 2x2 approach, one is choosing between single operating points from two tests for "like" patients (where "like" means equivalent OOS)
! Is the choice being made for a formulary (i.e., which one of several tests do we want to make available to clinicians for testing for a given disease)?
  - In the 2x2 approach, one is choosing between multiple pairs of operating points, one from each of two tests, where each pair represents a set of "like" patients
    * e.g., 25% of patients may require an OOS of 0.25; 50% may require an OOS of 1; and 25% may require an OOS of 4

TWO CHOICE METHODS

! Compare tests' ROC area
! Compare tests' intercepts of tangent OOS

METHOD 1. ROC AREA

! Commonly reported summary measure of a test's discriminating ability (e.g., c-statistic in SAS; lroc after "logit" in Stata)
  - Discrimination: the ability to give different scores to those with and without disease
    * e.g., to assign generally lower scores to those without disease and to assign generally higher scores to those with the disease
! Technically, the area equals the probability that the test will correctly rank any randomly selected pair of persons, one of whom has the outcome of interest and the other of whom does not (equivalent to the Wilcoxon rank-sum/Mann-Whitney statistic)

ROC AREA (II)

! Areas under the ROC curve range between 0.5 and 1.0
  - 0.5 = no ability to discriminate risk: scores are distributed similarly across those who do and do not have the outcome
  - 1.0 = perfect discrimination: scores of all persons who have the outcome are higher (lower) than scores of any person who does not have the outcome

METHOD 2. INTERCEPT OF TANGENT OOS AND ROC CURVE

! Costs of mistakes for any operating point are defined by the intercept of the tangent OOS and ROC curve and equal: $p \, C_{FN} \, (1 - \text{Intercept})$, where $p$ is the prior probability of disease
! Comparison of intercepts of the tangent OOS for the two tests provides a measure of the difference in the costs of mistakes made by use of the two tests
  - Costs of mistakes can be combined with the cost of tests themselves, the cost of delay in treatment due to the tests, and the risk from tests to identify the appropriate test

2x2 CHOICE FOR THE INDIVIDUAL. ROC AREA

! Common method proposed in literature for comparing tests (i.e., select the test with greatest area)
! Inappropriate for choices among tests for the individual, given that these choices focus on the clinical information contained in the optimal 2x2 table, not in the set of possible 2x2 tables
! Difference in area not a measure of costs of mistakes, and cannot be used to generate such a measure
! Little systematic information is available about the benefit of small increases in the area under the ROC curve (e.g., an increase from 0.75 to 0.77). However, prediction rules with larger areas in general are more discriminating than are rules with smaller areas
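The pair-ranking interpretation of the ROC area can be checked directly with a short sketch. It assumes that higher scores indicate disease and counts tied pairs as half, as in the Mann-Whitney statistic; the scores and the function name `roc_area` are hypothetical.

```python
def roc_area(diseased_scores, nondiseased_scores):
    """Probability that a random diseased person outscores a random nondiseased
    person, counting ties as one half (assumes higher score = more disease-like)."""
    wins = 0.0
    for d in diseased_scores:
        for n in nondiseased_scores:
            if d > n:
                wins += 1.0
            elif d == n:
                wins += 0.5
    return wins / (len(diseased_scores) * len(nondiseased_scores))


# Hypothetical scores, for illustration only.
print(roc_area([4, 5], [1, 2]))        # complete separation of the two groups
print(roc_area([1, 2], [1, 2]))        # identical score distributions
print(roc_area([3, 4, 5], [1, 2, 3]))  # partial overlap
```

The first two calls correspond to the two ends of the range described in the slides: perfect discrimination and no ability to discriminate.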

DIFFICULTIES INTERPRETING ROC AREAS

! Case 1: The curves cross

[Figure: two crossing ROC curves (sensitivity vs. specificity); P = ???]

  - If the curves cross, each test likely has some operating points appropriate for certain individuals, whether the ROC area of one test is the same as, less than, or greater than the area of the other

DIFFICULTIES INTERPRETING ROC AREAS (II)

! Case 2: One curve is always to the left and above the other, and the difference in areas either is or is not statistically significant

[Figure: two non-crossing ROC curves (sensitivity vs. specificity); P = ???]

  - If significantly different, doesn't rule out insignificant differences for some operating points
  - If not significantly different, doesn't rule out significant differences for some operating points

ROC AREA (II)

! Suppose we wish to choose between WBC and IL-6 for a patient for whom we believe the prior for bacteremia = 0.2 and the ratio of C_FP to C_FN = 0.25

[Figure: ROC curves for WBC and IL-6 (sensitivity vs. specificity), annotated with the ROC areas of the two tests]

! What information does the following Stata output provide for choosing between WBC and IL-6?

[Stata ROC area comparison output for WBC and IL-6 (columns: Obs, Area, Std. Err., asymptotic normal 95% Conf. Interval); Ho: area(0) = area(1), chi2(1) = 0.01]

2x2 CHOICE FOR THE INDIVIDUAL. COMPARISON OF INTERCEPTS (I)

! Suppose we wish to choose between WBC and IL-6 for a patient for whom we believe the prior = 0.2 and the ratio of C_FP to C_FN = 0.25 (i.e., OOS = 1)
! One obtains the following 2 tangencies

[Figure: ROC curves for WBC and IL-6 (sensitivity vs. specificity) with the tangent OOS lines, annotated with the ROC areas of the two tests]

! Difference in clinical information indicated by arrows on Y axis

ASSESSING THE MAGNITUDE OF THE DIFFERENCE BETWEEN 2 INTERCEPTS: EQUATIONS

! We can use the formula for the tangent lines to compare the difference in the expected costs of mistakes yielded by two tests (e.g., one with a higher sensitivity [H] and one with a lower sensitivity [L])
! The formula for the tangent lines is given by:

$$\text{Sens}_i = \text{OOS}\,(1 - \text{Spec}_i) + \text{Int}_i$$

Rearranging:

$$\text{Int}_i = \text{Sens}_i - \text{OOS}\,(1 - \text{Spec}_i)$$

Thus:

$$\text{Int}_H = \text{Sens}_H - \text{OOS}\,(1 - \text{Spec}_H)$$

$$\text{Int}_L = \text{Sens}_L - \text{OOS}\,(1 - \text{Spec}_L)$$

$$\text{Int}_H - \text{Int}_L = (\text{Sens}_H - \text{Sens}_L) + \text{OOS}\,(\text{Spec}_H - \text{Spec}_L)$$

ASSESSING THE MAGNITUDE OF THE DIFFERENCE BETWEEN 2 INTERCEPTS: EXAMPLE

! Test characteristics at tangency

[Table: sensitivity and specificity at the tangency for IL-6 and WBC]

! With OOS = 1:

$$\text{Int}_H - \text{Int}_L = (\text{Sens}_H - \text{Sens}_L) + 1 \times (\text{Spec}_H - \text{Spec}_L)$$

! Given that at the intercept, mistakes are only made among people with disease, the value of this difference is:

$$(\text{Int}_H - \text{Int}_L) \times 0.2 \times C_{FN}$$

  i.e., for this patient, IL-6 reduces the costs of mistakes by this multiple of C_FN
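The intercept equations can be worked through numerically as follows. The sensitivities and specificities below are hypothetical stand-ins, since the slide's table values are not available here; only the prior of 0.2 and the cost ratio of 0.25 come from the example. The sketch also checks that p x C_FN x (1 - Intercept) equals the expected cost of false negatives plus false positives at that operating point.

```python
def intercept(sens, spec, oos):
    """Y-axis intercept of the line with slope OOS through (1 - spec, sens)."""
    return sens - oos * (1 - spec)


def expected_cost_of_mistakes(sens, spec, prior, cost_fp, cost_fn):
    """Expected cost of false negatives plus false positives per patient tested."""
    return prior * (1 - sens) * cost_fn + (1 - prior) * (1 - spec) * cost_fp


PRIOR, COST_FP, COST_FN = 0.2, 0.25, 1.0          # prior and cost ratio from the example
OOS = (1 - PRIOR) / PRIOR * (COST_FP / COST_FN)   # assumed OOS definition; equals 1 here

# Hypothetical operating points at the tangency (not the slide's values).
SENS_H, SPEC_H = 0.80, 0.70   # test with the higher sensitivity
SENS_L, SPEC_L = 0.60, 0.85   # test with the lower sensitivity

int_h = intercept(SENS_H, SPEC_H, OOS)
int_l = intercept(SENS_L, SPEC_L, OOS)
difference = int_h - int_l

# Value of the difference in units of C_FN: p * C_FN * (Int_H - Int_L).
cost_saving = PRIOR * COST_FN * difference

print(f"Int_H = {int_h:.3f}, Int_L = {int_l:.3f}, difference = {difference:.3f}")
print(f"Reduction in expected cost of mistakes = {cost_saving:.4f} x C_FN")
```

A positive difference favors the test labelled H for this patient; a negative difference favors the test labelled L.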

ASSESSING THE SIGNIFICANCE OF THE DIFFERENCE BETWEEN 2 INTERCEPTS

! Can test the statistical significance of this difference by bootstrapping the intercepts of the tangent lines (programs available at ...)

[Table: bootstrapped intercept and SE for IL-6 and WBC, and their difference; p-value for the difference = 0.70]

! Difference not significant, in part because of small numbers of patients with bacteremia in the WBC (N = 26) and IL-6 (N = 22) studies

2x2 CHOICE FOR THE FORMULARY. ROC AREA

! Information of interest: Costs of mistakes arising from multiple pairs of operating points, one from each of two tests
! ROC area potentially more appropriate for choices among tests for the formulary (i.e., across all possible operating points), but:
  - Would need to use all the potential test operating points equifrequently (unlikely)
  - Doesn't provide a quantitative measure of costs of mistakes

CHOICE FOR THE FORMULARY. COMPARISON OF INTERCEPTS (I)

! Suppose one used the two tests to diagnose 2 types of patients, one type with an OOS of 0.5, the other with an OOS of 2.0?

[Figure: ROC curves for the two tests (sensitivity vs. specificity); P = ???]

SUMMARY, CHOICE FOR THE INDIVIDUAL

! The comparison of interest is between single operating points on two ROC curves
! Differences in areas refer to the set of possible 2x2 tables, not to the optimal 2x2 table, and provide no quantitative measure of the difference in cost of mistakes between the two tests
! Comparisons of intercepts of OOS evaluate the two single operating points and provide a quantitative measure (and statistical test) of the difference in cost of mistakes
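The bootstrap comparison of intercepts mentioned above might be sketched as follows. This is not the authors' program: it assumes an empirical (step) ROC curve, takes the tangent intercept as the largest value of Sens - OOS x (1 - Spec) over all cutoffs, resamples each study independently (reasonable when, as in the slides, the two tests come from separate studies), and reports a percentile interval rather than a p-value. All scores and function names are hypothetical.

```python
import random


def tangent_intercept(diseased, nondiseased, oos):
    """Largest intercept Sens - OOS * (1 - Spec) over all empirical cutoffs
    (a result is positive when the score is >= the cutoff)."""
    cutoffs = sorted(set(diseased) | set(nondiseased)) + [float("inf")]
    best = float("-inf")
    for c in cutoffs:
        sens = sum(x >= c for x in diseased) / len(diseased)
        fpr = sum(x >= c for x in nondiseased) / len(nondiseased)
        best = max(best, sens - oos * fpr)
    return best


def bootstrap_intercept_difference(test_h, test_l, oos, reps=2000, seed=0):
    """Percentile 95% interval for Int_H - Int_L; each test is a
    (diseased_scores, nondiseased_scores) pair resampled with replacement."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        ints = []
        for diseased, nondiseased in (test_h, test_l):
            d = rng.choices(diseased, k=len(diseased))
            n = rng.choices(nondiseased, k=len(nondiseased))
            ints.append(tangent_intercept(d, n, oos))
        diffs.append(ints[0] - ints[1])
    diffs.sort()
    return diffs[int(0.025 * reps)], diffs[int(0.975 * reps) - 1]


# Hypothetical scores for two small studies (not the WBC or IL-6 data).
TEST_H = ([5, 6, 7, 8, 9, 9, 10], [1, 2, 3, 4, 5, 6, 6, 7])
TEST_L = ([4, 5, 6, 6, 8, 9], [2, 3, 4, 5, 6, 7, 8])

low, high = bootstrap_intercept_difference(TEST_H, TEST_L, oos=1.0)
print(f"95% percentile interval for Int_H - Int_L: ({low:.3f}, {high:.3f})")
```

An interval that includes zero is consistent with a non-significant difference, which is what one would expect with studies as small as those described in the slides.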

CHOICE FOR THE FORMULARY. COMPARISON OF INTERCEPTS (II)

! The weighted average of differences in the intercepts for each OOS (where weights are determined by expected frequency of use of each OOS) represents the quantitative measure of the difference in two tests' expected costs of mistakes
! Thus, comparison of the intercepts for each optimal operating slope allows the determination of the better test as well as the quantification of the costs of mistakes for each test

SUMMARY, CHOICE FOR THE FORMULARY

! The comparison of interest is between the relevant (potentially all) pairs of operating points on two ROC curves
! In a limited set of circumstances, comparison of areas under the ROC curves of two tests may identify the better test, but does not allow quantification of the difference in the tests' cost of mistakes
! A weighted average of comparisons of intercepts for pairs of operating points provides a quantitative measure (and statistical tests) of the difference in the costs of mistakes

COMPARISON OF TESTS SUMMARIZED WITH LIKELIHOOD RATIOS (I)

! Assume a simple definition of costs (as we did above)
  - The cost of mistakes is the same for any posterior probability that for nondiseased individuals is above the underlying treatment threshold
    * i.e., mistakes due to posteriors marginally above the threshold have costs similar to mistakes due to posteriors substantially above the threshold
  - The cost of mistakes is the same for any posterior probability that for diseased individuals is below the underlying treatment threshold
    * i.e., mistakes due to posteriors marginally below the threshold have costs similar to mistakes due to posteriors substantially below the threshold
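The weighted average of intercept differences used for the formulary choice can be sketched briefly. The weights reuse the patient mix given earlier in this module as an example (25% of patients at an OOS of 0.25, 50% at 1, 25% at 4); the intercept differences are hypothetical, and the function name is an assumption of this sketch.

```python
def weighted_intercept_difference(differences_by_oos, weights_by_oos):
    """Average of Int_H - Int_L across OOS values, weighted by how often
    each OOS is expected to be used."""
    total_weight = sum(weights_by_oos.values())
    weighted_sum = sum(weight * differences_by_oos[oos]
                       for oos, weight in weights_by_oos.items())
    return weighted_sum / total_weight


# Expected frequency of use of each OOS (example mix from this module).
WEIGHTS = {0.25: 0.25, 1.0: 0.50, 4.0: 0.25}

# Hypothetical Int_H - Int_L at each OOS (negative means test L is better there).
DIFFERENCES = {0.25: 0.02, 1.0: 0.05, 4.0: -0.01}

average = weighted_intercept_difference(DIFFERENCES, WEIGHTS)
print(f"Weighted average difference in intercepts: {average:.4f}")
```

A test can lose at some operating slopes and still be the better formulary choice if it wins at the slopes that are used most often, which is why the weights matter.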

COMPARISON OF TESTS SUMMARIZED WITH LIKELIHOOD RATIOS (II)

! Given the proof that the 2x2 table identified by use of the tangency of the OOS and the ROC curve classifies any stratum with a stratum-specific likelihood ratio that yields a posterior probability above the underlying treatment threshold as a positive test and classifies any stratum with a stratum-specific likelihood ratio that yields a posterior probability below the underlying treatment threshold as a negative test:
  - The cost of mistakes in the stratum-specific approach (as defined above) is identical to the cost of mistakes in the approach comparing intercepts of tangent lines
  - The comparison of tests for either the individual or for the formulary based on the intercepts of optimal operating slopes tangent to ROC curves can be used when the test is being characterized as a series of stratum-specific likelihood ratios

COMPARISON OF TESTS SUMMARIZED WITH LIKELIHOOD RATIOS (III)

! Suppose we identified a more complex definition of the cost of mistakes (which quantifies the costs of absolute increases in the posterior probability of disease for people without disease and the costs of absolute decreases in the posterior probability of disease for people with disease)
! In this case, the total costs of mistakes for a test are defined as follows:

$$\left[\sum_i p \, P_{SS_i \mid Dis} \left(p - \frac{p \, SSLR_i}{p \, SSLR_i + (1 - p)}\right) C_{FN,\Delta p}\right] + \left[\sum_j (1 - p) \, P_{SS_j \mid NoDis} \left(\frac{p \, SSLR_j}{p \, SSLR_j + (1 - p)} - p\right) C_{FP,\Delta p}\right]$$

where:
  - $i$ = strata with SSLR < 1
  - $p$ = prior probability of disease
  - $P_{SS_i \mid Dis}$ = the probability that individuals with disease will have a test result falling within stratum $i$
  - $SSLR_i$ = the likelihood ratio for test results in stratum $i$
  - $C_{FN,\Delta p}$ = the cost of false negative results given the reduction in the probability of disease due to an LR < 1
  - $j$ = strata with SSLR > 1
  - $P_{SS_j \mid NoDis}$ = the probability that individuals without disease will have a test result falling within stratum $j$
  - $SSLR_j$ = the likelihood ratio for test results in stratum $j$
  - $C_{FP,\Delta p}$ = the cost of false positive results given the increase in the probability of disease due to an LR > 1
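The total-cost formula can be evaluated with a short sketch. It assumes that the two cost terms are expressed per unit of absolute change in probability, so each stratum contributes (probability of landing in the stratum) x (change in p) x (cost); the stratum probabilities and costs are hypothetical, and the function names are assumptions of this sketch.

```python
def posterior(prior, lr):
    """Posterior probability of disease after a result with likelihood ratio lr."""
    return prior * lr / (prior * lr + (1 - prior))


def total_cost_of_mistakes(prior, strata, cost_fn, cost_fp):
    """Sum the two bracketed terms of the formula.

    strata: list of (P(stratum | disease), P(stratum | no disease)) pairs.
    cost_fn, cost_fp: assumed costs per unit of absolute change in probability.
    """
    total = 0.0
    for p_dis, p_nodis in strata:
        sslr = p_dis / p_nodis
        post = posterior(prior, sslr)
        if sslr < 1:
            # Diseased people whose probability of disease is wrongly lowered.
            total += prior * p_dis * (prior - post) * cost_fn
        elif sslr > 1:
            # Nondiseased people whose probability of disease is wrongly raised.
            total += (1 - prior) * p_nodis * (post - prior) * cost_fp
        # Strata with SSLR == 1 leave the probability unchanged and add nothing.
    return total


# Hypothetical three-stratum test with SSLR of 0.25, 1, and 4.
STRATA = [(0.1, 0.4), (0.5, 0.5), (0.4, 0.1)]

cost = total_cost_of_mistakes(prior=0.2, strata=STRATA, cost_fn=1.0, cost_fp=0.25)
print(f"Total cost of mistakes: {cost:.5f}")
```

Because only misleading shifts are counted here, this quantity captures costs alone; the benefits of correct shifts in the posterior would have to be added separately.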

UNDERSTANDING THE FORMULA

! First term (people with disease): Prob dis x Prob lower posterior x (Change in p) x Cost

$$\left[\sum_i p \, P_{SS_i \mid Dis} \left(p - \frac{p \, SSLR_i}{p \, SSLR_i + (1 - p)}\right) C_{FN,\Delta p}\right]$$

PLUS

! Second term (people without disease): Prob no dis x Prob higher posterior x (Change in p) x Cost

$$\left[\sum_j (1 - p) \, P_{SS_j \mid NoDis} \left(\frac{p \, SSLR_j}{p \, SSLR_j + (1 - p)} - p\right) C_{FP,\Delta p}\right]$$

COMPARISON OF TESTS SUMMARIZED WITH LIKELIHOOD RATIOS (IV)

! Finally, one can quantify both the costs of mistakes from absolute increases in the posterior probability of disease for people without disease and the costs of absolute decreases in the posterior probability of disease for people with disease (as above) as well as the benefits from correct increases and decreases in the posterior probability due to the test results

SUMMARY MODULE #6

! When selecting among tests, one should maximize clinical information while minimizing cost
! If one is choosing among tests for the individual, one is choosing between single operating points from two tests for "like" patients
! If one is choosing among tests for the formulary, one is choosing between multiple pairs of operating points, one from each of two tests, where each pair represents a set of "like" patients
! Comparison of intercepts of the tangent OOS for the two tests provides a measure of the difference in the costs of mistakes made by use of the two tests as well as a statistical test of this difference


More information

Zheng Yao Sr. Statistical Programmer

Zheng Yao Sr. Statistical Programmer ROC CURVE ANALYSIS USING SAS Zheng Yao Sr. Statistical Programmer Outline Background Examples: Accuracy assessment Compare ROC curves Cut-off point selection Summary 2 Outline Background Examples: Accuracy

More information

Evidence-Based Medicine: Diagnostic study

Evidence-Based Medicine: Diagnostic study Evidence-Based Medicine: Diagnostic study What is Evidence-Based Medicine (EBM)? Expertise in integrating 1. Best research evidence 2. Clinical Circumstance 3. Patient values in clinical decisions Haynes,

More information

Sensitivity, Specificity, and Relatives

Sensitivity, Specificity, and Relatives Sensitivity, Specificity, and Relatives Brani Vidakovic ISyE 6421/ BMED 6700 Vidakovic, B. Se Sp and Relatives January 17, 2017 1 / 26 Overview Today: Vidakovic, B. Se Sp and Relatives January 17, 2017

More information

Assessment of performance and decision curve analysis

Assessment of performance and decision curve analysis Assessment of performance and decision curve analysis Ewout Steyerberg, Andrew Vickers Dept of Public Health, Erasmus MC, Rotterdam, the Netherlands Dept of Epidemiology and Biostatistics, Memorial Sloan-Kettering

More information

11/18/2013. Correlational Research. Correlational Designs. Why Use a Correlational Design? CORRELATIONAL RESEARCH STUDIES

11/18/2013. Correlational Research. Correlational Designs. Why Use a Correlational Design? CORRELATIONAL RESEARCH STUDIES Correlational Research Correlational Designs Correlational research is used to describe the relationship between two or more naturally occurring variables. Is age related to political conservativism? Are

More information

Statistics and Probability

Statistics and Probability Statistics and a single count or measurement variable. S.ID.1: Represent data with plots on the real number line (dot plots, histograms, and box plots). S.ID.2: Use statistics appropriate to the shape

More information

Using principal stratification to address post-randomization events: A case study. Baldur Magnusson, Advanced Exploratory Analytics PSI Webinar

Using principal stratification to address post-randomization events: A case study. Baldur Magnusson, Advanced Exploratory Analytics PSI Webinar Using principal stratification to address post-randomization events: A case study Baldur Magnusson, Advanced Exploratory Analytics PSI Webinar November 2, 2017 Outline Context Principal stratification

More information

WDHS Curriculum Map Probability and Statistics. What is Statistics and how does it relate to you?

WDHS Curriculum Map Probability and Statistics. What is Statistics and how does it relate to you? WDHS Curriculum Map Probability and Statistics Time Interval/ Unit 1: Introduction to Statistics 1.1-1.3 2 weeks S-IC-1: Understand statistics as a process for making inferences about population parameters

More information

Method Comparison Report Semi-Annual 1/5/2018

Method Comparison Report Semi-Annual 1/5/2018 Method Comparison Report Semi-Annual 1/5/2018 Prepared for Carl Commissioner Regularatory Commission 123 Commission Drive Anytown, XX, 12345 Prepared by Dr. Mark Mainstay Clinical Laboratory Kennett Community

More information

Data and Statistics 101: Key Concepts in the Collection, Analysis, and Application of Child Welfare Data

Data and Statistics 101: Key Concepts in the Collection, Analysis, and Application of Child Welfare Data TECHNICAL REPORT Data and Statistics 101: Key Concepts in the Collection, Analysis, and Application of Child Welfare Data CONTENTS Executive Summary...1 Introduction...2 Overview of Data Analysis Concepts...2

More information

Sensitivity, specicity, ROC

Sensitivity, specicity, ROC Sensitivity, specicity, ROC Thomas Alexander Gerds Department of Biostatistics, University of Copenhagen 1 / 53 Epilog: disease prevalence The prevalence is the proportion of cases in the population today.

More information

SCATTER PLOTS AND TREND LINES

SCATTER PLOTS AND TREND LINES 1 SCATTER PLOTS AND TREND LINES LEARNING MAP INFORMATION STANDARDS 8.SP.1 Construct and interpret scatter s for measurement to investigate patterns of between two quantities. Describe patterns such as

More information

Bayesian Logistic Regression Modelling via Markov Chain Monte Carlo Algorithm

Bayesian Logistic Regression Modelling via Markov Chain Monte Carlo Algorithm Journal of Social and Development Sciences Vol. 4, No. 4, pp. 93-97, Apr 203 (ISSN 222-52) Bayesian Logistic Regression Modelling via Markov Chain Monte Carlo Algorithm Henry De-Graft Acquah University

More information

Introduction to Epidemiology Screening for diseases

Introduction to Epidemiology Screening for diseases Faculty of Medicine Introduction to Community Medicine Course (31505201) Unit 4 Epidemiology Introduction to Epidemiology Screening for diseases By Hatim Jaber MD MPH JBCM PhD 15 +17-11- 2016 1 Introduction

More information

! Parallels with clinical studies.! Two (of a number of) concerns about data from trials.! Concluding comments

! Parallels with clinical studies.! Two (of a number of) concerns about data from trials.! Concluding comments TRIAL-BASED ECONOMIC EVALUATIONS: CAN THEY STAND ALONE? Henry Glick Division of General Internal Medicine University of Pennsylvania www.uphs.upenn.edu/dgimhsr International Health Economics Association

More information

Statistical Methods and Reasoning for the Clinical Sciences

Statistical Methods and Reasoning for the Clinical Sciences Statistical Methods and Reasoning for the Clinical Sciences Evidence-Based Practice Eiki B. Satake, PhD Contents Preface Introduction to Evidence-Based Statistics: Philosophical Foundation and Preliminaries

More information

Hypothesis Testing. Richard S. Balkin, Ph.D., LPC-S, NCC

Hypothesis Testing. Richard S. Balkin, Ph.D., LPC-S, NCC Hypothesis Testing Richard S. Balkin, Ph.D., LPC-S, NCC Overview When we have questions about the effect of a treatment or intervention or wish to compare groups, we use hypothesis testing Parametric statistics

More information

Data that can be classified as belonging to a distinct number of categories >>result in categorical responses. And this includes:

Data that can be classified as belonging to a distinct number of categories >>result in categorical responses. And this includes: This sheets starts from slide #83 to the end ofslide #4. If u read this sheet you don`t have to return back to the slides at all, they are included here. Categorical Data (Qualitative data): Data that

More information

An analysis of the use of animals in predicting human toxicology and drug safety: a review

An analysis of the use of animals in predicting human toxicology and drug safety: a review An analysis of the use of animals in predicting human toxicology and drug safety: a review Dr Elisabeth Harley Understanding Animal Research Summary Two papers published in the journal Alternatives to

More information

MULTIPLE LINEAR REGRESSION 24.1 INTRODUCTION AND OBJECTIVES OBJECTIVES

MULTIPLE LINEAR REGRESSION 24.1 INTRODUCTION AND OBJECTIVES OBJECTIVES 24 MULTIPLE LINEAR REGRESSION 24.1 INTRODUCTION AND OBJECTIVES In the previous chapter, simple linear regression was used when you have one independent variable and one dependent variable. This chapter

More information

Psychology, 2010, 1: doi: /psych Published Online August 2010 (

Psychology, 2010, 1: doi: /psych Published Online August 2010 ( Psychology, 2010, 1: 194-198 doi:10.4236/psych.2010.13026 Published Online August 2010 (http://www.scirp.org/journal/psych) Using Generalizability Theory to Evaluate the Applicability of a Serial Bayes

More information

The Regression-Discontinuity Design

The Regression-Discontinuity Design Page 1 of 10 Home» Design» Quasi-Experimental Design» The Regression-Discontinuity Design The regression-discontinuity design. What a terrible name! In everyday language both parts of the term have connotations

More information

Mantel-Haenszel Procedures for Detecting Differential Item Functioning

Mantel-Haenszel Procedures for Detecting Differential Item Functioning A Comparison of Logistic Regression and Mantel-Haenszel Procedures for Detecting Differential Item Functioning H. Jane Rogers, Teachers College, Columbia University Hariharan Swaminathan, University of

More information

Net Reclassification Risk: a graph to clarify the potential prognostic utility of new markers

Net Reclassification Risk: a graph to clarify the potential prognostic utility of new markers Net Reclassification Risk: a graph to clarify the potential prognostic utility of new markers Ewout Steyerberg Professor of Medical Decision Making Dept of Public Health, Erasmus MC Birmingham July, 2013

More information

Differential Item Functioning

Differential Item Functioning Differential Item Functioning Lecture #11 ICPSR Item Response Theory Workshop Lecture #11: 1of 62 Lecture Overview Detection of Differential Item Functioning (DIF) Distinguish Bias from DIF Test vs. Item

More information

Week 17 and 21 Comparing two assays and Measurement of Uncertainty Explain tools used to compare the performance of two assays, including

Week 17 and 21 Comparing two assays and Measurement of Uncertainty Explain tools used to compare the performance of two assays, including Week 17 and 21 Comparing two assays and Measurement of Uncertainty 2.4.1.4. Explain tools used to compare the performance of two assays, including 2.4.1.4.1. Linear regression 2.4.1.4.2. Bland-Altman plots

More information

Chapter 1: Exploring Data

Chapter 1: Exploring Data Chapter 1: Exploring Data Key Vocabulary:! individual! variable! frequency table! relative frequency table! distribution! pie chart! bar graph! two-way table! marginal distributions! conditional distributions!

More information

Essential Skills for Evidence-based Practice: Statistics for Therapy Questions

Essential Skills for Evidence-based Practice: Statistics for Therapy Questions Essential Skills for Evidence-based Practice: Statistics for Therapy Questions Jeanne Grace Corresponding author: J. Grace E-mail: Jeanne_Grace@urmc.rochester.edu Jeanne Grace RN PhD Emeritus Clinical

More information

Hayden Smith, PhD, MPH /\ v._

Hayden Smith, PhD, MPH /\ v._ Hayden Smith, PhD, MPH.. + /\ v._ Information and clinical examples provided in presentation are strictly for educational purposes, and should not be substituted for clinical guidelines or up-to-date medical

More information

Chapter 10. Screening for Disease

Chapter 10. Screening for Disease Chapter 10 Screening for Disease 1 Terminology Reliability agreement of ratings/diagnoses, reproducibility Inter-rater reliability agreement between two independent raters Intra-rater reliability agreement

More information

Sampling Uncertainty / Sample Size for Cost-Effectiveness Analysis

Sampling Uncertainty / Sample Size for Cost-Effectiveness Analysis Sampling Uncertainty / Sample Size for Cost-Effectiveness Analysis Cost-Effectiveness Evaluation in Addiction Treatment Clinical Trials Henry Glick University of Pennsylvania www.uphs.upenn.edu/dgimhsr

More information

INTRODUCTION TO MACHINE LEARNING. Decision tree learning

INTRODUCTION TO MACHINE LEARNING. Decision tree learning INTRODUCTION TO MACHINE LEARNING Decision tree learning Task of classification Automatically assign class to observations with features Observation: vector of features, with a class Automatically assign

More information

Example - Birdkeeping and Lung Cancer - Interpretation. Lecture 20 - Sensitivity, Specificity, and Decisions. What do the numbers not mean...

Example - Birdkeeping and Lung Cancer - Interpretation. Lecture 20 - Sensitivity, Specificity, and Decisions. What do the numbers not mean... Odds Ratios Example - Birdkeeping and Lung Cancer - Interpretation Lecture 20 - Sensitivity, Specificity, and Decisions Sta102 / BME102 Colin Rundel April 16, 2014 Estimate Std. Error z value Pr(> z )

More information

BOOTSTRAPPING CONFIDENCE LEVELS FOR HYPOTHESES ABOUT REGRESSION MODELS

BOOTSTRAPPING CONFIDENCE LEVELS FOR HYPOTHESES ABOUT REGRESSION MODELS BOOTSTRAPPING CONFIDENCE LEVELS FOR HYPOTHESES ABOUT REGRESSION MODELS 17 December 2009 Michael Wood University of Portsmouth Business School SBS Department, Richmond Building Portland Street, Portsmouth

More information

Clinical Decision Analysis

Clinical Decision Analysis Clinical Decision Analysis Terminology Sensitivity (Hit True Positive) Specificity (Correct rejection True Negative) Positive predictive value Negative predictive value The fraction of those with the disease

More information

Binary Diagnostic Tests Two Independent Samples

Binary Diagnostic Tests Two Independent Samples Chapter 537 Binary Diagnostic Tests Two Independent Samples Introduction An important task in diagnostic medicine is to measure the accuracy of two diagnostic tests. This can be done by comparing summary

More information

UNIT 5 - Association Causation, Effect Modification and Validity

UNIT 5 - Association Causation, Effect Modification and Validity 5 UNIT 5 - Association Causation, Effect Modification and Validity Introduction In Unit 1 we introduced the concept of causality in epidemiology and presented different ways in which causes can be understood

More information

Chapter 1: Introduction to Statistics

Chapter 1: Introduction to Statistics Chapter 1: Introduction to Statistics Variables A variable is a characteristic or condition that can change or take on different values. Most research begins with a general question about the relationship

More information

Empirical Knowledge: based on observations. Answer questions why, whom, how, and when.

Empirical Knowledge: based on observations. Answer questions why, whom, how, and when. INTRO TO RESEARCH METHODS: Empirical Knowledge: based on observations. Answer questions why, whom, how, and when. Experimental research: treatments are given for the purpose of research. Experimental group

More information

Placebo and Belief Effects: Optimal Design for Randomized Trials

Placebo and Belief Effects: Optimal Design for Randomized Trials Placebo and Belief Effects: Optimal Design for Randomized Trials Scott Ogawa & Ken Onishi 2 Department of Economics Northwestern University Abstract The mere possibility of receiving a placebo during a

More information

Empirical assessment of univariate and bivariate meta-analyses for comparing the accuracy of diagnostic tests

Empirical assessment of univariate and bivariate meta-analyses for comparing the accuracy of diagnostic tests Empirical assessment of univariate and bivariate meta-analyses for comparing the accuracy of diagnostic tests Yemisi Takwoingi, Richard Riley and Jon Deeks Outline Rationale Methods Findings Summary Motivating

More information

Lecture 4: Research Approaches

Lecture 4: Research Approaches Lecture 4: Research Approaches Lecture Objectives Theories in research Research design approaches ú Experimental vs. non-experimental ú Cross-sectional and longitudinal ú Descriptive approaches How to

More information

12/30/2017. PSY 5102: Advanced Statistics for Psychological and Behavioral Research 2

12/30/2017. PSY 5102: Advanced Statistics for Psychological and Behavioral Research 2 PSY 5102: Advanced Statistics for Psychological and Behavioral Research 2 Selecting a statistical test Relationships among major statistical methods General Linear Model and multiple regression Special

More information

Hierarchical Bayesian Modeling of Individual Differences in Texture Discrimination

Hierarchical Bayesian Modeling of Individual Differences in Texture Discrimination Hierarchical Bayesian Modeling of Individual Differences in Texture Discrimination Timothy N. Rubin (trubin@uci.edu) Michael D. Lee (mdlee@uci.edu) Charles F. Chubb (cchubb@uci.edu) Department of Cognitive

More information

Relationships. Between Measurements Variables. Chapter 10. Copyright 2005 Brooks/Cole, a division of Thomson Learning, Inc.

Relationships. Between Measurements Variables. Chapter 10. Copyright 2005 Brooks/Cole, a division of Thomson Learning, Inc. Relationships Chapter 10 Between Measurements Variables Copyright 2005 Brooks/Cole, a division of Thomson Learning, Inc. Thought topics Price of diamonds against weight Male vs female age for dating Animals

More information

Review: Conditional Probability. Using tests to improve decisions: Cutting scores & base rates

Review: Conditional Probability. Using tests to improve decisions: Cutting scores & base rates Review: Conditional Probability Using tests to improve decisions: & base rates Conditional probabilities arise when the probability of one thing [A] depends on the probability of something else [B] In

More information

Week 2 Video 3. Diagnostic Metrics

Week 2 Video 3. Diagnostic Metrics Week 2 Video 3 Diagnostic Metrics Different Methods, Different Measures Today we ll continue our focus on classifiers Later this week we ll discuss regressors And other methods will get worked in later

More information