Introduction to ROC analysis

Size: px
Start display at page:

Download "Introduction to ROC analysis"

Transcription

1 Introduction to ROC analysis Andriy I. Bandos Department of Biostatistics University of Pittsburgh Acknowledgements Many thanks to Sam Wieand, Nancy Obuchowski, Brenda Kurland, and Todd Alonzo for previous version of this lecture

2 Outline 1. Basics of diagnostic accuracy evaluation 2. Why do we need ROC analysis? 3. How to construct and use the ROC curve * Focus on structure and interpretation of ROC tools, aside from statistical analysis

3 Basic (Classic) Set-up A number of subjects with and without the condition of interest (e.g., patients with/without distant metastases) For every subject, presence/absence of the condition of interest, or true status ( normal / abnormal ) is know from Gold (Reference) Standard diagnostic test result is obtained for every subject (e.g., radiologist s scores, 1-6, based on PET-CT interpretation) Goal of evaluation of diagnostic accuracy = assess the agreement between test result and true status Possible positive outcomes agreement is high overall = test can be used to diagnose presence or absence of the condition of interest agreement is high only for some results = test can be used to diagnose either presence or absence of the condition of interest

4 Illustrative Example FDG PET-CT identification of distant metastatic disease in uterine cervical and endometrial cancers: Analysis from ACRIN 6671/GOG0233. Gee et al., RSNA 2016 Goal: evaluate accuracy of staging PET-CT for detecting distant metastases (for the target population) Condition of Interest: presence/absence of distant metastatic disease Reference Standard: pathology and follow-up radiology reports Test result: radiologist interpretation of PET-CT for presence of distant metastases (scores on 1-6 scale with >3 being positive )

5 Simplest scenario: Binary test result (+/-) TRUE STATUS TEST RESULT Negative (-) Positive (+) Normal # True Negatives= 353 #False Positives=5 # Normal =358 Abnormal #False Negatives= #(T=1,D=0)=17 #True Positives=#(T=1,D=1)=31 # Abnormal =48 Most diagnostic tests are imperfect: errors are possible False Positive error: positive result for a normal subject False Negative error: negative result for an abnormal subject Two types of error have fundamentally different consequences (e.g., falsely suspecting versus missing the presence of distant metastases) must be kept separate #Negatives=370 #Positives=36 Total=406 Intuitively we would like to see more correct classifications, or, equivalently, fewer errors But how few is few enough?

6 Characterizing Diagnostic Accuracy TEST RESULT TRUE STATUS Negative (-) Positive (+) Normal # True Negatives= 353 #False Positives=5 # Normal =358 Abnormal #False Negatives=17 #True Positives=31 # Abnormal =48 How frequent are the correct classifications? 31 True Positives 31 out of Probability of True Positives 31 out of Sensitivity, Se, (or True Positive Fraction, TPF) 31 out of Positive Predictive Value (PPV) 353 True Negatives #Negatives=370 #Positives=36 Total= out of Probability of True Negatives 353 out of Specificity (True Negative Fraction) 353 out of Negative Predictive Value All measures can be useful under specific scenarios Se and Sp are usually used because they are typically robust (e.g., do not depend on prevalence) have fixed benchmarks of what is large (1) and what is small (0)

7 Graphical representation: ROC space ROC coordinates: Se (or TPF) as a vertical and 1-Sp (or FPF) as a horizontal axes Characteristics of benchmark tests: perfect (no errors): Sp=1, Se=1 most liberal (all labeled positive ): Sp=0, Se=1 most strict (all labeled negative ): guess (flip of a coin): Sp=1, Se=0 1-Sp=Se There is always a test with better Se or better Sp MUST consider both Se and Sp Simultaneous Interpretation of values of Se and Sp Bad comparable to performance of a guess (1-Sp Se, or close to the diagonal) Good close to the perfect (Se 1, Sp 1; or 1-Sp 0) A test with worse Se and Sp is objectively worse

8 More on Comparison of Diagnostic Tests Higher Se and Sp better test higher PPV and NPV in the same population (or better LR + and LR - ) better test higher PPV and lower NPV? E.g., PET-CT for distant metastases Central interpretations: Se=65%, Sp=99% PPV=86%, NPV=95% Institutional interpretations: Se=67%, Sp=94% PPV=62%, NPV=95% Often, this problem can be objectively solved by constructing ROC curves MarginProbe 2.0 PMA, FDA materials

9 ROC curve Many binary tests are actually dichotomized versions of continuous test (e.g., positive for distant metastases = (score>3) ) If we change diagnostic threshold /tune-up, both Se and Sp will change, e.g., TEST TRUTH (T>3) - + Normal Abnormal Se 0.65, Sp 0.99 (1-Sp 0.014) TEST TRUTH (T>2) - + Normal Abnormal Se 0.71, Sp=0.96 (1-Sp 0.039) ROC curve summarizes Se and 1-Sp for all thresholds ROC characterizes overall diagnostic accuracy applicable to a medical test as well as an artificial predictive tool (e.g., statistical model for binary truth )

10 ROC curve construction (make-up example) O X O O X X O X O X O X X O X O X O X O threshold c TRUTH TEST - + Normal Abnormal Se(c)=0.4 1-Sp(c)=0.1 O X O O X X O X O X O X X O X O X O X O TRUTH TEST - + Normal Abnormal Se(c)=0.7 1-Sp(c)=0.3 O X O O X X O X O X O X X O X O X O X O TRUTH TEST - + Normal Abnormal Se(c)=0.9 1-Sp(c)=0.6

11 Use of ROC : threshold bias removal ROC describes all Se-Sp values which we can obtain by changing the diagnostic One of classic uses of the ROC curve: Can a new test achieve higher Se and Sp than for the existing (conventional) test? More informative comparison is usually based on comparison of the two ROC curves (more on that later) Gee, et al., RSNA, 2016

12 Use of ROC: comparison of tests Sometimes one diagnostic test is clearly better than the other Zuley ML, et al, Radiology 2013 Gee M, et al., RSNA 2016 AUCs: 0.84 vs 0.89 Sometimes, the improvement is not uniform, i.e., ROC curve is higher in some ranges and lower in others The practical/clinical purpose or objectives of the study should then drive the decision as to which test is better for the considered application

13 Use of ROC: evaluating a diagnostic test Benchmarks ROC curves Perfect: two segments connecting at (Sp=1, Sp=1) Guessing: diagonal (random choice) 1-Sp(c)=Se(c) Tests with imperfect ROC curves are still useful Tests with high ROC curves often lead to operating points with good Se and Sp (i.e., useful for both types of decisions) Tests with relatively low ROC curves could be useful for targeted decisions, e.g. for identifying a subset of diseased : Se>>0, Sp 1 for identifying a subset of disease-free : Se 1, Sp>>0 It is not always straightforward to quantify how good the ROC curve is Multiple summary indices are available (will discuss later)

14 Most typical ROC summary index Area Under the ROC curve (AUC) Single-value summary index of the entire ROC curve perfect ROC AUC=1; guessing ROC AUC=0.5 Quantifies the difference between distributions of test results for diseased and non-diseased (~ well-known Wilcoxon statistical test for comparing two groups) Interpretation: randomly selected diseased subject tests more suspicious than randomly selected non-diseased Technical advantages: well-known, objective, easy to use Limitations not very clinically relevant can be misleading (as any scalar index for the entire curve), e.g., non-guessing ROC with AUC=0.5 summarizes over the operating points outside of practical interest (e.g. Sp< 0.5)

15 Summary Indices for the ROC curve Area under the ROC curve (AUC) objective, i.e. does not require subjective specifications summarizes over the operating points outside of practical interest (e.g. Sp< 0.5) one of more precise summary indices Partial AUC (pauc) for s 1 <Sp<s 2 perfect pauc= s 1 -s 2 ; guess pauc= s 1 -s 2 (s 1 +s 2 )/2 focuses on the operating points in the ranges of interest uses more than one point from the curve (typically is less precise than AUC) Sensitivity corresponding to a given specificity perfect Se (Sp=s)=1; guess Se (Sp=s)=1-s too subjective due to required knowledge of needed Sp relatively imprecise (typically requires larger sample sizes)

16 Problems with scalar ROC indices Scalar indices lose some information stored in the ROC curve different indices could contradict to each other, e.g.: ROC curves with the same AUCs can be different at almost all points ROC curve with higher overall AUC can be lower in the range of interest (e.g. Sp>0.9) Thus, it is very important to look at the ROC curve in addition to computing summary indices

17 ROCs and binary tests When binary test is clearly better than the other Does it mean that it will also have a better ROC? Yes, at least between two known test point (because ROC curve is always non-decreasing) but not necessarily for all thresholds When a binary test has better PPV and NPV (in the same sample) Does it also have better ROC curve in a range two test points? Yes, if a test can be assumed reasonable (or locally better than chance ), e.g., probability of malignancy score of a trained radiologist) but this is not necessarily true for non-optimized tests e.g. raw biomarker measured for tissue A test with lower Sensitivity and Specificity

18 Limitations Typical limitations An entire curve can be difficult to interpret A single-number (scalar) indices of the ROC curve can be misleading ROC curves for human observers might be difficult to interpret (it is good to support ROC-based findings by the accuracy of actual decisions) Do you always need to use the ROC curve? No, e.g., in some cases, consideration of a single point (Sp, Se) can provide sufficient information Gee, et al., RSNA, 2016 Does ROC analysis always provide a definite answer? No, e.g., the ROC curves that cross in the region of interest

19 Overall recommendations If a reliable estimate of the ROC curve is available use ROC curve to visually evaluate or compare diagnostic systems use appropriate summary indices to quantify the results If reliable estimate of the ROC curve is not available, but the binary test results ( positive / negative ) are known: Try using a pair of intrinsic characteristics (Se, Sp, or functions thereof) to infer on diagnostic accuracy If only prevalence-dependent characteristics are available (e.g, PV s) remember the dependence on the prevalence in the sample When using a scalar summary index remember that no scalar summary index is better than others under all circumstances; study objectives should drive the interpret numeric value based on the values attained by performance of the benchmarks tests (perfect, guessing) under the same conditions

20 ROC analysis Many, more complicated, ROC techniques exist, e.g., Multiple estimation approaches: parametric (e.g., binormal ROC), non-parametric (empirical), semi-parametric Incorporation of information from other covariates: modeling ROC curve or its indices (e.g., ROC-GLM) Application to time-to-event data time-dependent ROC Extensions to more than two classes: multi-class ROC analysis Extensions to multiple targets per subject: free-response ROC (FROC), regions of interest approach (ROI).. Most standard software packages offer some basic types of ROC analysis (SPSS, SAS, Stata, R,.) Use of many advanced ROC techniques require help of a statistician experienced in the area A couple of great textbooks on ROC analysis and related topics Zhou, X.H., Obuchowski, N.A., McClish D.K. (2011). Statistical methods in diagnostic medicine. 2nd edition. New York: Wiley & Sons Inc. Pepe, M.S. (2003). The statistical evaluation of medical test for classification and prediction. Oxford: Oxford University Press.

21 THANK YOU!

Review. Imagine the following table being obtained as a random. Decision Test Diseased Not Diseased Positive TP FP Negative FN TN

Review. Imagine the following table being obtained as a random. Decision Test Diseased Not Diseased Positive TP FP Negative FN TN Outline 1. Review sensitivity and specificity 2. Define an ROC curve 3. Define AUC 4. Non-parametric tests for whether or not the test is informative 5. Introduce the binormal ROC model 6. Discuss non-parametric

More information

Critical reading of diagnostic imaging studies. Lecture Goals. Constantine Gatsonis, PhD. Brown University

Critical reading of diagnostic imaging studies. Lecture Goals. Constantine Gatsonis, PhD. Brown University Critical reading of diagnostic imaging studies Constantine Gatsonis Center for Statistical Sciences Brown University Annual Meeting Lecture Goals 1. Review diagnostic imaging evaluation goals and endpoints.

More information

Bayesian Methods for Medical Test Accuracy. Broemeling & Associates Inc., 1023 Fox Ridge Road, Medical Lake, WA 99022, USA;

Bayesian Methods for Medical Test Accuracy. Broemeling & Associates Inc., 1023 Fox Ridge Road, Medical Lake, WA 99022, USA; Diagnostics 2011, 1, 1-35; doi:10.3390/diagnostics1010001 OPEN ACCESS diagnostics ISSN 2075-4418 www.mdpi.com/journal/diagnostics/ Review Bayesian Methods for Medical Test Accuracy Lyle D. Broemeling Broemeling

More information

OBSERVER PERFORMANCE METHODS FOR DIAGNOSTIC IMAGING Foundations, Modeling and Applications with R-Examples. Dev P.

OBSERVER PERFORMANCE METHODS FOR DIAGNOSTIC IMAGING Foundations, Modeling and Applications with R-Examples. Dev P. OBSERVER PERFORMANCE METHODS FOR DIAGNOSTIC IMAGING Foundations, Modeling and Applications with R-Examples Dev P. Chakraborty, PhD Chapter 01: Preliminaries This chapter provides background material, starting

More information

Introduction to screening tests. Tim Hanson Department of Statistics University of South Carolina April, 2011

Introduction to screening tests. Tim Hanson Department of Statistics University of South Carolina April, 2011 Introduction to screening tests Tim Hanson Department of Statistics University of South Carolina April, 2011 1 Overview: 1. Estimating test accuracy: dichotomous tests. 2. Estimating test accuracy: continuous

More information

METHODS FOR DETECTING CERVICAL CANCER

METHODS FOR DETECTING CERVICAL CANCER Chapter III METHODS FOR DETECTING CERVICAL CANCER 3.1 INTRODUCTION The successful detection of cervical cancer in a variety of tissues has been reported by many researchers and baseline figures for the

More information

Sample Size Considerations. Todd Alonzo, PhD

Sample Size Considerations. Todd Alonzo, PhD Sample Size Considerations Todd Alonzo, PhD 1 Thanks to Nancy Obuchowski for the original version of this presentation. 2 Why do Sample Size Calculations? 1. To minimize the risk of making the wrong conclusion

More information

Module Overview. What is a Marker? Part 1 Overview

Module Overview. What is a Marker? Part 1 Overview SISCR Module 7 Part I: Introduction Basic Concepts for Binary Classification Tools and Continuous Biomarkers Kathleen Kerr, Ph.D. Associate Professor Department of Biostatistics University of Washington

More information

SISCR Module 7 Part I: Introduction Basic Concepts for Binary Biomarkers (Classifiers) and Continuous Biomarkers

SISCR Module 7 Part I: Introduction Basic Concepts for Binary Biomarkers (Classifiers) and Continuous Biomarkers SISCR Module 7 Part I: Introduction Basic Concepts for Binary Biomarkers (Classifiers) and Continuous Biomarkers Kathleen Kerr, Ph.D. Associate Professor Department of Biostatistics University of Washington

More information

Receiver operating characteristic

Receiver operating characteristic Receiver operating characteristic From Wikipedia, the free encyclopedia In signal detection theory, a receiver operating characteristic (ROC), or simply ROC curve, is a graphical plot of the sensitivity,

More information

Week 2 Video 3. Diagnostic Metrics

Week 2 Video 3. Diagnostic Metrics Week 2 Video 3 Diagnostic Metrics Different Methods, Different Measures Today we ll continue our focus on classifiers Later this week we ll discuss regressors And other methods will get worked in later

More information

Reducing Decision Errors in the Paired Comparison of the Diagnostic Accuracy of Continuous Screening Tests

Reducing Decision Errors in the Paired Comparison of the Diagnostic Accuracy of Continuous Screening Tests Reducing Decision Errors in the Paired Comparison of the Diagnostic Accuracy of Continuous Screening Tests Brandy M. Ringham, 1 Todd A. Alonzo, 2 John T. Brinton, 1 Aarti Munjal, 1 Keith E. Muller, 3 Deborah

More information

Sensitivity, Specificity, and Relatives

Sensitivity, Specificity, and Relatives Sensitivity, Specificity, and Relatives Brani Vidakovic ISyE 6421/ BMED 6700 Vidakovic, B. Se Sp and Relatives January 17, 2017 1 / 26 Overview Today: Vidakovic, B. Se Sp and Relatives January 17, 2017

More information

Diagnostic screening. Department of Statistics, University of South Carolina. Stat 506: Introduction to Experimental Design

Diagnostic screening. Department of Statistics, University of South Carolina. Stat 506: Introduction to Experimental Design Diagnostic screening Department of Statistics, University of South Carolina Stat 506: Introduction to Experimental Design 1 / 27 Ties together several things we ve discussed already... The consideration

More information

Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta GA, USA.

Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta GA, USA. A More Intuitive Interpretation of the Area Under the ROC Curve A. Cecile J.W. Janssens, PhD Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta GA, USA. Corresponding

More information

COMPARING SEVERAL DIAGNOSTIC PROCEDURES USING THE INTRINSIC MEASURES OF ROC CURVE

COMPARING SEVERAL DIAGNOSTIC PROCEDURES USING THE INTRINSIC MEASURES OF ROC CURVE DOI: 105281/zenodo47521 Impact Factor (PIF): 2672 COMPARING SEVERAL DIAGNOSTIC PROCEDURES USING THE INTRINSIC MEASURES OF ROC CURVE Vishnu Vardhan R* and Balaswamy S * Department of Statistics, Pondicherry

More information

1 Introduction. st0020. The Stata Journal (2002) 2, Number 3, pp

1 Introduction. st0020. The Stata Journal (2002) 2, Number 3, pp The Stata Journal (22) 2, Number 3, pp. 28 289 Comparative assessment of three common algorithms for estimating the variance of the area under the nonparametric receiver operating characteristic curve

More information

Mammography limitations. Clinical performance of digital breast tomosynthesis compared to digital mammography: blinded multi-reader study

Mammography limitations. Clinical performance of digital breast tomosynthesis compared to digital mammography: blinded multi-reader study Clinical performance of digital breast tomosynthesis compared to digital mammography: blinded multi-reader study G. Gennaro (1), A. Toledano (2), E. Baldan (1), E. Bezzon (1), C. di Maggio (1), M. La Grassa

More information

Introduction to Meta-analysis of Accuracy Data

Introduction to Meta-analysis of Accuracy Data Introduction to Meta-analysis of Accuracy Data Hans Reitsma MD, PhD Dept. of Clinical Epidemiology, Biostatistics & Bioinformatics Academic Medical Center - Amsterdam Continental European Support Unit

More information

Sensitivity, specicity, ROC

Sensitivity, specicity, ROC Sensitivity, specicity, ROC Thomas Alexander Gerds Department of Biostatistics, University of Copenhagen 1 / 53 Epilog: disease prevalence The prevalence is the proportion of cases in the population today.

More information

Women s Imaging Original Research

Women s Imaging Original Research Women s Imaging Original Research Women s Imaging Original Research David Gur 1 Andriy I. Bandos 2 Howard E. Rockette 2 Margarita L. Zuley 3 Jules H. Sumkin 3 Denise M. Chough 3 Christiane M. Hakim 3 Gur

More information

USE OF AREA UNDER THE CURVE (AUC) FROM PROPENSITY MODEL TO ESTIMATE ACCURACY OF THE ESTIMATED EFFECT OF EXPOSURE

USE OF AREA UNDER THE CURVE (AUC) FROM PROPENSITY MODEL TO ESTIMATE ACCURACY OF THE ESTIMATED EFFECT OF EXPOSURE USE OF AREA UNDER THE CURVE (AUC) FROM PROPENSITY MODEL TO ESTIMATE ACCURACY OF THE ESTIMATED EFFECT OF EXPOSURE by Zhijiang Zhang B. Med, Nanjing Medical University, China, 1998 M.Sc, Harvard University,

More information

Evidence-Based Medicine: Diagnostic study

Evidence-Based Medicine: Diagnostic study Evidence-Based Medicine: Diagnostic study What is Evidence-Based Medicine (EBM)? Expertise in integrating 1. Best research evidence 2. Clinical Circumstance 3. Patient values in clinical decisions Haynes,

More information

7/17/2013. Evaluation of Diagnostic Tests July 22, 2013 Introduction to Clinical Research: A Two week Intensive Course

7/17/2013. Evaluation of Diagnostic Tests July 22, 2013 Introduction to Clinical Research: A Two week Intensive Course Evaluation of Diagnostic Tests July 22, 2013 Introduction to Clinical Research: A Two week Intensive Course David W. Dowdy, MD, PhD Department of Epidemiology Johns Hopkins Bloomberg School of Public Health

More information

4. Model evaluation & selection

4. Model evaluation & selection Foundations of Machine Learning CentraleSupélec Fall 2017 4. Model evaluation & selection Chloé-Agathe Azencot Centre for Computational Biology, Mines ParisTech chloe-agathe.azencott@mines-paristech.fr

More information

A scored AUC Metric for Classifier Evaluation and Selection

A scored AUC Metric for Classifier Evaluation and Selection A scored AUC Metric for Classifier Evaluation and Selection Shaomin Wu SHAOMIN.WU@READING.AC.UK School of Construction Management and Engineering, The University of Reading, Reading RG6 6AW, UK Peter Flach

More information

A Bayesian approach to sample size determination for studies designed to evaluate continuous medical tests

A Bayesian approach to sample size determination for studies designed to evaluate continuous medical tests Baylor Health Care System From the SelectedWorks of unlei Cheng 1 A Bayesian approach to sample size determination for studies designed to evaluate continuous medical tests unlei Cheng, Baylor Health Care

More information

Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy

Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy Chapter 10 Analysing and Presenting Results Petra Macaskill, Constantine Gatsonis, Jonathan Deeks, Roger Harbord, Yemisi Takwoingi.

More information

Designing Studies of Diagnostic Imaging

Designing Studies of Diagnostic Imaging Designing Studies of Diagnostic Imaging Chaya S. Moskowitz, PhD With thanks to Nancy Obuchowski Outline What is study design? Building blocks of imaging studies Strategies to improve study efficiency What

More information

Various performance measures in Binary classification An Overview of ROC study

Various performance measures in Binary classification An Overview of ROC study Various performance measures in Binary classification An Overview of ROC study Suresh Babu. Nellore Department of Statistics, S.V. University, Tirupati, India E-mail: sureshbabu.nellore@gmail.com Abstract

More information

VU Biostatistics and Experimental Design PLA.216

VU Biostatistics and Experimental Design PLA.216 VU Biostatistics and Experimental Design PLA.216 Julia Feichtinger Postdoctoral Researcher Institute of Computational Biotechnology Graz University of Technology Outline for Today About this course Background

More information

Evaluation of diagnostic tests

Evaluation of diagnostic tests Evaluation of diagnostic tests Biostatistics and informatics Miklós Kellermayer Overlapping distributions Assumption: A classifier value (e.g., diagnostic parameter, a measurable quantity, e.g., serum

More information

Estimation of Area under the ROC Curve Using Exponential and Weibull Distributions

Estimation of Area under the ROC Curve Using Exponential and Weibull Distributions XI Biennial Conference of the International Biometric Society (Indian Region) on Computational Statistics and Bio-Sciences, March 8-9, 22 43 Estimation of Area under the ROC Curve Using Exponential and

More information

Lecture Outline. Biost 590: Statistical Consulting. Stages of Scientific Studies. Scientific Method

Lecture Outline. Biost 590: Statistical Consulting. Stages of Scientific Studies. Scientific Method Biost 590: Statistical Consulting Statistical Classification of Scientific Studies; Approach to Consulting Lecture Outline Statistical Classification of Scientific Studies Statistical Tasks Approach to

More information

EVALUATION AND COMPUTATION OF DIAGNOSTIC TESTS: A SIMPLE ALTERNATIVE

EVALUATION AND COMPUTATION OF DIAGNOSTIC TESTS: A SIMPLE ALTERNATIVE EVALUATION AND COMPUTATION OF DIAGNOSTIC TESTS: A SIMPLE ALTERNATIVE NAHID SULTANA SUMI, M. ATAHARUL ISLAM, AND MD. AKHTAR HOSSAIN Abstract. Methods of evaluating and comparing the performance of diagnostic

More information

Modifying ROC Curves to Incorporate Predicted Probabilities

Modifying ROC Curves to Incorporate Predicted Probabilities Modifying ROC Curves to Incorporate Predicted Probabilities C. Ferri, P. Flach 2, J. Hernández-Orallo, A. Senad Departament de Sistemes Informàtics i Computació Universitat Politècnica de València Spain

More information

Systematic Reviews and meta-analyses of Diagnostic Test Accuracy. Mariska Leeflang

Systematic Reviews and meta-analyses of Diagnostic Test Accuracy. Mariska Leeflang Systematic Reviews and meta-analyses of Diagnostic Test Accuracy Mariska Leeflang m.m.leeflang@amc.uva.nl This presentation 1. Introduction: accuracy? 2. QUADAS-2 exercise 3. Meta-analysis of diagnostic

More information

Combining Predictors for Classification Using the Area Under the ROC Curve

Combining Predictors for Classification Using the Area Under the ROC Curve UW Biostatistics Working Paper Series 6-7-2004 Combining Predictors for Classification Using the Area Under the ROC Curve Margaret S. Pepe University of Washington, mspepe@u.washington.edu Tianxi Cai Harvard

More information

ROC (Receiver Operating Characteristic) Curve Analysis

ROC (Receiver Operating Characteristic) Curve Analysis ROC (Receiver Operating Characteristic) Curve Analysis Julie Xu 17 th November 2017 Agenda Introduction Definition Accuracy Application Conclusion Reference 2017 All Rights Reserved Confidential for INC

More information

(true) Disease Condition Test + Total + a. a + b True Positive False Positive c. c + d False Negative True Negative Total a + c b + d a + b + c + d

(true) Disease Condition Test + Total + a. a + b True Positive False Positive c. c + d False Negative True Negative Total a + c b + d a + b + c + d Biostatistics and Research Design in Dentistry Reading Assignment Measuring the accuracy of diagnostic procedures and Using sensitivity and specificity to revise probabilities, in Chapter 12 of Dawson

More information

Validity of the Sinhala Version of the General Health Questionnaires Item 12 and 30: Using Different Sampling Strategies and Scoring Methods

Validity of the Sinhala Version of the General Health Questionnaires Item 12 and 30: Using Different Sampling Strategies and Scoring Methods Original Research Article Validity of the Sinhala Version of the General Health Questionnaires Item 12 and 30: Using Different Sampling Strategies and Scoring Methods H T C S Abeysena 1, P L Jayawardana

More information

MS&E 226: Small Data

MS&E 226: Small Data MS&E 226: Small Data Lecture 10: Introduction to inference (v2) Ramesh Johari ramesh.johari@stanford.edu 1 / 17 What is inference? 2 / 17 Where did our data come from? Recall our sample is: Y, the vector

More information

Introduction. We can make a prediction about Y i based on X i by setting a threshold value T, and predicting Y i = 1 when X i > T.

Introduction. We can make a prediction about Y i based on X i by setting a threshold value T, and predicting Y i = 1 when X i > T. Diagnostic Tests 1 Introduction Suppose we have a quantitative measurement X i on experimental or observed units i = 1,..., n, and a characteristic Y i = 0 or Y i = 1 (e.g. case/control status). The measurement

More information

About OMICS International

About OMICS International About OMICS International OMICS International through its Open Access Initiative is committed to make genuine and reliable contributions to the scientific community. OMICS International hosts over 700

More information

Selection and Combination of Markers for Prediction

Selection and Combination of Markers for Prediction Selection and Combination of Markers for Prediction NACC Data and Methods Meeting September, 2010 Baojiang Chen, PhD Sarah Monsell, MS Xiao-Hua Andrew Zhou, PhD Overview 1. Research motivation 2. Describe

More information

4 Diagnostic Tests and Measures of Agreement

4 Diagnostic Tests and Measures of Agreement 4 Diagnostic Tests and Measures of Agreement Diagnostic tests may be used for diagnosis of disease or for screening purposes. Some tests are more effective than others, so we need to be able to measure

More information

Diffuse high-attenuation within mediastinal lymph nodes on non-enhanced CT scan: Usefulness in the prediction of benignancy

Diffuse high-attenuation within mediastinal lymph nodes on non-enhanced CT scan: Usefulness in the prediction of benignancy Diffuse high-attenuation within mediastinal lymph nodes on non-enhanced CT scan: Usefulness in the prediction of benignancy Poster No.: C-1785 Congress: ECR 2012 Type: Authors: Keywords: DOI: Scientific

More information

Challenges of Observational and Retrospective Studies

Challenges of Observational and Retrospective Studies Challenges of Observational and Retrospective Studies Kyoungmi Kim, Ph.D. March 8, 2017 This seminar is jointly supported by the following NIH-funded centers: Background There are several methods in which

More information

NOVEL BIOMARKERS PERSPECTIVES ON DISCOVERY AND DEVELOPMENT. Winton Gibbons Consulting

NOVEL BIOMARKERS PERSPECTIVES ON DISCOVERY AND DEVELOPMENT. Winton Gibbons Consulting NOVEL BIOMARKERS PERSPECTIVES ON DISCOVERY AND DEVELOPMENT Winton Gibbons Consulting www.wintongibbons.com NOVEL BIOMARKERS PERSPECTIVES ON DISCOVERY AND DEVELOPMENT This whitepaper is decidedly focused

More information

Sanjay P. Zodpey Clinical Epidemiology Unit, Department of Preventive and Social Medicine, Government Medical College, Nagpur, Maharashtra, India.

Sanjay P. Zodpey Clinical Epidemiology Unit, Department of Preventive and Social Medicine, Government Medical College, Nagpur, Maharashtra, India. Research Methodology Sample size and power analysis in medical research Sanjay P. Zodpey Clinical Epidemiology Unit, Department of Preventive and Social Medicine, Government Medical College, Nagpur, Maharashtra,

More information

Knowledge Discovery and Data Mining. Testing. Performance Measures. Notes. Lecture 15 - ROC, AUC & Lift. Tom Kelsey. Notes

Knowledge Discovery and Data Mining. Testing. Performance Measures. Notes. Lecture 15 - ROC, AUC & Lift. Tom Kelsey. Notes Knowledge Discovery and Data Mining Lecture 15 - ROC, AUC & Lift Tom Kelsey School of Computer Science University of St Andrews http://tom.home.cs.st-andrews.ac.uk twk@st-andrews.ac.uk Tom Kelsey ID5059-17-AUC

More information

Statistics, Probability and Diagnostic Medicine

Statistics, Probability and Diagnostic Medicine Statistics, Probability and Diagnostic Medicine Jennifer Le-Rademacher, PhD Sponsored by the Clinical and Translational Science Institute (CTSI) and the Department of Population Health / Division of Biostatistics

More information

Behavioral Data Mining. Lecture 4 Measurement

Behavioral Data Mining. Lecture 4 Measurement Behavioral Data Mining Lecture 4 Measurement Outline Hypothesis testing Parametric statistical tests Non-parametric tests Precision-Recall plots ROC plots Hardware update Icluster machines are ready for

More information

Introductory Lecture. Significance of results in cancer imaging

Introductory Lecture. Significance of results in cancer imaging Cancer Imaging (2001) 2, 1 5 Introductory Lecture Monday 15 October 2001, 09.10 09.50 Significance of results in cancer imaging Rodney H Reznek Professor of Diagnostic Imaging, St Bartholomew s Hospital,

More information

A novel approach to assess diagnostic test and observer agreement for ordinal data. Zheng Zhang Emory University Atlanta, GA 30322

A novel approach to assess diagnostic test and observer agreement for ordinal data. Zheng Zhang Emory University Atlanta, GA 30322 A novel approach to assess diagnostic test and observer agreement for ordinal data Zheng Zhang Emory University Atlanta, GA 30322 Corresponding address: Zheng Zhang, Ph. D. Department of Biostatistics

More information

Diagnostic tests, Laboratory tests

Diagnostic tests, Laboratory tests Diagnostic tests, Laboratory tests I. Introduction II. III. IV. Informational values of a test Consequences of the prevalence rate Sequential use of 2 tests V. Selection of a threshold: the ROC curve VI.

More information

ROC Curve. Brawijaya Professional Statistical Analysis BPSA MALANG Jl. Kertoasri 66 Malang (0341)

ROC Curve. Brawijaya Professional Statistical Analysis BPSA MALANG Jl. Kertoasri 66 Malang (0341) ROC Curve Brawijaya Professional Statistical Analysis BPSA MALANG Jl. Kertoasri 66 Malang (0341) 580342 ROC Curve The ROC Curve procedure provides a useful way to evaluate the performance of classification

More information

Introduction to diagnostic accuracy meta-analysis. Yemisi Takwoingi October 2015

Introduction to diagnostic accuracy meta-analysis. Yemisi Takwoingi October 2015 Introduction to diagnostic accuracy meta-analysis Yemisi Takwoingi October 2015 Learning objectives To appreciate the concept underlying DTA meta-analytic approaches To know the Moses-Littenberg SROC method

More information

Limitations of the Odds Ratio in Gauging the Performance of a Diagnostic, Prognostic, or Screening Marker

Limitations of the Odds Ratio in Gauging the Performance of a Diagnostic, Prognostic, or Screening Marker American Journal of Epidemiology Copyright 2004 by the Johns Hopkins Bloomberg School of Public Health All rights reserved Vol. 159, No. 9 Printed in U.S.A. DOI: 10.1093/aje/kwh101 Limitations of the Odds

More information

ROC Curves (Old Version)

ROC Curves (Old Version) Chapter 545 ROC Curves (Old Version) Introduction This procedure generates both binormal and empirical (nonparametric) ROC curves. It computes comparative measures such as the whole, and partial, area

More information

Discrimination and Reclassification in Statistics and Study Design AACC/ASN 30 th Beckman Conference

Discrimination and Reclassification in Statistics and Study Design AACC/ASN 30 th Beckman Conference Discrimination and Reclassification in Statistics and Study Design AACC/ASN 30 th Beckman Conference Michael J. Pencina, PhD Duke Clinical Research Institute Duke University Department of Biostatistics

More information

Fundamentals of Clinical Research for Radiologists. ROC Analysis. - Research Obuchowski ROC Analysis. Nancy A. Obuchowski 1

Fundamentals of Clinical Research for Radiologists. ROC Analysis. - Research Obuchowski ROC Analysis. Nancy A. Obuchowski 1 - Research Nancy A. 1 Received October 28, 2004; accepted after revision November 3, 2004. Series editors: Nancy, C. Craig Blackmore, Steven Karlik, and Caroline Reinhold. This is the 14th in the series

More information

Screening (Diagnostic Tests) Shaker Salarilak

Screening (Diagnostic Tests) Shaker Salarilak Screening (Diagnostic Tests) Shaker Salarilak Outline Screening basics Evaluation of screening programs Where we are? Definition of screening? Whether it is always beneficial? Types of bias in screening?

More information

Biostatistics II

Biostatistics II Biostatistics II 514-5509 Course Description: Modern multivariable statistical analysis based on the concept of generalized linear models. Includes linear, logistic, and Poisson regression, survival analysis,

More information

Limitations of the Odds Ratio in Gauging the Performance of a Diagnostic or Prognostic Marker

Limitations of the Odds Ratio in Gauging the Performance of a Diagnostic or Prognostic Marker UW Biostatistics Working Paper Series 1-7-2005 Limitations of the Odds Ratio in Gauging the Performance of a Diagnostic or Prognostic Marker Margaret S. Pepe University of Washington, mspepe@u.washington.edu

More information

2011 ASCP Annual Meeting

2011 ASCP Annual Meeting Diagnostic Accuracy Martin Kroll, MD Professor of Pathology and Laboratory Medicine Boston University School of Medicine Chief, Laboratory Medicine Boston Medical Center Disclosure Roche Abbott Course

More information

The solitary pulmonary nodule: Assessing the success of predicting malignancy

The solitary pulmonary nodule: Assessing the success of predicting malignancy The solitary pulmonary nodule: Assessing the success of predicting malignancy Poster No.: C-0829 Congress: ECR 2010 Type: Scientific Exhibit Topic: Chest Authors: R. W. K. Lindsay, J. Foster, K. McManus;

More information

Underuse of risk assessment and overuse of CTPA in patients with suspected pulmonary thromboembolism

Underuse of risk assessment and overuse of CTPA in patients with suspected pulmonary thromboembolism Underuse of risk assessment and overuse of CTPA in patients with suspected pulmonary thromboembolism Michael Perera Advanced Trainee in General and Acute Medicine Leena Aggarwal Director, Medical Assessment

More information

1 Diagnostic Test Evaluation

1 Diagnostic Test Evaluation 1 Diagnostic Test Evaluation The Receiver Operating Characteristic (ROC) curve of a diagnostic test is a plot of test sensitivity (the probability of a true positive) against 1.0 minus test specificity

More information

Unit 1 Exploring and Understanding Data

Unit 1 Exploring and Understanding Data Unit 1 Exploring and Understanding Data Area Principle Bar Chart Boxplot Conditional Distribution Dotplot Empirical Rule Five Number Summary Frequency Distribution Frequency Polygon Histogram Interquartile

More information

Detection of Mild Cognitive Impairment using Image Differences and Clinical Features

Detection of Mild Cognitive Impairment using Image Differences and Clinical Features Detection of Mild Cognitive Impairment using Image Differences and Clinical Features L I N L I S C H O O L O F C O M P U T I N G C L E M S O N U N I V E R S I T Y Copyright notice Many of the images in

More information

Study Designs. Lecture 3. Kevin Frick

Study Designs. Lecture 3. Kevin Frick This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this

More information

Assessing the accuracy of response propensities in longitudinal studies

Assessing the accuracy of response propensities in longitudinal studies Assessing the accuracy of response propensities in longitudinal studies CCSR Working Paper 2010-08 Ian Plewis, Sosthenes Ketende, Lisa Calderwood Ian Plewis, Social Statistics, University of Manchester,

More information

Psychology, 2010, 1: doi: /psych Published Online August 2010 (

Psychology, 2010, 1: doi: /psych Published Online August 2010 ( Psychology, 2010, 1: 194-198 doi:10.4236/psych.2010.13026 Published Online August 2010 (http://www.scirp.org/journal/psych) Using Generalizability Theory to Evaluate the Applicability of a Serial Bayes

More information

Lecture Outline Biost 517 Applied Biostatistics I. Statistical Goals of Studies Role of Statistical Inference

Lecture Outline Biost 517 Applied Biostatistics I. Statistical Goals of Studies Role of Statistical Inference Lecture Outline Biost 517 Applied Biostatistics I Scott S. Emerson, M.D., Ph.D. Professor of Biostatistics University of Washington Statistical Inference Role of Statistical Inference Hierarchy of Experimental

More information

Lecture 2. Key Concepts in Clinical Research

Lecture 2. Key Concepts in Clinical Research Lecture 2 Key Concepts in Clinical Research Outline Key Statistical Concepts Bias and Variability Type I Error and Power Confounding and Interaction Statistical Difference vs Clinical Difference One-sided

More information

Methods of MR Fat Quantification and their Pros and Cons

Methods of MR Fat Quantification and their Pros and Cons Methods of MR Fat Quantification and their Pros and Cons Michael Middleton, MD PhD UCSD Department of Radiology International Workshop on NASH Biomarkers Washington, DC April 29-30, 2016 1 Disclosures

More information

SEED HAEMATOLOGY. Medical statistics your support when interpreting results SYSMEX EDUCATIONAL ENHANCEMENT AND DEVELOPMENT APRIL 2015

SEED HAEMATOLOGY. Medical statistics your support when interpreting results SYSMEX EDUCATIONAL ENHANCEMENT AND DEVELOPMENT APRIL 2015 SYSMEX EDUCATIONAL ENHANCEMENT AND DEVELOPMENT APRIL 2015 SEED HAEMATOLOGY Medical statistics your support when interpreting results The importance of statistical investigations Modern medicine is often

More information

Diagnostic methods 2: receiver operating characteristic (ROC) curves

Diagnostic methods 2: receiver operating characteristic (ROC) curves abc of epidemiology http://www.kidney-international.org & 29 International Society of Nephrology Diagnostic methods 2: receiver operating characteristic (ROC) curves Giovanni Tripepi 1, Kitty J. Jager

More information

The recommended method for diagnosing sleep

The recommended method for diagnosing sleep reviews Measuring Agreement Between Diagnostic Devices* W. Ward Flemons, MD; and Michael R. Littner, MD, FCCP There is growing interest in using portable monitoring for investigating patients with suspected

More information

Evidence-based Imaging: Critically Appraising Studies of Diagnostic Tests

Evidence-based Imaging: Critically Appraising Studies of Diagnostic Tests Evidence-based Imaging: Critically Appraising Studies of Diagnostic Tests Aine Marie Kelly, MD Critically Appraising Studies of Diagnostic Tests Aine Marie Kelly B.A., M.B. B.Ch. B.A.O., M.S. M.R.C.P.I.,

More information

STATS8: Introduction to Biostatistics. Overview. Babak Shahbaba Department of Statistics, UCI

STATS8: Introduction to Biostatistics. Overview. Babak Shahbaba Department of Statistics, UCI STATS8: Introduction to Biostatistics Overview Babak Shahbaba Department of Statistics, UCI The role of statistical analysis in science This course discusses some biostatistical methods, which involve

More information

Assessment of performance and decision curve analysis

Assessment of performance and decision curve analysis Assessment of performance and decision curve analysis Ewout Steyerberg, Andrew Vickers Dept of Public Health, Erasmus MC, Rotterdam, the Netherlands Dept of Epidemiology and Biostatistics, Memorial Sloan-Kettering

More information

3. Model evaluation & selection

3. Model evaluation & selection Foundations of Machine Learning CentraleSupélec Fall 2016 3. Model evaluation & selection Chloé-Agathe Azencot Centre for Computational Biology, Mines ParisTech chloe-agathe.azencott@mines-paristech.fr

More information

PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH

PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH PubH 7470: STATISTICS FOR TRANSLATIONAL & CLINICAL RESEARCH Instructor: Chap T. Le, Ph.D. Distinguished Professor of Biostatistics Basic Issues: COURSE INTRODUCTION BIOSTATISTICS BIOSTATISTICS is the Biomedical

More information

Chapter 5: Field experimental designs in agriculture

Chapter 5: Field experimental designs in agriculture Chapter 5: Field experimental designs in agriculture Jose Crossa Biometrics and Statistics Unit Crop Research Informatics Lab (CRIL) CIMMYT. Int. Apdo. Postal 6-641, 06600 Mexico, DF, Mexico Introduction

More information

Diagnostic Test. H. Risanto Siswosudarmo Department of Obstetrics and Gynecology Faculty of Medicine, UGM Jogjakarta. RS Sardjito

Diagnostic Test. H. Risanto Siswosudarmo Department of Obstetrics and Gynecology Faculty of Medicine, UGM Jogjakarta. RS Sardjito ب س م الل ه الر ح م ن الر ح يم RS Sardjito Diagnostic Test Gold standard New (test Disease No Disease Column Total Posi*ve a b a+b Nega*ve c d c+d Row Total a+c b+d N H. Risanto Siswosudarmo Department

More information

Evidence Based Medicine Prof P Rheeder Clinical Epidemiology. Module 2: Applying EBM to Diagnosis

Evidence Based Medicine Prof P Rheeder Clinical Epidemiology. Module 2: Applying EBM to Diagnosis Evidence Based Medicine Prof P Rheeder Clinical Epidemiology Module 2: Applying EBM to Diagnosis Content 1. Phases of diagnostic research 2. Developing a new test for lung cancer 3. Thresholds 4. Critical

More information

Blinded, independent, central image review in oncology trials: how the study endpoint impacts the adjudication rate

Blinded, independent, central image review in oncology trials: how the study endpoint impacts the adjudication rate Blinded, independent, central image review in oncology trials: how the study endpoint impacts the adjudication rate Poster No.: C-0200 Congress: ECR 2014 Type: Authors: Keywords: DOI: Scientific Exhibit

More information

11 questions to help you evaluate a clinical prediction rule

11 questions to help you evaluate a clinical prediction rule 11 questions to help you evaluate a clinical prediction rule How to use this appraisal tool Three broad issues need to be considered when appraising any study: Are the results of the study valid? (Section

More information

10 Intraclass Correlations under the Mixed Factorial Design

10 Intraclass Correlations under the Mixed Factorial Design CHAPTER 1 Intraclass Correlations under the Mixed Factorial Design OBJECTIVE This chapter aims at presenting methods for analyzing intraclass correlation coefficients for reliability studies based on a

More information

Meta-analysis of Diagnostic Test Accuracy Studies

Meta-analysis of Diagnostic Test Accuracy Studies GUIDELINE Meta-analysis of Diagnostic Test Accuracy Studies November 2014 Copyright EUnetHTA 2013. All Rights Reserved. No part of this document may be reproduced without an explicit acknowledgement of

More information

Prevent Cancer Foundation Quantitative Imaging Workshop XIII

Prevent Cancer Foundation Quantitative Imaging Workshop XIII Status of the Quantitative Imaging Profile Lung Nodule Volume Assessment and Monitoring in Low Dose CT Screening Prevent Cancer Foundation Quantitative Imaging Workshop XIII June 13-14, 2016 David S. Gierada,

More information

INTRODUCTION TO MACHINE LEARNING. Decision tree learning

INTRODUCTION TO MACHINE LEARNING. Decision tree learning INTRODUCTION TO MACHINE LEARNING Decision tree learning Task of classification Automatically assign class to observations with features Observation: vector of features, with a class Automatically assign

More information

Comparing Two ROC Curves Independent Groups Design

Comparing Two ROC Curves Independent Groups Design Chapter 548 Comparing Two ROC Curves Independent Groups Design Introduction This procedure is used to compare two ROC curves generated from data from two independent groups. In addition to producing a

More information

RISK PREDICTION MODEL: PENALIZED REGRESSIONS

RISK PREDICTION MODEL: PENALIZED REGRESSIONS RISK PREDICTION MODEL: PENALIZED REGRESSIONS Inspired from: How to develop a more accurate risk prediction model when there are few events Menelaos Pavlou, Gareth Ambler, Shaun R Seaman, Oliver Guttmann,

More information

Lecture 1 An introduction to statistics in Ichthyology and Fisheries Science

Lecture 1 An introduction to statistics in Ichthyology and Fisheries Science Lecture 1 An introduction to statistics in Ichthyology and Fisheries Science What is statistics and why do we need it? Statistics attempts to make inferences about unknown values that are common to a population

More information

3 CONCEPTUAL FOUNDATIONS OF STATISTICS

3 CONCEPTUAL FOUNDATIONS OF STATISTICS 3 CONCEPTUAL FOUNDATIONS OF STATISTICS In this chapter, we examine the conceptual foundations of statistics. The goal is to give you an appreciation and conceptual understanding of some basic statistical

More information

Lecture Outline Biost 517 Applied Biostatistics I

Lecture Outline Biost 517 Applied Biostatistics I Lecture Outline Biost 517 Applied Biostatistics I Scott S. Emerson, M.D., Ph.D. Professor of Biostatistics University of Washington Lecture 2: Statistical Classification of Scientific Questions Types of

More information