ROC Curve. Brawijaya Professional Statistical Analysis BPSA MALANG Jl. Kertoasri 66 Malang (0341)

Similar documents
Bivariate Correlations

Psychology of Perception Psychology 4165, Fall 2001 Laboratory 1 Weight Discrimination

1 Introduction. st0020. The Stata Journal (2002) 2, Number 3, pp

Psychology of Perception Psychology 4165, Spring 2003 Laboratory 1 Weight Discrimination

Toxicity testing. Introduction

Psy201 Module 3 Study and Assignment Guide. Using Excel to Calculate Descriptive and Inferential Statistics

1.4 - Linear Regression and MS Excel

UNIT 1CP LAB 1 - Spaghetti Bridge

Q: How do I get the protein concentration in mg/ml from the standard curve if the X-axis is in units of µg.

Protein quantitation guidance (SXHL288)

Paper Airplanes & Scientific Methods

ANOVA in SPSS (Practical)

Section 3 Correlation and Regression - Teachers Notes

Daniel Boduszek University of Huddersfield

Name Psychophysical Methods Laboratory

Warfarin Help Documentation

Statistics, Probability and Diagnostic Medicine

STATISTICAL METHODS FOR DIAGNOSTIC TESTING: AN ILLUSTRATION USING A NEW METHOD FOR CANCER DETECTION XIN SUN. PhD, Kansas State University, 2012

Intro to SPSS. Using SPSS through WebFAS

1 Diagnostic Test Evaluation

RAG Rating Indicator Values

Item Analysis: Classical and Beyond

Review. Imagine the following table being obtained as a random. Decision Test Diseased Not Diseased Positive TP FP Negative FN TN

Unit 1 Exploring and Understanding Data

Comparing Two ROC Curves Independent Groups Design

Classical Psychophysical Methods (cont.)

MULTIPLE LINEAR REGRESSION 24.1 INTRODUCTION AND OBJECTIVES OBJECTIVES

EVALUATION AND COMPUTATION OF DIAGNOSTIC TESTS: A SIMPLE ALTERNATIVE

METHODS FOR DETECTING CERVICAL CANCER

Technical Bulletin. Technical Information for Quidel Molecular Influenza A+B Assay on the Bio-Rad CFX96 Touch

Part III Taking Chances for Fun and Profit

The North Carolina Health Data Explorer

Psychological testing

SPSS Correlation/Regression

Two-Way Independent Samples ANOVA with SPSS

Regression Discontinuity Design

Psychology of Perception Psychology 4165, Spring 2015 Laboratory 1 Noisy Representations: The Oblique Effect

7. Bivariate Graphing

Paper Airplanes & Scientific Methods

Knowledge Discovery and Data Mining. Testing. Performance Measures. Notes. Lecture 15 - ROC, AUC & Lift. Tom Kelsey. Notes

Supplementary Information

[HEALTH MAINTENANCE AND INVITATIONS/REMINDERS] June 2, 2014

MODULE: SQUAT JUMP BRIEF DESCRIPTION:

FATIGUE. A brief guide to the PROMIS Fatigue instruments:

ANXIETY. A brief guide to the PROMIS Anxiety instruments:

Introduction to ROC analysis

ESC-100 Introduction to Engineering Fall 2013

Chapter 4: Scatterplots and Correlation

MA 151: Using Minitab to Visualize and Explore Data The Low Fat vs. Low Carb Debate

BIOL 458 BIOMETRY Lab 7 Multi-Factor ANOVA

Titrations in Cytobank

Data and Statistics 101: Key Concepts in the Collection, Analysis, and Application of Child Welfare Data

Diagnostic screening. Department of Statistics, University of South Carolina. Stat 506: Introduction to Experimental Design

USE OF DIFFERENTIAL ITEM FUNCTIONING (DIF) ANALYSIS FOR BIAS ANALYSIS IN TEST CONSTRUCTION

Probability and Statistics. Chapter 1

How to Use InSituGram

Method Comparison Report Semi-Annual 1/5/2018

The Effectiveness of Captopril

ROC Curves. I wrote, from SAS, the relevant data to a plain text file which I imported to SPSS. The ROC analysis was conducted this way:

Making charts in Excel

Estimating national adult prevalence of HIV-1 in Generalized Epidemics

Predicting Breast Cancer Survivability Rates

Chapter 3: Examining Relationships

Receiver operating characteristic

CHAPTER III METHODOLOGY

GST: Step by step Build Diary page

Using the Patient Reported Outcome Measures Tool (PROMT)

Estimation of Area under the ROC Curve Using Exponential and Weibull Distributions

VU Biostatistics and Experimental Design PLA.216

Examining differences between two sets of scores

Charts Worksheet using Excel Obesity Can a New Drug Help?

Section 6: Analysing Relationships Between Variables

Classification. Methods Course: Gene Expression Data Analysis -Day Five. Rainer Spang

LAB 1 The Scientific Method

SUPPLEMENTAL MATERIAL

Zheng Yao Sr. Statistical Programmer

PAIN INTERFERENCE. ADULT ADULT CANCER PEDIATRIC PARENT PROXY PROMIS-Ca Bank v1.1 Pain Interference PROMIS-Ca Bank v1.0 Pain Interference*

Put Your Best Figure Forward: Line Graphs and Scattergrams

SAFETY & DISPOSAL onpg is a potential irritant. Be sure to wash your hands after the lab.

Introduction to screening tests. Tim Hanson Department of Statistics University of South Carolina April, 2011

Predicting Diabetes and Heart Disease Using Features Resulting from KMeans and GMM Clustering

BlueBayCT - Warfarin User Guide

Allergy Basics. This handout describes the process for adding and removing allergies from a patient s chart.

ShadeVision v Color Map

CHAPTER ONE CORRELATION

White Paper Estimating Complex Phenotype Prevalence Using Predictive Models

Chapter 7: Descriptive Statistics

AP STATISTICS 2010 SCORING GUIDELINES

Chapter 1: Alternative Forced Choice Methods

Diabetes Management Software V1.3 USER S MANUAL

VCL 17-2 Weak Acid-Strong Base Titrations

INTRODUCTION TO ASSESSMENT OPTIONS

Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta GA, USA.

FREQUENCY COMPRESSION AND FREQUENCY SHIFTING FOR THE HEARING IMPAIRED

Test 1C AP Statistics Name:

What Is the Relationship Between the Amount of Transmitted Light Through a Solution and Its Concentration?

IAPT: Regression. Regression analyses

Nature Neuroscience: doi: /nn Supplementary Figure 1. Task timeline for Solo and Info trials.

Transcription:

ROC Curve Brawijaya Professional Statistical Analysis BPSA MALANG Jl. Kertoasri 66 Malang (0341) 580342

ROC Curve The ROC Curve procedure provides a useful way to evaluate the performance of classification schemes that categorize cases into one of two groups. Using ROC Curve to Evaluate Assay Performance A pharmaceutical lab is trying to develop a rapid assay for detecting HIV infection. The delay in obtaining results for traditional tests reduces their effectiveness because many patients don't return to learn the results. The challenge is to develop a test that provides results in 10 to 15 minutes and is as accurate as the traditional tests. The results of the assay are eight deepening shades of red, with deeper shades indicating greater likelihood of infection. The test is fast, but is it accurate? To help answer this question, a laboratory trial was conducted on 2000 blood samples, half of which were infected with HIV and half of which were clean. The results are collected in hivassay.sav. Use ROC Curve to determine at what shade the physician should assume that the patient is HIV-positive. Running the Analysis To run a ROC Curve analysis, from the menus choose: Graphs ROC Curve... 1- ROC Curve

Select Assay result as the test variable. Select Actual state as the state variable and type 1 as its positive value. Select With diagonal reference line, Standard error and confidence interval, and Coordinate points of the ROC Curve. Click OK. These selections produce an ROC curve for how well Assay result reflects Actual state, an estimate of the area under the curve, and a table of the curve's coordinates. 2- ROC Curve

ROC Curve The ROC curve is a visual index of the accuracy of the assay. The further the curve lies above the reference line, the more accurate the test. Here, the curve is difficult to see because it lies close to the vertical axis. To show more of the detail in the ROC curve, activate the plot by double clicking on it. In the Chart Editor window, activate the vertical axis by double clicking on it. Type 0.9 as the minimum displayed value. Click OK. In the Chart Editor window, activate the horizontal axis by double clicking on it. 3- ROC Curve

Type 0.1 as the maximum displayed value. Click OK. For a numerical summary, see the area under the curve. 4- ROC Curve

Area under the Curve The area under the curve represents the probability that the assay result for a randomly chosen positive case will exceed the result for a randomly chosen negative case. The asymptotic significance is less than 0.05, which means that using the assay is better than guessing. While the area under the curve is a useful one-statistic summary of the accuracy of the assay, you need to be able to choose a specific criterion by which blood samples are classified and estimate the sensitivity and specificity of the assay under that criterion. See the coordinates of the curve to compare different cutoffs. Coordinates of the Curve 5- ROC Curve

This table reports the sensitivity and 1-specificity for every possible cutoff for positive classification. The sensitivity is the proportion of HIV-positive samples with assay results greater than the cutoff. 1-specificity is the proportion of HIV-negative samples with assay results greater than the cutoff. Cutoff 0 is equivalent to assuming that everyone is HIV-positive. Cutoff 9 is equivalent to assuming that everyone is HIV-negative. Both extremes are unsatisfactory, and the challenge is to select a cutoff that properly balances the needs of sensitivity and specificity. For example, consider cutoff 5.5. Using this criterion, assay results of 6, 7, or 8 are classified as positive, which leads to a sensitivity of 0.989 and 1-specificity of 0.012. Thus, approximately 98.9% of all HIV-positive samples would be correctly identified as such, and 1.2% of all HIV-negative samples would be incorrectly identified as positive. If 2.5 is used as the cutoff, 99.7% of all HIV-positive samples would be correctly identified as such, and 3.3% of all HIV-negative samples would be incorrectly identified as positive. Your choice of cutoff will be mandated by the need to closely match the sensitivity and specificity of traditional tests. Please note that the values in this table are at best guidelines for which cutoffs you should consider. This table does not contain error estimates, so there is no guarantee of the accuracy of the sensitivity or specificity for a given cutoff in the table. Summary Using ROC Curve, you have analyzed the accuracy of this assay. The curve and area under the curve table reveal that using the assay is better than guessing, but you're not really interested in simply doing better than guessing. In this example, the coordinates of the curve were most helpful because they provided some guidance for determining what shade should serve as the cutoff for determining positive and negative assay results. Using ROC Curves to Choose between Competing Classification Schemes A bank is interested in predicting whether or not customers will default on loans. You have proposed a model that is based on a subset of the available predictors, and you now need to show that its results are better than those from a simpler model currently in use and no worse than results from a more complex model. The model results are collected in bankloan.sav. Use ROC Curve to compare the predictive abilities of these three models. 6- ROC Curve

Running the Analysis To run a ROC Curve analysis, from the menus choose: Graphs ROC Curve... Select Predicted default, model 1, Predicted default, model 2, and Predicted default, model 3 as test variables. Select Previously defaulted as the state variable and type 1 as its positive value. Select With diagonal reference line and Standard error and confidence interval. Click OK. These selections produce ROC curves for how well the three models predict loan default and estimates of the areas under each curve. 7- ROC Curve

8- ROC Curve

ROC Curve Based on their distances from the reference line, all three models are doing better than guessing. Model 3 does not appear to be as good as the others. Area under the Curve The asymptotic significance of each model is less than 0.05, so all are doing better than guessing. From the confidence intervals, you can see that model 3 is inferior to the other two because the entirety of its interval lies below the others. Models 1 and 2 are nearly indistinguishable, so your model is adopted because it requires less input to provide a recommendation. Summary Using ROC Curve, you have created multiple curves in order to compare three competing classification models. The plot of the curves offers an excellent visual comparison of the models' performances, and the area under the curve table gives you the numbers to back up your conclusions from the plot. Here, the coordinates of the curve are not as useful 9- ROC Curve

because the test result variables have many values, so the table would be very long and unwieldy. Related Procedures Using a classification table, you can compute the sensitivity and specificity for a single cutoff value. ROC Curve gives you a visual display of these measures for all possible cutoffs in a single plot, which is much cleaner and more powerful than a series of tables. The Discriminant Analysis procedure produces classification models that can be compared using ROC curves. Recommended Readings See the following texts for more information on ROC curves (for complete bibliographic information, hover over the reference): Hanley, J. A., and B. J. McNeil. 1982. The Meaning and Use of the Area Under a Receiver Operating Characteristic (ROC) Curve. Radiology, 143:1, 29-36. Zweig, M. H., and G. Campbell. 1993. Receiver Operating Characteristic (ROC) Plots: A Fundamental Evaluation Tool in Clinical Medicine. Clinical Chemistry, 39:4, 561-577. 10- ROC Curve