Modifying ROC Curves to Incorporate Predicted Probabilities


Modifying ROC Curves to Incorporate Predicted Probabilities. C. Ferri (1), P. Flach (2), J. Hernández-Orallo (1), A. Senad (1). (1) Departament de Sistemes Informàtics i Computació, Universitat Politècnica de València, Spain. (2) Department of Computer Science, University of Bristol, U.K. ROCML 05, Bonn, August 2005.

Outline: Motivation; ROC Analysis; The pAUC Measure; pROC Curves; Properties and Applications of pROC Curves; Conclusions and Future Work.

Motivation. Cost-sensitive learning is a more realistic generalisation of predictive learning: costs are not the same for all kinds of misclassification, and class distributions are usually unbalanced. Receiver Operating Characteristic (ROC) analysis is useful for choosing classifiers when costs are not known in advance. The AUC (Area Under the ROC Curve) is a single measure per classifier which estimates the quality of the classifier over a range of class distributions, and measures how well the classifier ranks examples (it is equivalent to the Wilcoxon statistic).

Motivation. Applications: ROC analysis and AUC-related measures have been used in many areas: medical decision making, marketing campaign design, probability estimation, etc. Problems: AUC ignores the probability values and only takes their order into account, while other measures (MSE, LogLoss) consider how well the probabilities are calibrated but not their order. Goal: modify ROC analysis so that it considers the probabilities of the instances as well as their order.

ROC Analysis. ROC analysis is useful when, at learning time, we do not know the proportion of examples of each class (the class distribution) or the cost matrix. It provides tools to distinguish classifiers that can be discarded under any circumstance (any class distribution or cost matrix), and to select the optimal classifier once the cost matrix is known.

ROC Analysis. Given a confusion matrix (rows: predicted class; columns: real class):

                Real Yes   Real No
  Predicted Yes    30        20
  Predicted No     10        40

we can normalise each column:

                Real Yes   Real No
  Predicted Yes   0.75      0.33
  Predicted No    0.25      0.67

The first row gives the point (FPR, TPR) = (0.33, 0.75) in the ROC diagram.
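The mapping from counts to an ROC point can be sketched in a few lines (the helper name is hypothetical, not from the talk):

```python
# Hypothetical helper: turn a confusion matrix into an ROC point.
def roc_point(tp, fp, fn, tn):
    tpr = tp / (tp + fn)  # column-normalised true positive rate
    fpr = fp / (fp + tn)  # column-normalised false positive rate
    return fpr, tpr

# The slide's matrix: TP=30, FP=20, FN=10, TN=40.
fpr, tpr = roc_point(30, 20, 10, 40)
print(round(fpr, 2), round(tpr, 2))  # 0.33 0.75
```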

ROC Analysis. Given several classifiers, we plot their points (FPR, TPR) in the ROC diagram, together with the trivial classifiers (0,0), (1,1) and (1,0), and construct the convex hull of these points. The classifiers falling under the ROC convex hull can be discarded; the best of the remaining classifiers can be chosen at application time.
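The hull step can be sketched with a standard upper-convex-hull routine (monotone chain); this is an illustration of the construction the slide describes, not the authors' code:

```python
def cross(o, a, b):
    """Cross product of vectors o->a and o->b; > 0 means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def roc_upper_hull(points):
    """Upper convex hull of (FPR, TPR) points plus the trivial classifiers."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull = []
    for p in pts:
        # Pop points that would make the hull turn left (they fall below it).
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull

# (0.4, 0.7) lies under the hull spanned by the other points, so it is discarded.
print(roc_upper_hull([(0.2, 0.6), (0.4, 0.7), (0.5, 0.9)]))
# [(0.0, 0.0), (0.2, 0.6), (0.5, 0.9), (1.0, 1.0)]
```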

ROC Analysis. If we want to select a single classifier, we calculate the Area Under the ROC Curve (AUC) of all the classifiers and choose the one with the greatest AUC.

AUC. The Wu and Flach method to compute AUC: for each positive on the sorted list, accumulate how many negatives follow it; alternatively, accumulate for each negative how many positives precede it. [Figure: a ranking with scores 0.9, 0.6, 0.55, 0.2, 0.1 drawn as a step curve from (0,0) to (1,1), with AUC = 0.8333.]
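The counting method amounts to scoring every (positive, negative) pair. In the sketch below, the split of the slide's five scores into classes is a hypothetical one chosen to reproduce the stated AUC = 0.8333; the slide does not make the split recoverable:

```python
def auc_by_counting(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly;
    ties count half. Equivalent to the Wilcoxon statistic."""
    total = sum(1.0 if p > n else 0.5 if p == n else 0.0
                for p in pos_scores for n in neg_scores)
    return total / (len(pos_scores) * len(neg_scores))

# One class split of the slide's scores that yields the stated AUC of 5/6:
print(auc_by_counting([0.9, 0.6, 0.2], [0.55, 0.1]))  # 0.8333...
```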

AUC. AUC completely ignores the probability values; it only takes the order into account. Example: a positive scored 0.501 against a negative scored 0.500 gives AUC = 1, and a positive scored 0.500 against a negative scored 0.499 also gives AUC = 1; yet a tiny perturbation that swaps the order of such a pair flips the AUC completely. Small changes in probabilities can give rise to large changes in AUC.
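This is easy to check numerically with the standard pairwise definition of AUC:

```python
def auc(pos, neg):
    # Standard pairwise AUC: fraction of correctly ordered (pos, neg) pairs.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auc([0.501], [0.500]))  # 1.0 -- only the order matters
print(auc([0.500], [0.499]))  # 1.0 -- nearly identical scores, same AUC
print(auc([0.499], [0.500]))  # 0.0 -- a 0.002 shift flips AUC entirely
```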

pAUC. We can incorporate probabilities into AUC. Let P(x) and P(y) be the predicted probabilities that positive instance x and negative instance y belong to the positive class, and let Pos and Neg be the total numbers of positive and negative examples. Then:

  pGini = ( Σ_{x ∈ Pos} Σ_{y ∈ Neg} (P(x) − P(y)) ) / (Pos · Neg)

  pAUC = (1 + pGini) / 2 = ( Σ_{x ∈ Pos} P(x) / Pos − Σ_{y ∈ Neg} P(y) / Neg + 1 ) / 2
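A direct transcription of the formula (the helper name is hypothetical):

```python
def p_auc(pos_probs, neg_probs):
    """pAUC: the average pairwise probability difference (pGini), rescaled
    to [0, 1]. Algebraically equal to (1 + mean(pos) - mean(neg)) / 2."""
    n_pairs = len(pos_probs) * len(neg_probs)
    p_gini = sum(px - py for px in pos_probs for py in neg_probs) / n_pairs
    return (1.0 + p_gini) / 2.0

# Unlike AUC, pAUC moves when the probabilities move, even with a fixed order.
print(round(p_auc([0.9, 0.8], [0.2, 0.1]), 4))    # 0.85
print(round(p_auc([0.6, 0.55], [0.5, 0.45]), 4))  # 0.55
```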

pROC Curves. Does pAUC have a representation in the form of a curve? We study a method based on a representation of segments (with a uniform or normal distribution). The key idea is to consider probabilities as intervals of a fixed width d. [Figure: the ranking 0.9, 0.6, 0.55, 0.2, 0.1 with each score widened into an interval of width d, and the resulting curve with area Area(d).]

pROC Curves. Which width d should we use? If d = 0, then Area(d) = AUC. If d = 1.596, then Area(d) = pAUC. If d → ∞, then Area(d) → 0.5.

Properties of pROC Curves. What are the theoretical properties of pROC curves? For any set of examples labelled with probabilities and true labels: if d = 0, then Area(d) = AUC; if d → ∞, then Area(d) → 0.5; Area(d) is continuous in d; and there exists a real number d ≥ 0 such that Area(d) = pAUC.
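The talk gives no pseudocode for Area(d). One plausible reading of the uniform-segment construction is that each predicted probability is replaced by a uniform distribution of width d, and Area(d) is the probability that a random draw for a positive exceeds a random draw for a negative. The Monte Carlo sketch below checks the two limit properties under that assumption; it is not the authors' algorithm:

```python
import random

def area_d(pos, neg, d, trials=20000, seed=0):
    """Monte Carlo estimate of Area(d) under the uniform-interval reading:
    widen each score into a uniform interval of width d and estimate
    P(positive draw > negative draw), counting exact ties as half."""
    rng = random.Random(seed)
    wins = 0.0
    for _ in range(trials):
        x = rng.choice(pos) + rng.uniform(-d / 2, d / 2)
        y = rng.choice(neg) + rng.uniform(-d / 2, d / 2)
        wins += 1.0 if x > y else 0.5 if x == y else 0.0
    return wins / trials

pos, neg = [0.9, 0.6, 0.2], [0.55, 0.1]
print(area_d(pos, neg, 0.0))    # ≈ AUC = 5/6 when d = 0
print(area_d(pos, neg, 100.0))  # ≈ 0.5 as d grows large
```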

Properties of pROC Curves. We have shown that there exists a d such that Area(d) = pAUC. However, is this d unique? Consider the ranking {1+, 1+, 0.6−, 0.6−, 0.5000+, 0.49999−, 0.45+, 0.45+, 0−, 0−}. The AUC of this ranking is 0.68, and the pAUC is approximately 0.67. [Figure: Area(d) plotted against d for this ranking, together with the constant pAUC line; the curves cross at more than one value of d, so d is not unique.]

pROC Curves with Normal Distributions. A more natural option is to consider a normal (Gaussian) distribution centred on each probability. The width factor is also called d, but here it represents twice the standard deviation. The results are similar, both theoretically and practically; additionally, some of the curves are expected to be smoother.

Interpretation of pROC Curves. They smooth the ROC curves, especially when the differences between the probabilities of consecutive examples of different classes are small. The width d is chosen globally, so it can give high overlap in some areas of the ranking but small overlap in others. The greater the number of examples, the closer (though smoother) the correspondence between the pROC curve and the original ROC curve. The differences matter most when ROC curves are not very reliable: few examples and small differences in probabilities.

pROC Curves for Establishing Thresholds. We can employ pROC curves to establish thresholds according to some application skew. Given the skew at application time, we obtain the slope of an iso-accuracy line, which gives us the best threshold (or the threshold interval).
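Once the slope is known, selecting the operating point reduces to a one-line maximisation over the curve's points; a sketch with hypothetical names and made-up ROC points:

```python
def best_operating_point(roc_points, slope):
    """Pick the (FPR, TPR) point maximising TPR - slope * FPR, i.e. the
    point first touched by an iso-accuracy line of the given slope."""
    return max(roc_points, key=lambda pt: pt[1] - slope * pt[0])

points = [(0.0, 0.0), (0.1, 0.5), (0.33, 1.0), (1.0, 1.0)]
print(best_operating_point(points, 3.0))  # (0.1, 0.5): steep slope favours low FPR
print(best_operating_point(points, 0.5))  # (0.33, 1.0): shallow slope favours high TPR
```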

pROC Curves for Establishing Thresholds. If we use the pROC curves, we obtain a different picture:

  Skew (slope)   FPR/TPR, ROC curve   Threshold, ROC curve   FPR/TPR, pROC curve   Threshold, pROC (original prob.)
  60º            0% / 50%             [0.9, 0.8]             9% / 9%               [0.9, 0.8]
  45º            33% / 100%           [0.6, 0.3]             40% / 58%             [0.8, 0.6]
  30º            33% / 100%           [0.6, 0.3]             76% / 95%             [0.3, 0.2]

The FPR/TPR values change much more gradually along the pROC curve than along the original ROC curve.

Conclusions. AUC has been promoted as a more suitable alternative for evaluating classifiers, but AUC concentrates only on how the model ranks the examples. pAUC takes both aspects into account: ranking and probabilities. pROC curves provide a graphical representation of classifier performance.

Conclusions and Current Work. pROC curves help to reveal details about the performance of the evaluated model that classical ROC curves do not. Current work: applying pROC curves to establish classification thresholds (in this context, we are performing experiments comparing pROC and ROC curves), and comparing pAUC theoretically and experimentally (as a selection measure) with other measures: sAUC, MSE, LogLoss, etc.