3. Model evaluation & selection

1 Foundations of Machine Learning, CentraleSupélec, Fall. 3. Model evaluation & selection. Chloé-Agathe Azencott, Centre for Computational Biology, Mines ParisTech

2 Practical matters. Scribes: one person signed up for today. Does anyone want to assist her? Two people signed up for next week. Congrats! No one has signed up after that.

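3 Generalization error vs. model complexity
[Figure: prediction error (y-axis) against model complexity (x-axis), with one curve for the error on training data and one for the error on new data. Low complexity: high bias, low variance. High complexity: low bias, high variance.]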

4 Generalization error vs. model complexity
[Figure: prediction error (y-axis) against model complexity (x-axis), on training data and on new data. Low complexity: underfitting. High complexity: overfitting.]

5 Model selection & generalization
Well-posed problems (Hadamard, on the mathematical modelling of physical phenomena):
- a solution exists;
- it is unique;
- the solution changes continuously with the initial conditions.
Learning is an ill-posed problem: data helps carve out the hypothesis space, but data alone is not sufficient to find a unique solution.
Hence the need for an inductive bias: assumptions about the hypothesis space H.
Model selection: how do we choose the right inductive bias?

6 How do we decide a model is good?

7 Learning objectives
After this lecture you should be able to design experiments to select and evaluate supervised machine learning models.
Concepts: training and testing sets; cross-validation; bootstrap; measures of performance for classifiers and regressors; measures of model complexity.

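8 Supervised learning setting
Training set: $\mathcal{D} = \{(x^i, y^i)\}_{i=1,\dots,n}$.
Classification: $y^i \in \{-1, +1\}$. Regression: $y^i \in \mathbb{R}$.
Goal: find $f$ and $\theta$ such that $f(x^i \mid \theta)$ approximates $y^i$.
Empirical error of $f$ on the training set, given a loss $L$: $E(f \mid \mathcal{D}) = \frac{1}{n} \sum_{i=1}^{n} L\left(y^i, f(x^i \mid \theta)\right)$.
E.g. (classification) the 0/1 loss: $L(y, f(x)) = 1$ if $f(x) \neq y$, and $0$ otherwise.
E.g. (regression) the squared loss: $L(y, f(x)) = (y - f(x))^2$.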

9 Validation sets
Model selection: pick the best model, i.e. the one that performs best on a validation set separate from the training set. [Data split: Training | Validation]
Model assessment: estimate the prediction error of the selected model on new data. [Data split: Training | Validation | Test]

10 How much data should go in each of the training, validation and test sets? How do we know we have enough data to evaluate the prediction and generalization errors?
- Sample re-use: cross-validation; bootstrap.
- Analytical tools: Mallows' Cp, AIC, BIC; MDL; SRM.

11 Sample re-use

12 Cross-validation
Cut the training set into k separate folds. For each fold, train on the (k-1) remaining folds and validate on that fold.
[Figure: the data split into folds; each fold serves in turn as the validation set while the other folds form the training set.]

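13 Cross-validated performance
Cross-validation estimate of the prediction error: $CV(\hat{f}) = \frac{1}{n} \sum_{i=1}^{n} L\left(y^i, \hat{f}^{-k(i)}(x^i)\right)$, where $\hat{f}^{-k(i)}$ is computed with the k(i)-th part of the data removed, and k(i) is the fold that contains i.
It estimates the expected prediction error $\mathrm{Err} = \mathbb{E}\left[L(Y, \hat{f}(X))\right]$, where (X, Y) is an (independent) test sample.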
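
The cross-validation procedure and its error estimate can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the helper names (`k_fold_indices`, `cross_validate`, `fit_ols`, `predict_ols`, `squared_loss`), the least-squares model and the synthetic data are not from the lecture.

```python
import numpy as np

def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k disjoint folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n), k)

def cross_validate(X, y, fit, predict, loss, k=5, seed=0):
    """Average, over all points, the loss of the model trained without their fold."""
    folds = k_fold_indices(len(y), k, seed)
    losses = np.empty(len(y))
    for j, val_idx in enumerate(folds):
        # Train on the k-1 other folds, predict on the held-out fold.
        train_idx = np.concatenate([f for m, f in enumerate(folds) if m != j])
        model = fit(X[train_idx], y[train_idx])
        losses[val_idx] = loss(y[val_idx], predict(model, X[val_idx]))
    return losses.mean()

# Illustrative model (an assumption): ordinary least squares with an intercept.
def fit_ols(X, y):
    A = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(A, y, rcond=None)[0]

def predict_ols(w, X):
    return np.column_stack([np.ones(len(X)), X]) @ w

def squared_loss(y, y_hat):
    return (y - y_hat) ** 2

# Synthetic data (an assumption): linear signal plus Gaussian noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 1.0 + X @ np.array([2.0, -1.0]) + rng.normal(scale=0.1, size=100)
cv_error = cross_validate(X, y, fit_ols, predict_ols, squared_loss, k=5)
print(f"5-fold CV estimate of the squared error: {cv_error:.4f}")
```

Each point is predicted exactly once, by the model that did not see its fold, which is what the k(i) notation expresses.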

14 Issues with cross-validation
Training set size becomes (K-1)n/K. Why is this a problem?

15 Issues with cross-validation
Training set size becomes (K-1)n/K: a small training set gives a biased estimator of the error.
Leave-one-out cross-validation: K = n
- approximately unbiased estimator of the expected prediction error;
- potentially high variance (the training sets are very similar to each other);
- computation can become burdensome (n repeats).
In practice: set K = 5 or K =

16 Bootstrap
Randomly draw datasets with replacement from the training data. Repeat B times (typically, B=100) to obtain B models.
Leave-one-out bootstrap error: for each training point i, predict with the $b_i < B$ models that did not have i in their training set, then average the prediction errors.
What is the size of the training sets?

17 Bootstrap
Size of the training sets: each bootstrap sample contains n examples drawn with replacement, so on average only about 0.632 n of them are distinct. This is the same issue as with cross-validation.
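
A minimal sketch of the leave-one-out bootstrap error, assuming a generic `fit` / `predict` / `loss` interface. The function name `loo_bootstrap_error`, the least-squares model without intercept and the synthetic data are illustrative assumptions, not the lecture's code.

```python
import numpy as np

def loo_bootstrap_error(X, y, fit, predict, loss, B=100, seed=0):
    """Leave-one-out bootstrap estimate of the prediction error."""
    n = len(y)
    rng = np.random.default_rng(seed)
    loss_sum = np.zeros(n)  # summed loss of point i over models that never saw it
    counts = np.zeros(n)    # b_i: number of models that never saw point i
    for _ in range(B):
        idx = rng.integers(0, n, size=n)       # n draws with replacement
        out = np.setdiff1d(np.arange(n), idx)  # points absent from this sample
        if out.size == 0:
            continue
        model = fit(X[idx], y[idx])
        loss_sum[out] += loss(y[out], predict(model, X[out]))
        counts[out] += 1
    seen = counts > 0  # ignore points that appear in every bootstrap sample
    return float(np.mean(loss_sum[seen] / counts[seen]))

# Illustrative model and data (assumptions): least squares without intercept.
fit = lambda X, y: np.linalg.lstsq(X, y, rcond=None)[0]
predict = lambda w, X: X @ w
squared_loss = lambda y, y_hat: (y - y_hat) ** 2

rng = np.random.default_rng(1)
X = rng.normal(size=(80, 1))
y = 3.0 * X[:, 0] + rng.normal(scale=0.5, size=80)
err = loo_bootstrap_error(X, y, fit, predict, squared_loss, B=100)
print(f"Leave-one-out bootstrap error: {err:.3f}")
```

Counting the distinct indices in one draw (`len(np.unique(idx)) / n`) is a quick way to check the training-set size issue empirically.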

18 Evaluating model performance

19 Classification model evaluation
Confusion matrix:

                     True class -1      True class +1
Predicted class -1   True Negatives     False Negatives
Predicted class +1   False Positives    True Positives

False positives (false alarms) are also called type I errors. False negatives (misses) are also called type II errors.

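20 Sensitivity = Recall = True positive rate (TPR) = $\frac{TP}{TP + FN}$, where TP + FN = # positives.
Specificity = True negative rate (TNR) = $\frac{TN}{TN + FP}$.
Precision = Positive predictive value (PPV) = $\frac{TP}{TP + FP}$, where TP + FP = # predicted positives.
False discovery rate (FDR) = $\frac{FP}{TP + FP}$ = 1 - PPV.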

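21 Accuracy = $\frac{TP + TN}{TP + FP + TN + FN}$.
F1-score = harmonic mean of precision and sensitivity = $\frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}} = \frac{2\,TP}{2\,TP + FP + FN}$.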
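
As a quick check of these definitions, the following sketch computes all of them from the four cells of a confusion matrix. The function name `classification_metrics` and the counts are hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)  # TP / # predicted positives
    recall = tp / (tp + fn)     # TP / # positives
    return {
        "sensitivity": recall,
        "specificity": tn / (tn + fp),
        "precision": precision,
        "fdr": fp / (tp + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "f1": 2 * precision * recall / (precision + recall),
    }

# Hypothetical counts, for illustration only.
m = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

The sketch does not guard against empty denominators (e.g. no predicted positives), which a real implementation would need to handle.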

22 Example: Pap smear
4,000 apparently healthy women of age 40+, tested for cervical cancer through pap smear and histology (gold standard).

                 Cancer    No cancer    Total
Positive test       190          210      400
Negative test        10        3,590    3,600
Total               200        3,800    4,000

What are the sensitivity, specificity, and PPV of the test?

23 Sensitivity = Recall = True positive rate (TPR) = $\frac{TP}{TP + FN}$.
Specificity = True negative rate (TNR) = $\frac{TN}{TN + FP}$.
Precision = Positive predictive value (PPV) = $\frac{TP}{TP + FP}$.

24 In this population:
Sensitivity = 190/200 = 95.0 %. Specificity = 3590/3800 = 94.5 %. PPV = 190/400 = 47.5 %.
Prevalence of the disease = 200/4000 = 0.05.
P(cancer | positive test) = PPV = 47.5 %.
P(no cancer | negative test) = 3590/3600 = 99.7 %.
Hence a poor diagnosis tool, but a good screening tool.
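
The link between prevalence and predictive values follows from Bayes' rule. The sketch below recomputes the two conditional probabilities of this example from the sensitivity, the specificity and the prevalence; the function names `ppv` and `npv` are illustrative, and the specificity is written as 3590/3800 on the assumption that the 3,590 true negatives are counted among the 4000 - 200 = 3800 women without cancer.

```python
def ppv(sensitivity, specificity, prevalence):
    """P(disease | positive test), by Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

def npv(sensitivity, specificity, prevalence):
    """P(no disease | negative test), by Bayes' rule."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)

sens, spec, prev = 0.95, 3590 / 3800, 200 / 4000
print(f"PPV = {ppv(sens, spec, prev):.1%}")
print(f"NPV = {npv(sens, spec, prev):.1%}")

# The same test applied to a population with a higher (hypothetical) prevalence.
print(f"PPV at prevalence 0.5 = {ppv(sens, spec, 0.5):.1%}")
```

With the same sensitivity and specificity, the PPV rises as the prevalence rises, which is why a test can screen well in a low-prevalence population and still be a poor diagnosis tool there.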

25 ROC curves
ROC = Receiver Operating Characteristic. Plot TPR vs FPR for all possible thresholds. Summarized by the area under the curve (AUROC).
[Figure: true positive rate (y-axis, up to 1) against false positive rate (x-axis, up to 1). One end of the curve is marked: threshold = ?]

26 ROC curves
Plot TPR vs FPR for all possible thresholds. At the top-right end of the curve (TPR = FPR = 1): threshold = smallest predicted value. At the other end of the curve: threshold = ?

27 ROC curves
Plot TPR vs FPR for all possible thresholds. At the top-right end of the curve (TPR = FPR = 1): threshold = smallest predicted value. At the bottom-left end of the curve: threshold = largest predicted value.
What is the ROC curve of:
- a random classifier?
- a perfect classifier?

28 ROC curves
Perfect classifier: AUROC = 1.0. Random classifier: AUROC = 0.5. Our classifier: 0.5 < AUROC < 1.0.
[Figure: ROC curves of a perfect classifier and of a random classifier, in the (false positive rate, true positive rate) plane.]
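
The threshold sweep behind a ROC curve can be written directly. This is a minimal sketch that assumes labels in {-1, +1} and the convention "predict +1 when the score is at least the threshold"; the function names and the toy scores are illustrative.

```python
import numpy as np

def roc_curve(y_true, scores):
    """FPR and TPR obtained by sweeping the threshold over all predicted values."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    pos = y_true == 1
    neg = ~pos
    # Start above every score: nothing is predicted positive, so TPR = FPR = 0.
    fpr, tpr = [0.0], [0.0]
    for t in np.unique(scores)[::-1]:  # from largest to smallest predicted value
        pred_pos = scores >= t
        tpr.append(np.sum(pred_pos & pos) / pos.sum())
        fpr.append(np.sum(pred_pos & neg) / neg.sum())
    return np.array(fpr), np.array(tpr)

def auroc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))

y = [-1, -1, 1, 1]
some_scores = [0.1, 0.4, 0.35, 0.8]    # one negative is ranked above one positive
perfect_scores = [0.1, 0.2, 0.8, 0.9]  # every positive is ranked above every negative

print(auroc(*roc_curve(y, some_scores)))
print(auroc(*roc_curve(y, perfect_scores)))
```

The last threshold visited is the smallest predicted value, at which every point is predicted positive and the curve reaches TPR = FPR = 1.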

29 Predicting breast cancer risk based on mammography images, SNPs, or both. Liu J, Page D, Nassif H, et al. (2013). Genetic Variants Improve Breast Cancer Risk Prediction on Mammograms. AMIA Annual Symposium Proceedings.
[Figure: ROC curves of the three methods. Specificity = 1 - FPR.]
Which method outperforms the others? Is a low FPR or a high TPR preferable in a clinical setting?

30 Predicting breast cancer risk based on mammography images, SNPs, or both (Liu et al., 2013).
Specificity = 1 - FPR.
High recall = fewer chances to miss a case.
High specificity / low FPR = fewer false alarms.

31 Precision-Recall curves
Plot precision (= positive predictive value, PPV) against recall (= sensitivity = true positive rate, TPR).
[Figure: precision (y-axis, up to 1) against recall (x-axis, up to 1), with the good corner and the bad corner marked.]
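
A precision-recall curve is built with the same threshold sweep as a ROC curve. The sketch below assumes labels in {-1, +1} and the convention "predict +1 when the score is at least the threshold"; the function name and the toy scores are illustrative.

```python
import numpy as np

def precision_recall_curve(y_true, scores):
    """Precision and recall for every threshold taken among the predicted values."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    pos = y_true == 1
    precision, recall = [], []
    for t in np.unique(scores)[::-1]:  # from largest to smallest predicted value
        pred_pos = scores >= t         # never empty: t is one of the scores
        tp = np.sum(pred_pos & pos)
        precision.append(tp / pred_pos.sum())
        recall.append(tp / pos.sum())
    return np.array(precision), np.array(recall)

p, r = precision_recall_curve([-1, -1, 1, 1], [0.1, 0.4, 0.35, 0.8])
for pi, ri in zip(p, r):
    print(f"recall = {ri:.2f}, precision = {pi:.2f}")
```

Unlike the TPR and the FPR, precision depends on the proportion of positives in the data, so a PR curve changes when the class balance changes.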

32 Predicting breast cancer risk based on mammography images, SNPs, or both (Liu et al., 2013).
[Figure: precision-recall curves of the three methods. Recall = sensitivity = TPR; precision = PPV.]
Which method has the highest area under the PR curve? Is a high recall or a high precision preferable in a clinical setting?

33 Predicting breast cancer risk based on mammography images, SNPs, or both (Liu et al., 2013).
High recall = fewer chances to miss a case.
High precision = substantially more true diagnoses than false alarms.

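34 Regression model evaluation
Residual sum of squares: $RSS = \sum_{i=1}^{n} \left(y^i - f(x^i)\right)^2$.
Root-mean squared error: $RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(y^i - f(x^i)\right)^2}$.
Relative squared error: $RSE = \frac{\sum_{i=1}^{n} \left(y^i - f(x^i)\right)^2}{\sum_{i=1}^{n} \left(y^i - \bar{y}\right)^2}$, with $\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y^i$.
Coefficient of determination: $R^2 = 1 - RSE$.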
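
These four quantities are closely related, as the following sketch shows. It uses the standard definitions (residual sum of squares, its root mean, its ratio to the total sum of squares, and one minus that ratio); the function name and the toy values are hypothetical.

```python
import numpy as np

def regression_metrics(y, y_hat):
    """RSS, RMSE, relative squared error and coefficient of determination."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    rss = np.sum((y - y_hat) ** 2)     # residual sum of squares
    tss = np.sum((y - y.mean()) ** 2)  # squared error of predicting the mean
    rse = rss / tss
    return {
        "rss": float(rss),
        "rmse": float(np.sqrt(rss / len(y))),
        "rse": float(rse),
        "r2": float(1 - rse),
    }

# Toy values, for illustration only.
m = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
print(m)
```

A model that always predicts the mean of the outputs has RSE = 1 and R² = 0, which gives a natural baseline for both measures.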

35 Analytical tools and model complexity

36 Penalizing model complexity
Augmented error: E' = empirical error + λ × model complexity.
If λ is small, models that fit the training data well are encouraged (risk of introducing variance).
If λ is large, simpler models are encouraged (risk of introducing bias).
λ can be set by cross-validation. In some cases (cf. Chap. 6), it is possible to estimate E' for all values of λ.
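
One concrete instance of this trade-off, chosen here as an assumption, is ridge regression: the empirical error is the squared error and the model complexity is the squared norm of the weights. The sketch below picks λ from a small grid by 5-fold cross-validation; the data, the grid and the function names are illustrative.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Minimize ||y - Xw||^2 + lam * ||w||^2 (closed form, no intercept)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def cv_error(X, y, lam, k=5, seed=0):
    """k-fold cross-validated mean squared error of ridge regression."""
    folds = np.array_split(np.random.default_rng(seed).permutation(len(y)), k)
    errors = []
    for j in range(k):
        val = folds[j]
        train = np.concatenate([folds[m] for m in range(k) if m != j])
        w = fit_ridge(X[train], y[train], lam)
        errors.append(np.mean((y[val] - X[val] @ w) ** 2))
    return float(np.mean(errors))

# Synthetic data (an assumption): 20 features, only 3 of them informative.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 20))
w_true = np.zeros(20)
w_true[:3] = [2.0, -1.0, 0.5]
y = X @ w_true + rng.normal(scale=1.0, size=60)

lambdas = [0.01, 0.1, 1.0, 10.0, 100.0, 1000.0]
errors = [cv_error(X, y, lam) for lam in lambdas]
best_lambda = lambdas[int(np.argmin(errors))]
for lam, e in zip(lambdas, errors):
    print(f"lambda = {lam:g}: CV error = {e:.3f}")
print("selected lambda:", best_lambda)
```

The same folds are reused for every value of λ (fixed `seed`), so the comparison between values is not blurred by different random splits.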

37 Cp, AIC and BIC
Augmented error: E' = empirical error + optimism term.
The optimism term estimates the discrepancy between training and test error without any need for cross-validation:
- Mallows' Cp (linear regression + squared error), built from the empirical error, the number of parameters used and an estimate of the error variance;
- Akaike's Information Criterion (AIC);
- Bayesian Information Criterion (BIC).
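
For reference, one common way to write the three criteria is given below, following the conventions of The Elements of Statistical Learning. The symbols are assumptions of this sketch: d is the number of parameters used, n the number of training samples, \overline{err} the empirical (training) error, \hat{\sigma}^2 an estimate of the error variance and loglik the maximized log-likelihood. The lecture's own formulas may differ by a scaling factor.

```latex
\begin{align}
  C_p          &= \overline{\mathrm{err}} + 2 \, \frac{d}{n} \, \hat{\sigma}^2 \\
  \mathrm{AIC} &= -\frac{2}{n} \, \mathrm{loglik} + 2 \, \frac{d}{n} \\
  \mathrm{BIC} &= -2 \, \mathrm{loglik} + (\log n) \, d
\end{align}
```

Once AIC is multiplied by n to put both criteria on the same scale, the two differ only in the penalty per parameter: 2 for AIC against log n for BIC. BIC therefore favours simpler models as soon as log n > 2.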

38 Minimum description length (MDL)
The shortest code to transmit a random variable z has length $-\log P(z)$ [Shannon's information theory].
Assume the receiver knows the inputs X and the model f. To transmit the outputs y, we need a message of length $-\log P(y \mid \theta, f, X) - \log P(\theta \mid f)$, where:
- $-\log P(\theta \mid f)$ is the average code length to transmit θ;
- $-\log P(y \mid \theta, f, X)$ is the average code length to transmit the difference between model prediction and true outputs.
Choose the model with the smallest length.

39 Structural risk minimization (SRM)
Fit a nested sequence of models of increasing VC dimensions $h_1 < h_2 < \dots$
Pick the one with the lowest upper bound on the test error.
E.g. regression: with probability at least $(1 - \eta)$, the test error is bounded by a function of the training error, n and the VC-dimension h.
What happens when n gets larger?

40 Summary: model selection techniques
- Cross-validation: estimate generalization accuracy empirically.
- Regularization: penalize complex models. E' = empirical error + λ × model complexity.
- Mallows' Cp, Akaike's / Bayesian Information Criteria.
- Minimum description length (MDL). Kolmogorov complexity = shortest description of data [information theory].
- Structural risk minimization (SRM): order models by complexity (polynomials of increasing degree; values of λ).
- Bayesian model selection.

41 ML toolboxes
- Python: scikit-learn
- R: Machine Learning Task View
- Matlab: Machine Learning with MATLAB; Statistics and Machine Learning Toolbox; Neural Network Toolbox

42 Getting started with Python. I highly recommend

43 References
- Linear algebra:
- Statistics & probabilities: Probability theory: A primer (Jeremy Kun); Probability Primer (Jeffrey Miller)
- More on entropy encoding:
- Textbook: The Elements of Statistical Learning. Hastie, Tibshirani, Friedman (2009)

44 Practical matters. Make sure you have handed in HW1. HW2 is online, due Sep. 21. Lab

4. Model evaluation & selection

4. Model evaluation & selection Foundations of Machine Learning CentraleSupélec Fall 2017 4. Model evaluation & selection Chloé-Agathe Azencot Centre for Computational Biology, Mines ParisTech chloe-agathe.azencott@mines-paristech.fr

More information

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Midterm, 2016

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Midterm, 2016 UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Midterm, 2016 Exam policy: This exam allows one one-page, two-sided cheat sheet; No other materials. Time: 80 minutes. Be sure to write your name and

More information

Machine Learning to Inform Breast Cancer Post-Recovery Surveillance

Machine Learning to Inform Breast Cancer Post-Recovery Surveillance Machine Learning to Inform Breast Cancer Post-Recovery Surveillance Final Project Report CS 229 Autumn 2017 Category: Life Sciences Maxwell Allman (mallman) Lin Fan (linfan) Jamie Kang (kangjh) 1 Introduction

More information

INTRODUCTION TO MACHINE LEARNING. Decision tree learning

INTRODUCTION TO MACHINE LEARNING. Decision tree learning INTRODUCTION TO MACHINE LEARNING Decision tree learning Task of classification Automatically assign class to observations with features Observation: vector of features, with a class Automatically assign

More information

METHODS FOR DETECTING CERVICAL CANCER

METHODS FOR DETECTING CERVICAL CANCER Chapter III METHODS FOR DETECTING CERVICAL CANCER 3.1 INTRODUCTION The successful detection of cervical cancer in a variety of tissues has been reported by many researchers and baseline figures for the

More information

Review: Logistic regression, Gaussian naïve Bayes, linear regression, and their connections

Review: Logistic regression, Gaussian naïve Bayes, linear regression, and their connections Review: Logistic regression, Gaussian naïve Bayes, linear regression, and their connections New: Bias-variance decomposition, biasvariance tradeoff, overfitting, regularization, and feature selection Yi

More information

Module Overview. What is a Marker? Part 1 Overview

Module Overview. What is a Marker? Part 1 Overview SISCR Module 7 Part I: Introduction Basic Concepts for Binary Classification Tools and Continuous Biomarkers Kathleen Kerr, Ph.D. Associate Professor Department of Biostatistics University of Washington

More information

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014 UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014 Exam policy: This exam allows two one-page, two-sided cheat sheets (i.e. 4 sides); No other materials. Time: 2 hours. Be sure to write

More information

Week 2 Video 3. Diagnostic Metrics

Week 2 Video 3. Diagnostic Metrics Week 2 Video 3 Diagnostic Metrics Different Methods, Different Measures Today we ll continue our focus on classifiers Later this week we ll discuss regressors And other methods will get worked in later

More information

Learning from data when all models are wrong

Learning from data when all models are wrong Learning from data when all models are wrong Peter Grünwald CWI / Leiden Menu Two Pictures 1. Introduction 2. Learning when Models are Seriously Wrong Joint work with John Langford, Tim van Erven, Steven

More information

SISCR Module 7 Part I: Introduction Basic Concepts for Binary Biomarkers (Classifiers) and Continuous Biomarkers

SISCR Module 7 Part I: Introduction Basic Concepts for Binary Biomarkers (Classifiers) and Continuous Biomarkers SISCR Module 7 Part I: Introduction Basic Concepts for Binary Biomarkers (Classifiers) and Continuous Biomarkers Kathleen Kerr, Ph.D. Associate Professor Department of Biostatistics University of Washington

More information

Predictive Models for Healthcare Analytics

Predictive Models for Healthcare Analytics Predictive Models for Healthcare Analytics A Case on Retrospective Clinical Study Mengling Mornin Feng mfeng@mit.edu mornin@gmail.com 1 Learning Objectives After the lecture, students should be able to:

More information

Article from. Forecasting and Futurism. Month Year July 2015 Issue Number 11

Article from. Forecasting and Futurism. Month Year July 2015 Issue Number 11 Article from Forecasting and Futurism Month Year July 2015 Issue Number 11 Calibrating Risk Score Model with Partial Credibility By Shea Parkes and Brad Armstrong Risk adjustment models are commonly used

More information

Week 8 Hour 1: More on polynomial fits. The AIC. Hour 2: Dummy Variables what are they? An NHL Example. Hour 3: Interactions. The stepwise method.

Week 8 Hour 1: More on polynomial fits. The AIC. Hour 2: Dummy Variables what are they? An NHL Example. Hour 3: Interactions. The stepwise method. Week 8 Hour 1: More on polynomial fits. The AIC Hour 2: Dummy Variables what are they? An NHL Example Hour 3: Interactions. The stepwise method. Stat 302 Notes. Week 8, Hour 1, Page 1 / 34 Human growth

More information

Selection and Combination of Markers for Prediction

Selection and Combination of Markers for Prediction Selection and Combination of Markers for Prediction NACC Data and Methods Meeting September, 2010 Baojiang Chen, PhD Sarah Monsell, MS Xiao-Hua Andrew Zhou, PhD Overview 1. Research motivation 2. Describe

More information

Lec 02: Estimation & Hypothesis Testing in Animal Ecology

Lec 02: Estimation & Hypothesis Testing in Animal Ecology Lec 02: Estimation & Hypothesis Testing in Animal Ecology Parameter Estimation from Samples Samples We typically observe systems incompletely, i.e., we sample according to a designed protocol. We then

More information

Meta-Analysis. Zifei Liu. Biological and Agricultural Engineering

Meta-Analysis. Zifei Liu. Biological and Agricultural Engineering Meta-Analysis Zifei Liu What is a meta-analysis; why perform a metaanalysis? How a meta-analysis work some basic concepts and principles Steps of Meta-analysis Cautions on meta-analysis 2 What is Meta-analysis

More information

Introduction to Machine Learning. Katherine Heller Deep Learning Summer School 2018

Introduction to Machine Learning. Katherine Heller Deep Learning Summer School 2018 Introduction to Machine Learning Katherine Heller Deep Learning Summer School 2018 Outline Kinds of machine learning Linear regression Regularization Bayesian methods Logistic Regression Why we do this

More information

A Learning Method of Directly Optimizing Classifier Performance at Local Operating Range

A Learning Method of Directly Optimizing Classifier Performance at Local Operating Range A Learning Method of Directly Optimizing Classifier Performance at Local Operating Range Lae-Jeong Park and Jung-Ho Moon Department of Electrical Engineering, Kangnung National University Kangnung, Gangwon-Do,

More information

An Improved Algorithm To Predict Recurrence Of Breast Cancer

An Improved Algorithm To Predict Recurrence Of Breast Cancer An Improved Algorithm To Predict Recurrence Of Breast Cancer Umang Agrawal 1, Ass. Prof. Ishan K Rajani 2 1 M.E Computer Engineer, Silver Oak College of Engineering & Technology, Gujarat, India. 2 Assistant

More information

Search settings MaxQuant

Search settings MaxQuant Search settings MaxQuant Briefly, we used MaxQuant version 1.5.0.0 with the following settings. As variable modifications we allowed Acetyl (Protein N-terminus), methionine oxidation and glutamine to pyroglutamate

More information

Sensitivity, specicity, ROC

Sensitivity, specicity, ROC Sensitivity, specicity, ROC Thomas Alexander Gerds Department of Biostatistics, University of Copenhagen 1 / 53 Epilog: disease prevalence The prevalence is the proportion of cases in the population today.

More information

Classification. Methods Course: Gene Expression Data Analysis -Day Five. Rainer Spang

Classification. Methods Course: Gene Expression Data Analysis -Day Five. Rainer Spang Classification Methods Course: Gene Expression Data Analysis -Day Five Rainer Spang Ms. Smith DNA Chip of Ms. Smith Expression profile of Ms. Smith Ms. Smith 30.000 properties of Ms. Smith The expression

More information

What is Regularization? Example by Sean Owen

What is Regularization? Example by Sean Owen What is Regularization? Example by Sean Owen What is Regularization? Name3 Species Size Threat Bo snake small friendly Miley dog small friendly Fifi cat small enemy Muffy cat small friendly Rufus dog large

More information

MS&E 226: Small Data

MS&E 226: Small Data MS&E 226: Small Data Lecture 10: Introduction to inference (v2) Ramesh Johari ramesh.johari@stanford.edu 1 / 17 What is inference? 2 / 17 Where did our data come from? Recall our sample is: Y, the vector

More information

CSE 258 Lecture 2. Web Mining and Recommender Systems. Supervised learning Regression

CSE 258 Lecture 2. Web Mining and Recommender Systems. Supervised learning Regression CSE 258 Lecture 2 Web Mining and Recommender Systems Supervised learning Regression Supervised versus unsupervised learning Learning approaches attempt to model data in order to solve a problem Unsupervised

More information

STATISTICAL METHODS FOR DIAGNOSTIC TESTING: AN ILLUSTRATION USING A NEW METHOD FOR CANCER DETECTION XIN SUN. PhD, Kansas State University, 2012

STATISTICAL METHODS FOR DIAGNOSTIC TESTING: AN ILLUSTRATION USING A NEW METHOD FOR CANCER DETECTION XIN SUN. PhD, Kansas State University, 2012 STATISTICAL METHODS FOR DIAGNOSTIC TESTING: AN ILLUSTRATION USING A NEW METHOD FOR CANCER DETECTION by XIN SUN PhD, Kansas State University, 2012 A THESIS Submitted in partial fulfillment of the requirements

More information

MULTIPLE LINEAR REGRESSION 24.1 INTRODUCTION AND OBJECTIVES OBJECTIVES

MULTIPLE LINEAR REGRESSION 24.1 INTRODUCTION AND OBJECTIVES OBJECTIVES 24 MULTIPLE LINEAR REGRESSION 24.1 INTRODUCTION AND OBJECTIVES In the previous chapter, simple linear regression was used when you have one independent variable and one dependent variable. This chapter

More information

Biostatistics II

Biostatistics II Biostatistics II 514-5509 Course Description: Modern multivariable statistical analysis based on the concept of generalized linear models. Includes linear, logistic, and Poisson regression, survival analysis,

More information

BMI 541/699 Lecture 16

BMI 541/699 Lecture 16 BMI 541/699 Lecture 16 Where we are: 1. Introduction and Experimental Design 2. Exploratory Data Analysis 3. Probability 4. T-based methods for continous variables 5. Proportions & contingency tables -

More information

Yeast Cells Classification Machine Learning Approach to Discriminate Saccharomyces cerevisiae Yeast Cells Using Sophisticated Image Features.

Yeast Cells Classification Machine Learning Approach to Discriminate Saccharomyces cerevisiae Yeast Cells Using Sophisticated Image Features. Yeast Cells Classification Machine Learning Approach to Discriminate Saccharomyces cerevisiae Yeast Cells Using Sophisticated Image Features. Mohamed Tleis Supervisor: Fons J. Verbeek Leiden University

More information

A Comparison of Collaborative Filtering Methods for Medication Reconciliation

A Comparison of Collaborative Filtering Methods for Medication Reconciliation A Comparison of Collaborative Filtering Methods for Medication Reconciliation Huanian Zheng, Rema Padman, Daniel B. Neill The H. John Heinz III College, Carnegie Mellon University, Pittsburgh, PA, 15213,

More information

Week 2 Video 2. Diagnostic Metrics, Part 1

Week 2 Video 2. Diagnostic Metrics, Part 1 Week 2 Video 2 Diagnostic Metrics, Part 1 Different Methods, Different Measures Today we ll focus on metrics for classifiers Later this week we ll discuss metrics for regressors And metrics for other methods

More information

Logistic Regression and Bayesian Approaches in Modeling Acceptance of Male Circumcision in Pune, India

Logistic Regression and Bayesian Approaches in Modeling Acceptance of Male Circumcision in Pune, India 20th International Congress on Modelling and Simulation, Adelaide, Australia, 1 6 December 2013 www.mssanz.org.au/modsim2013 Logistic Regression and Bayesian Approaches in Modeling Acceptance of Male Circumcision

More information

Deep learning and non-negative matrix factorization in recognition of mammograms

Deep learning and non-negative matrix factorization in recognition of mammograms Deep learning and non-negative matrix factorization in recognition of mammograms Bartosz Swiderski Faculty of Applied Informatics and Mathematics Warsaw University of Life Sciences, Warsaw, Poland bartosz_swiderski@sggw.pl

More information

Predicting Breast Cancer Survivability Rates

Predicting Breast Cancer Survivability Rates Predicting Breast Cancer Survivability Rates For data collected from Saudi Arabia Registries Ghofran Othoum 1 and Wadee Al-Halabi 2 1 Computer Science, Effat University, Jeddah, Saudi Arabia 2 Computer

More information

Applying Machine Learning Methods in Medical Research Studies

Applying Machine Learning Methods in Medical Research Studies Applying Machine Learning Methods in Medical Research Studies Daniel Stahl Department of Biostatistics and Health Informatics Psychiatry, Psychology & Neuroscience (IoPPN), King s College London daniel.r.stahl@kcl.ac.uk

More information

Business Statistics Probability

Business Statistics Probability Business Statistics The following was provided by Dr. Suzanne Delaney, and is a comprehensive review of Business Statistics. The workshop instructor will provide relevant examples during the Skills Assessment

More information

Stepwise method Modern Model Selection Methods Quantile-Quantile plot and tests for normality

Stepwise method Modern Model Selection Methods Quantile-Quantile plot and tests for normality Week 9 Hour 3 Stepwise method Modern Model Selection Methods Quantile-Quantile plot and tests for normality Stat 302 Notes. Week 9, Hour 3, Page 1 / 39 Stepwise Now that we've introduced interactions,

More information

Response to Mease and Wyner, Evidence Contrary to the Statistical View of Boosting, JMLR 9:1 26, 2008

Response to Mease and Wyner, Evidence Contrary to the Statistical View of Boosting, JMLR 9:1 26, 2008 Journal of Machine Learning Research 9 (2008) 59-64 Published 1/08 Response to Mease and Wyner, Evidence Contrary to the Statistical View of Boosting, JMLR 9:1 26, 2008 Jerome Friedman Trevor Hastie Robert

More information

Evaluation of diagnostic tests

Evaluation of diagnostic tests Evaluation of diagnostic tests Biostatistics and informatics Miklós Kellermayer Overlapping distributions Assumption: A classifier value (e.g., diagnostic parameter, a measurable quantity, e.g., serum

More information

Sensitivity, Specificity, and Relatives

Sensitivity, Specificity, and Relatives Sensitivity, Specificity, and Relatives Brani Vidakovic ISyE 6421/ BMED 6700 Vidakovic, B. Se Sp and Relatives January 17, 2017 1 / 26 Overview Today: Vidakovic, B. Se Sp and Relatives January 17, 2017

More information

Inferential Statistics

Inferential Statistics Inferential Statistics and t - tests ScWk 242 Session 9 Slides Inferential Statistics Ø Inferential statistics are used to test hypotheses about the relationship between the independent and the dependent

More information

Review. Imagine the following table being obtained as a random. Decision Test Diseased Not Diseased Positive TP FP Negative FN TN

Review. Imagine the following table being obtained as a random. Decision Test Diseased Not Diseased Positive TP FP Negative FN TN Outline 1. Review sensitivity and specificity 2. Define an ROC curve 3. Define AUC 4. Non-parametric tests for whether or not the test is informative 5. Introduce the binormal ROC model 6. Discuss non-parametric

More information

Progress in Risk Science and Causality

Progress in Risk Science and Causality Progress in Risk Science and Causality Tony Cox, tcoxdenver@aol.com AAPCA March 27, 2017 1 Vision for causal analytics Represent understanding of how the world works by an explicit causal model. Learn,

More information

A Brief Introduction to Bayesian Statistics

A Brief Introduction to Bayesian Statistics A Brief Introduction to Statistics David Kaplan Department of Educational Psychology Methods for Social Policy Research and, Washington, DC 2017 1 / 37 The Reverend Thomas Bayes, 1701 1761 2 / 37 Pierre-Simon

More information

Prediction of Malignant and Benign Tumor using Machine Learning

Prediction of Malignant and Benign Tumor using Machine Learning Prediction of Malignant and Benign Tumor using Machine Learning Ashish Shah Department of Computer Science and Engineering Manipal Institute of Technology, Manipal University, Manipal, Karnataka, India

More information

Testing Statistical Models to Improve Screening of Lung Cancer

Testing Statistical Models to Improve Screening of Lung Cancer Testing Statistical Models to Improve Screening of Lung Cancer 1 Elliot Burghardt: University of Iowa Daren Kuwaye: University of Hawai i at Mānoa Iowa Summer Institute in Biostatistics - University of

More information

Intelligent Systems. Discriminative Learning. Parts marked by * are optional. WS2013/2014 Carsten Rother, Dmitrij Schlesinger

Intelligent Systems. Discriminative Learning. Parts marked by * are optional. WS2013/2014 Carsten Rother, Dmitrij Schlesinger Intelligent Systems Discriminative Learning Parts marked by * are optional 30/12/2013 WS2013/2014 Carsten Rother, Dmitrij Schlesinger Discriminative models There exists a joint probability distribution

More information

Problem 1) Match the terms to their definitions. Every term is used exactly once. (In the real midterm, there are fewer terms).

Problem 1) Match the terms to their definitions. Every term is used exactly once. (In the real midterm, there are fewer terms). Problem 1) Match the terms to their definitions. Every term is used exactly once. (In the real midterm, there are fewer terms). 1. Bayesian Information Criterion 2. Cross-Validation 3. Robust 4. Imputation

More information

Lecture Outline. Biost 590: Statistical Consulting. Stages of Scientific Studies. Scientific Method

Lecture Outline. Biost 590: Statistical Consulting. Stages of Scientific Studies. Scientific Method Biost 590: Statistical Consulting Statistical Classification of Scientific Studies; Approach to Consulting Lecture Outline Statistical Classification of Scientific Studies Statistical Tasks Approach to

More information

Machine learning II. Juhan Ernits ITI8600

Machine learning II. Juhan Ernits ITI8600 Machine learning II Juhan Ernits ITI8600 Hand written digit recognition 64 Example 2: Face recogition Classification, regression or unsupervised? How many classes? Example 2: Face recognition Classification,

More information

Behavioral Data Mining. Lecture 4 Measurement

Behavioral Data Mining. Lecture 4 Measurement Behavioral Data Mining Lecture 4 Measurement Outline Hypothesis testing Parametric statistical tests Non-parametric tests Precision-Recall plots ROC plots Hardware update Icluster machines are ready for

More information

Chapter 11: Advanced Remedial Measures. Weighted Least Squares (WLS)

Chapter 11: Advanced Remedial Measures. Weighted Least Squares (WLS) Chapter : Advanced Remedial Measures Weighted Least Squares (WLS) When the error variance appears nonconstant, a transformation (of Y and/or X) is a quick remedy. But it may not solve the problem, or it

More information

Part [2.1]: Evaluation of Markers for Treatment Selection Linking Clinical and Statistical Goals
