Supersparse Linear Integer Models for Interpretable Prediction. Berk Ustun, Stefano Tracà, Cynthia Rudin. INFORMS 2013
1 Supersparse Linear Integer Models for Interpretable Prediction Berk Ustun Stefano Tracà Cynthia Rudin INFORMS 2013
2 CHADS2 Scoring System
Condition: Points
Congestive heart failure: 1
Hypertension: 1
Age ≥ 75 years: 1
Diabetes mellitus: 1
Prior stroke, TIA, or thromboembolism: 2
If Total Points ≥ 4 → Predict Stroke
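The scoring table above can be sketched in a few lines of Python. This is only an illustration: the dictionary keys and function names are hypothetical, and the decision rule assumes the comparison is "at least 4 points", since the comparison symbol does not survive in the transcript.

```python
# Point values transcribed from the CHADS2 table on the slide.
# The key names are hypothetical labels chosen for this sketch.
CHADS2_POINTS = {
    "congestive_heart_failure": 1,
    "hypertension": 1,
    "age_75_or_older": 1,
    "diabetes_mellitus": 1,
    "prior_stroke_tia_or_thromboembolism": 2,
}


def chads2_score(conditions):
    """Sum the points of every condition marked as present (truthy)."""
    return sum(
        points
        for name, points in CHADS2_POINTS.items()
        if conditions.get(name, False)
    )


def predict_stroke(conditions, threshold=4):
    """Apply the slide's decision rule.

    Assumption: the rule is 'score >= threshold'; the comparison symbol
    is missing from the transcript.
    """
    return chads2_score(conditions) >= threshold


patient = {
    "hypertension": True,
    "age_75_or_older": True,
    "prior_stroke_tia_or_thromboembolism": True,
}
print(chads2_score(patient))    # 1 + 1 + 2 = 4
print(predict_stroke(patient))  # True under the assumed ">=" rule
```

The whole model is five small integers and one threshold, so it can be applied by hand without a computer.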
3 What Makes CHADS2 Interpretable?
Condition (Points): Congestive heart failure (1), Hypertension (1), Age ≥ 75 years (1), Diabetes mellitus (1), Prior Stroke or TIA or Thromboembolism (2)
Few Terms → Sparse
Limited Coefficient Values → Meaningful
Sign-constrained Relationships → Intuitive
Predictions Without Computers → Easy-to-Use
4 State of Interpretable Classification
[Figure: classifiers arranged along two axes, accuracy (Less Accurate to More Accurate) and interpretability (More Interpretable to Less Interpretable). Methods shown: Medical Scoring Systems, Logistic Regression, LARS, Decision Rules, Decision Trees, Support Vector Machines, Random Forests]
5 Today
Methodology: Designing a Model That is Accurate and Interpretable
Performance: Comparison to State-of-the-Art Classification Models
Medical Application: mammo
Criminology Application: violentcrime
7 Supersparse Linear Integer Models: Setup
Linear Model: $\hat{y} = \mathrm{sign}(\lambda^T x)$
Error Occurs When: $y \hat{y} \le 0 \iff y \, \mathrm{sign}(\lambda^T x) \le 0 \iff y \lambda^T x \le 0$
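The error condition on this slide can be checked numerically. The sketch below assumes labels in {-1, +1} and counts an example as an error whenever the label times the linear score is at most zero. The names `coefs`, `margin`, and `zero_one_loss` are hypothetical; `coefs` stands for the coefficient vector of the linear model.

```python
def margin(coefs, x, y):
    """Return y * (coefs . x) for one example with label y in {-1, +1}."""
    return y * sum(c * xi for c, xi in zip(coefs, x))


def zero_one_loss(coefs, X, Y):
    """Fraction of examples whose margin is <= 0, i.e. counted as errors."""
    errors = sum(1 for x, y in zip(X, Y) if margin(coefs, x, y) <= 0)
    return errors / len(X)


coefs = [1, -2]
X = [[1, 0], [0, 1], [1, 1], [2, 1]]
Y = [1, -1, 1, -1]

# Linear scores are 1, -2, -1, 0, so the margins are 1, 2, -1, 0.
# The last two examples have margin <= 0 and count as errors.
print(zero_one_loss(coefs, X, Y))  # 0.5
```

Note that a score of exactly zero counts as an error under this condition, whatever the label.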
8 Supersparse Linear Integer Models: Objective
0-1 Loss → Accuracy
L0 Norm → Sparsity
L1 Norm → Meaningful Coefficients & Ease-of-Use
User-Defined Coefficients → Meaningful Coefficients & Ease-of-Use
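To make the pieces of the objective concrete, here is a minimal sketch that scores a coefficient vector by 0-1 loss plus an L0 penalty plus a small L1 penalty, then searches a user-defined set of integer coefficients exhaustively. It is an illustration, not the authors' method: the talk solves a mixed-integer program, whereas this brute-force search only works for a handful of features. The penalty weights `c0` and `eps`, the allowed values, the choice to penalize every coefficient (including the intercept), and all function names are assumptions.

```python
from itertools import product


def slim_objective(coefs, X, Y, c0=0.01, eps=0.001):
    """0-1 loss + c0 * L0 norm + eps * L1 norm (c0 and eps are hypothetical)."""
    errors = sum(
        1
        for x, y in zip(X, Y)
        if y * sum(c * xi for c, xi in zip(coefs, x)) <= 0
    )
    l0 = sum(1 for c in coefs if c != 0)  # number of non-zero coefficients
    l1 = sum(abs(c) for c in coefs)       # total coefficient magnitude
    return errors / len(X) + c0 * l0 + eps * l1


def brute_force_fit(X, Y, allowed=(-2, -1, 0, 1, 2), c0=0.01, eps=0.001):
    """Try every coefficient vector drawn from the user-defined set `allowed`."""
    n_features = len(X[0])
    return min(
        product(allowed, repeat=n_features),
        key=lambda coefs: slim_objective(coefs, X, Y, c0, eps),
    )


# Toy data: column 0 is a constant intercept term, column 1 is informative,
# column 2 is noise. Labels are in {-1, +1}.
X = [[1, 1, 0], [1, 1, 1], [1, 0, 0], [1, 0, 1]]
Y = [1, 1, -1, -1]

best = brute_force_fit(X, Y)
print(best)  # (-1, 2, 0): zero training error, noise feature dropped
```

The L0 term is what removes the noise column, and the small L1 term breaks ties in favor of small coefficients.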
9 MIP Formulation
Given N examples and P features: N + 3P variables and 2N + 4P constraints.
Solvable for N ~ … and P ~ 100 in minutes.
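As a quick arithmetic check, the counts stated on this slide can be evaluated for the two application datasets described later in the talk (mammo with N = 961, P = 11, and violentcrime with N = 558, P = 108). The function name `mip_size` is hypothetical.

```python
def mip_size(n_examples, n_features):
    """Return (variables, constraints) using the counts stated on the slide."""
    variables = n_examples + 3 * n_features
    constraints = 2 * n_examples + 4 * n_features
    return variables, constraints


print(mip_size(961, 11))   # mammo: (994, 1966)
print(mip_size(558, 108))  # violentcrime: (882, 1548)
```

Both counts grow linearly in N and P, so the number of examples dominates the problem size on datasets with few features.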
12 Performance
Goal: Compare the accuracy and sparsity of SLIM and other classification models on UCI datasets.
Models: SLIM, SVM, Random Forest, LARS Lasso, LARS Ridge, LARS Elastic Net, CART, C5.0 Tree, C5.0 Rule
Datasets: breastcancer, internetad, spambase, haberman, mammo, tictactoe
13 Accuracy vs. Sparsity for breastcancer
14 SLIM vs. LARS Lasso for breastcancer
[Figure: 5-fold CV test error versus fold-median L0 norm, for SLIM and Lasso]
15 SLIM Model for breastcancer Linear Model Scoring System
17 Overview of mammo
Predict whether a mammographic mass lesion is malignant (Class = +1).
N = 961 examples and P = 11 features:
Patient-based (e.g., Age)
Cell characteristics (e.g., Shape, Density)
18 Linear Classifiers for mammo
SLIM Test Error: 21.5%
Lasso Test Error: 22.9%
Ridge Test Error: 22.4%
19 Tree Classifiers for mammo
[Figure: decision-tree diagrams for the C5.0 Tree (Test Error: 20.0%) and SLIM (Test Error: 21.5%). Node questions: Is the margin circumscribed? Is the shape oval? Is the patient over 60 y.o.? Is the shape irregular? Leaves: benign or malignant]
21 Overview of violentcrime
Predict if a young person raised in out-of-home care will commit a violent crime over the next 3 years (Class = +1).
N = 558 examples and P = 108 features.
Imbalanced (only 19% of examples have Class = +1)
22 SLIM for Imbalanced Datasets Balanced Objective Imbalanced Objective Error Rate for Positive Outcomes Error Rate for Negative Outcomes
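The slide's formulas are not preserved in the transcript, but the surviving labels suggest that the error is split into a rate for positive outcomes and a rate for negative outcomes. The sketch below shows one common way to combine the two, a weighted sum of class-wise error rates. The weights `w_pos` and `w_neg` and the function names are hypothetical and are not taken from the talk.

```python
def class_error_rates(y_true, y_pred):
    """Return (error rate on positives, error rate on negatives).

    Labels are assumed to be in {-1, +1}, with both classes present.
    """
    pos = [(t, p) for t, p in zip(y_true, y_pred) if t == 1]
    neg = [(t, p) for t, p in zip(y_true, y_pred) if t == -1]
    pos_err = sum(1 for t, p in pos if p != t) / len(pos)
    neg_err = sum(1 for t, p in neg if p != t) / len(neg)
    return pos_err, neg_err


def weighted_error(y_true, y_pred, w_pos=0.5, w_neg=0.5):
    """Weighted sum of the two class-wise error rates (weights are assumptions)."""
    pos_err, neg_err = class_error_rates(y_true, y_pred)
    return w_pos * pos_err + w_neg * neg_err


# An imbalanced toy set: 2 positives, 8 negatives.
y_true = [1, 1, -1, -1, -1, -1, -1, -1, -1, -1]
always_negative = [-1] * 10

plain_error = sum(1 for t, p in zip(y_true, always_negative) if t != p) / len(y_true)
print(plain_error)                              # 0.2
print(weighted_error(y_true, always_negative))  # 0.5
```

Predicting the majority class everywhere looks acceptable under the plain error rate (20%) but poor under the class-weighted one (50%), which is why imbalanced data such as violentcrime needs a different objective.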
23 SLIM Performance on violentcrime
Sensitivity = (# True Positives) / (# Positive Outcomes) = 69%
Specificity = (# True Negatives) / (# Negative Outcomes) = 44%
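The two ratios on this slide can be computed directly from labels and predictions. The following is a minimal sketch with made-up toy data (not the violentcrime data), assuming labels in {-1, +1}; the function name is hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for labels in {-1, +1}."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    true_neg = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == -1)
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = sum(1 for t in y_true if t == -1)
    return true_pos / n_pos, true_neg / n_neg


# Toy example: 4 positive outcomes and 5 negative outcomes.
y_true = [1, 1, 1, 1, -1, -1, -1, -1, -1]
y_pred = [1, 1, 1, -1, -1, -1, 1, 1, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens)  # 3 of 4 positives caught: 0.75
print(spec)  # 2 of 5 negatives caught: 0.4
```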
24 SLIM Model for violentcrime
25 Conclusions
Interpretability is important when models are:
Designed to be used by humans
Used to yield insights for data mining
SLIM balances accuracy and interpretability.
26 Appendix
27 SLIM Generalization Bound
28 SLIM vs. LARS for mammo
[Figure: 5-fold CV test error versus fold-median L0 norm, for SLIM and Lasso]
29 SLIM vs. LARS for haberman
[Figure: 5-fold CV test error versus fold-median L0 norm, for SLIM and Lasso]
30 SLIM vs. LARS for internetad
[Figure: 5-fold CV test error versus fold-median L0 norm, for SLIM and Lasso]
31 SLIM vs. LARS for spambase
[Figure: 5-fold CV test error versus fold-median L0 norm, for SLIM and Lasso]
32 SLIM vs. LARS for tictactoe
[Figure: 5-fold CV test error versus fold-median L0 norm, for SLIM and Lasso]
33 Accuracy vs. Sparsity: haberman
34 Accuracy vs. Sparsity: mammo
35 Accuracy vs. Sparsity: internetad
36 Accuracy vs. Sparsity: spambase
37 Accuracy vs. Sparsity: tictactoe
38 Computational Performance for breastcancer
[Figure: three panels plotted against runtime (minutes): MIP gap, L0 norm, and 5-fold CV test error]
40 Computational Performance for haberman
[Figure: three panels plotted against runtime (minutes): MIP gap, L0 norm, and 5-fold CV test error]
41 Computational Performance for internetad
[Figure: three panels plotted against runtime (minutes): MIP gap, L0 norm, and 5-fold CV test error]
42 Computational Performance for mammo
[Figure: three panels plotted against runtime (minutes): MIP gap, L0 norm, and 5-fold CV test error]
43 Computational Performance for spambase
[Figure: three panels plotted against runtime (minutes): MIP gap, L0 norm, and 5-fold CV test error]
44 Computational Performance for tictactoe
[Figure: three panels plotted against runtime (minutes): MIP gap, L0 norm, and 5-fold CV train error]