Panel: Machine Learning in Surgery and Cancer

- Professor Dimitris Bertsimas, SM '87, PhD '88, Boeing Leaders for Global Operations Professor of Management; Professor of Operations Research; Co-Director, Operations Research Center
- Jack Dunn, PhD candidate at the MIT Operations Research Center
- George Velmahos, MD, PhD, Division Chief of Trauma, Emergency Surgery and Surgical Critical Care at MGH
- Daisy Zhuo, PhD candidate at the MIT Operations Research Center
#MITSloanHSI

An Actionable Tool for Cancer Mortality Prediction
Ying Daisy Zhuo, Operations Research Center, MIT
Joint work with Dimitris Bertsimas, Ph.D., Jack Dunn, Colin Pawlowski, John Silberholz, Ph.D., Alex Weinstein, Ph.D., Eddy Chen, M.D., and Aymen Elfiky, M.D.

Cancer Mortality: To Treat or Not to Treat?
- Many terminal cancer patients are treated aggressively, despite a high risk of short-term mortality
- Clinicians often overestimate the prognosis
- Inaccurate estimates can precipitate decisions that lead to:
  - Increased (re)hospitalization
  - Toxicities and lower quality of life
  - Higher health care costs
- An accurate prognostic model is needed for better medical decisions

Existing Models

| Model | Population | Training sample size | Covariate size | Data | Model AUC | Reference |
|---|---|---|---|---|---|---|
| Palliative Prognostic Score (PaP), exponential multiple regression | Terminal cancer | 519 | 36, selected 6 | Demographics, cancer status and treatment, symptoms, labs | Not reported | Pirovano 1999 |
| Palliative Performance Index (PPI), multiple regression | Terminal cancer | 150 | 25, selected 5 | Demographics, cancer status, symptoms | Not reported | Morita 1999 |
| Memorial Sloan Kettering Cancer Centre nomograms, accelerated failure time model | Metastatic prostate cancer after castration | 409 | 7 total | Demographics, symptoms, labs | Not reported | Smaletz 2002 |
| Cancer Prognostic Score, Cox regression model | Terminal cancer | 356 | 26, selected 8 | Demographics, cancer status and treatment, symptoms | Not reported | Chuang 2004 |
| Intra-hospital Cancer Mortality Risk Model (ICMRM), multivariate logistic regression | Hospitalized cancer patients | 334 | 14, selected 5 | Demographics, cancer status and treatment, symptoms, labs | 0.82 (intra-hospital mortality) | Bozcuk 2004 |
| Glasgow Prognostic Score (GPS), Cox regression model | Inoperable non-small-cell lung cancer | 161 | 10 total | Demographics, cancer status and treatment, symptoms, labs | Not reported | Forrest 2003 |
| Objective Prognostic Score (OPS), Cox regression model | Terminal cancer | 209 | 17, selected 7 | Demographics, cancer status and treatment, symptoms, labs | Not reported | Suh 2009 |
| Artificial neural networks | Non-small-cell lung cancer | 440 | Thousands of variables, selected 5 | Demographics, cancer status and treatment, gene expression | Not reported | Chen 2014 |
| Support vector machine ensemble | All cancers | 869 | 18 total | Demographics, cancer status (including receptor status for breast cancer patients) | 0.76 (2-year survival), 0.80 (1-year survival), 0.87 (6-month survival) | Gupta 2014 |
| Bayesian network | Breast cancer | 78 | 7 clinical, 232 genes | Demographics, cancer status, gene expression | 0.85 (predicting good prognosis) | Gevaert 2006 |
| Artificial neural networks | Breast and colorectal cancers | 5,169 (breast), 5,007 (colorectal) | 21 (breast), 32 (colorectal) | Demographics, cancer status and treatment, symptoms | 0.78 / 0.87 (5-year survival: breast / colorectal) | Burke 1997 |
| Graph-based semi-supervised learning | Breast cancer and others | 40,000 | 16 total | Demographics, cancer status and treatment | 0.81 (5-year survival) | Kim 2013 |
| Decision trees and others | Breast cancer | 202,932 | 16 total | Demographics, cancer status and treatment | Not reported | Delen 2005 |

Issues
- Data:
  - Small patient populations
  - Large-scale registry data, but no detailed patient information
- Methods:
  - ML methods not suitable for large-scale data
  - Uninterpretable, with no clinical meaning to the outputs

Our Approach
Data
- EHR of more than 23,000 patients at Dana-Farber and Brigham and Women's from 2004 to 2014
- Predict 60-, 90-, and 180-day mortality
- 401 detailed variables on:
  - Demographics
  - Medical history
  - Treatment history
  - Lab tests
  - Genetic mutations
- Novel longitudinal modeling: change in weight, etc.
Method
- Data preparation: Optimal Missing Data Imputation*
  - Demonstrated improvements in imputation quality and downstream tasks
- Predictions: Optimal Classification Trees
  - Highly accurate
  - Interpretable
  - Facilitate discussion with oncologists
- Comparisons against:
  - Logistic regression
  - Regularized logistic regression
  - CART
  - Gradient boosted trees
* Bertsimas, Pawlowski, and Zhuo. From Predictive Methods to Missing Data Imputation: An Optimization Approach. In revision for submission.
Bertsimas and Dunn. Optimal Classification Trees. Machine Learning, 2017: 1-44.
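
The prediction targets and the longitudinal weight feature described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the 90-day lookback window, and the treatment of patients with no recorded death as survivors are all assumptions made for the example.

```python
from datetime import date

HORIZONS = (60, 90, 180)  # prediction horizons in days, as listed on the slide


def mortality_labels(index_date, death_date):
    """Return {horizon: 0 or 1} for death within each horizon of index_date.

    Assumption: death_date is None when no death is recorded, and such
    patients are labeled as survivors. A real study must handle censoring.
    """
    if death_date is None:
        return {h: 0 for h in HORIZONS}
    days = (death_date - index_date).days
    return {h: int(0 <= days <= h) for h in HORIZONS}


def relative_weight_change(measurements, index_date, lookback_days=90):
    """Fractional weight change over the lookback window ending at index_date.

    measurements: list of (date, weight_kg) tuples in any order.
    Returns None if fewer than two measurements fall inside the window.
    """
    window = sorted(
        (d, w) for d, w in measurements
        if 0 <= (index_date - d).days <= lookback_days
    )
    if len(window) < 2:
        return None
    first, last = window[0][1], window[-1][1]
    return (last - first) / first
```

For example, a patient with an index date of 2014-01-01 who died on 2014-03-15 (73 days later) is a negative for the 60-day target and a positive for the 90- and 180-day targets.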

Optimal Classification Trees
Decision trees are a good fit for this health care application:
- Interpretable and transparent
- Treatment guidelines are structured in the format of decision trees
Figure. Example treatment pathway for invasive breast cancer. NCCN Guideline Version 2.2017

Optimal Classification Trees
Decision trees are a good fit for this health care application:
- Interpretable and transparent
- Treatment guidelines are structured in the format of decision trees
- Capture non-linear relationships across variables
Figure. Mortality rates by weight change groups.

Optimal Classification Trees
- CART's greedy, one-step growing procedure has a drawback: each split is only locally optimal, so the overall tree can be far from optimal
- What if we could solve the entire decision problem at once to find globally optimal trees instead?
- To date, no globally optimal decision tree method has been tractable and scalable to typical problem sizes
- Recent work by two co-authors provides a solution: Bertsimas and Dunn. Optimal Classification Trees. Machine Learning, 2017: 1-44.
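
The gap between locally optimal splits and a globally optimal tree can be seen on a toy problem. The sketch below is only an illustration under stated assumptions: the data are made up, splits are scored by misclassification count rather than Gini impurity, and the "optimal" tree is found by brute-force enumeration of depth-2 trees, not by the optimization formulation of Bertsimas and Dunn.

```python
# Toy data: label = x1 XOR x2; x3 is a noisy copy of the label (wrong twice).
# Each row is ((x1, x2, x3), label).
DATA = [
    ((0, 0, 0), 0), ((0, 0, 1), 0),
    ((0, 1, 1), 1), ((0, 1, 0), 1),
    ((1, 0, 1), 1), ((1, 0, 1), 1),
    ((1, 1, 0), 0), ((1, 1, 0), 0),
]
N_FEATURES = 3


def leaf_errors(rows):
    """Misclassifications when a leaf predicts its majority label."""
    ones = sum(y for _, y in rows)
    return min(ones, len(rows) - ones)


def split(rows, f):
    """Partition rows on binary feature f."""
    left = [r for r in rows if r[0][f] == 0]
    right = [r for r in rows if r[0][f] == 1]
    return left, right


def stump_errors(rows, f):
    """Errors after a single split on feature f."""
    left, right = split(rows, f)
    return leaf_errors(left) + leaf_errors(right)


def best_stump_errors(rows):
    """Fewest errors achievable with at most one split of rows."""
    candidates = [stump_errors(rows, f) for f in range(N_FEATURES)]
    return min([leaf_errors(rows)] + candidates)


def greedy_depth2_errors(rows):
    """Greedy: commit to the locally best root split, then split each child."""
    root = min(range(N_FEATURES), key=lambda f: stump_errors(rows, f))
    left, right = split(rows, root)
    return best_stump_errors(left) + best_stump_errors(right)


def optimal_depth2_errors(rows):
    """Exhaustive: try every root feature with its best child splits."""
    best = len(rows)
    for root in range(N_FEATURES):
        left, right = split(rows, root)
        best = min(best, best_stump_errors(left) + best_stump_errors(right))
    return best


print(greedy_depth2_errors(DATA), optimal_depth2_errors(DATA))
```

The greedy procedure picks x3 at the root because it looks best in isolation, and then cannot recover. Splitting on x1 first looks useless on its own but allows a perfect depth-2 tree. Enumeration like this does not scale beyond toy sizes, which is the tractability problem the slide refers to.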

Optimal Classification Trees
Optimal Classification Trees significantly improve on CART and are competitive with random forests and XGBoost on many real-world examples.
Figure. Out-of-sample accuracy across 60 real-world datasets.

Results

Tree for 60-Day Mortality, Breast Cancer
Tool available at: https://stuff.mit.edu/~zhuo/tree_vis/index.html

Questionnaire for Physicians
Tool available at: https://stuff.mit.edu/~zhuo/tree_vis/index.html

Model Comparisons
Optimal Classification Trees achieved one of the highest performances in mortality prediction compared with other state-of-the-art methods.

Accuracy

| Model | 60-day mortality | 90-day mortality | 180-day mortality |
|---|---|---|---|
| Logistic regression (fewer predictors) | 94.6% | 93.0% | 83.4% |
| Logistic regression | 94.3% | 92.8% | 84.5% |
| Regularized logistic regression | 94.9% | 93.1% | 84.5% |
| CART decision tree | 93.6% | 92.1% | 85.0% |
| Optimal Classification Trees | 94.9% | 93.3% | 86.1% |
| Gradient boosted trees | 94.9% | 93.6% | 87.2% |

AUC

| Model | 60-day mortality | 90-day mortality | 180-day mortality |
|---|---|---|---|
| Logistic regression (fewer predictors) | 0.74 | 0.76 | 0.76 |
| Logistic regression | 0.73 | 0.74 | 0.75 |
| Regularized logistic regression | 0.79 | 0.80 | 0.80 |
| CART decision tree | 0.82 | 0.82 | 0.80 |
| Optimal Classification Trees | 0.86 | 0.84 | 0.83 |
| Gradient boosted trees | 0.90 | 0.89 | 0.87 |
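
The comparison reports both accuracy and AUC, and the two can diverge when the outcome is rare. The sketch below computes both from scratch on synthetic labels; the 5% event rate and the function names are assumptions for illustration and are not taken from the study data.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases where the thresholded score matches the label."""
    preds = [int(s >= threshold) for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def auc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as one half. This is the pairwise (Mann-Whitney) form of
    the area under the ROC curve; it is quadratic in the sample size.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic cohort: 95 survivors and 5 deaths, with a model that gives
# every patient the same low score, i.e. it always predicts survival.
labels = [0] * 95 + [1] * 5
constant_scores = [0.0] * 100
print(accuracy(labels, constant_scores), auc(labels, constant_scores))
```

A model that always predicts survival reaches 95% accuracy on this synthetic cohort but has an AUC of 0.5, which is one reason a ranking measure such as AUC can separate models whose accuracies look similar.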

Summary
We built an actionable tool for cancer mortality prediction. It makes a significant contribution to clinical oncology, as it is:
- Personalized and specific
- Interpretable and clinically meaningful
- Evidence-based and data-driven
- Actionable
- Validated and accurate
- Based on state-of-the-art machine learning

Machine Learning for Emergency Surgery
Jack Dunn, PhD student at the MIT Operations Research Center

Overview
- The diverse nature of patients makes it difficult to predict risks arising from emergency surgery
- Existing methods for risk prediction rely on subjective data, are not interpretable, or have low accuracy
- Goal: Train highly accurate and interpretable predictive models using EHR data
- Our approach: Build Optimal Tree models to predict risks of mortality and 18 other surgical complications
  - Significantly higher prediction quality than existing methods
  - Highly interpretable: presented as an interactive application for physicians

Estimating Risk of Emergency Surgery
- Over 130 million emergency department visits in the US annually
- 27 million hospital admissions related to emergency surgery from 2001 to 2010
- Emergency surgery carries a heightened risk of death and other complications
- Knowing the risk of post-surgery complications is critical for decision making by physicians, patients, and families
- Understanding the risk going into a procedure and comparing it to the actual outcome permits evaluation of hospitals, departments, and physicians

The Current State of the Art
- Many scoring mechanisms for estimating risk based on a patient's demographic information, lab results, and co-morbidities
- Mostly based on logistic regression models
- The most comprehensive approach is the ACS-NSQIP Surgical Risk Calculator:
  - Constructed using the ACS-NSQIP database of surgery patients
  - Input all patient information and get probabilities of each complication

ACS-NSQIP Risk Calculator

Limitations of the ACS Calculator
- Black-box model: can't see how it works or understand the estimates
- One-size-fits-all:
  - Every factor is needed to make a prediction
  - Each factor contributes to the risk additively
- Difficult to use:
  - Must specify the exact procedure code (often unknown beforehand)
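
What "each factor contributes additively" means can be made concrete with a small logistic model, the model family the earlier slide says most risk scores use. This is a generic sketch and not the ACS calculator itself: the factor names and every coefficient are hypothetical.

```python
import math

# Hypothetical coefficients on the log-odds scale; not from any real calculator.
INTERCEPT = -4.0
COEFFICIENTS = {"age_over_70": 1.0, "sepsis": 1.5, "dependent_status": 0.8}


def additive_risk(patient):
    """Logistic model: log-odds = intercept + sum of per-factor terms.

    Raises KeyError if any factor is missing, because an additive score
    needs a value for every input before it can produce a prediction.
    """
    log_odds = INTERCEPT + sum(
        COEFFICIENTS[name] * patient[name] for name in COEFFICIENTS
    )
    return 1.0 / (1.0 + math.exp(-log_odds))


def log_odds_shift(patient, factor):
    """Change in log-odds from switching one factor from 0 to 1."""
    def logit(p):
        return math.log(p / (1.0 - p))

    off = dict(patient, **{factor: 0})
    on = dict(patient, **{factor: 1})
    return logit(additive_risk(on)) - logit(additive_risk(off))


young = {"age_over_70": 0, "sepsis": 0, "dependent_status": 0}
old = {"age_over_70": 1, "sepsis": 0, "dependent_status": 1}
print(log_odds_shift(young, "sepsis"), log_odds_shift(old, "sepsis"))
```

In this sketch the log-odds shift from sepsis is the same for both patients. A decision tree has no such constraint: the effect of a factor can differ from one branch to another, and factors that do not appear on a patient's path are never needed.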

Our Approach
- Develop a model for predicting the risks of emergency surgery that:
  - Provides better estimates than the ACS calculator
  - Is interpretable and understandable
  - Gives personalized predictions for each patient
  - Is easy for physicians to use
- Use Optimal Trees, a novel method for constructing decision trees with state-of-the-art performance whilst maintaining interpretability

Example Optimal Tree for Mortality

Performance of Optimal Trees
- Risk of mortality: AUC of 0.916, compared to 0.898 for the ACS Calculator
- Risk of any complication: AUC of 0.841, compared to 0.806 for the ACS Calculator
- Risks of individual complications: AUCs as high as 0.934

Actionable Tool for Physicians
- Our models are interpretable and deliver state-of-the-art performance, but they still need to be communicated to physicians
- We designed and built an application for use by physicians:
  - Interactive questionnaire that calculates risk
  - Questions adapt to previous answers as they go down the tree
  - Integrates with the EHR system to answer questions where possible
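
A questionnaire that adapts to earlier answers amounts to walking the tree from the root and asking only about the variables on the path taken. The sketch below shows one way this could work; it is not the application shown in the talk, and the tree, field names, thresholds, and risk values are invented for illustration.

```python
# A node is either a leaf {"risk": float} or an internal node with a
# question, an EHR field name, a threshold, and "low"/"high" children.
# Every field, threshold, and risk value here is made up.
TREE = {
    "question": "Age in years?",
    "field": "age",
    "threshold": 65,
    "low": {"risk": 0.02},
    "high": {
        "question": "Serum creatinine (mg/dL)?",
        "field": "creatinine",
        "threshold": 1.5,
        "low": {"risk": 0.08},
        "high": {"risk": 0.30},
    },
}


def predict_risk(tree, ehr, ask):
    """Walk the tree, using EHR values when present and asking otherwise.

    ehr: dict of values already known from the record.
    ask: callable taking a question string and returning a number.
    Returns (risk, list of questions that actually had to be asked).
    """
    asked = []
    node = tree
    while "risk" not in node:
        field = node["field"]
        if field in ehr:
            value = ehr[field]
        else:
            value = ask(node["question"])
            asked.append(node["question"])
        node = node["low"] if value < node["threshold"] else node["high"]
    return node["risk"], asked


# Age comes from the record, so only creatinine has to be asked.
risk, asked = predict_risk(TREE, {"age": 72}, ask=lambda question: 2.0)
print(risk, asked)
```

In this toy tree a patient under 65 reaches a leaf after a single question, and nothing is asked at all when the record already holds the age.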

Interactive Application

Key Takeaways
- Optimal Trees is a general-purpose machine learning algorithm that is both accurate and interpretable
- Applying OCT to predict risks of emergency surgery improves significantly on the state of the art
- Most importantly, the model and predictions are understandable and interpretable:
  - Allows delivery of risk predictions as an application suitable for everyday use by physicians