Predicting Non-Small Cell Lung Cancer Diagnosis and Prognosis by Fully Automated Microscopic Pathology Image Features


Kun-Hsing Yu, MD, PhD
Department of Biomedical Informatics, Harvard Medical School
November 5th, 2017

Non-small Cell Lung Cancer
Heavy disease burden
  85% of lung cancer
  >2.1M new cases/year
  No. 1 cause of cancer-related deaths
  >1M deaths/year
  2 deaths/minute
Diverse clinical outcome
  Same histopathology-defined subtypes → different survival outcomes
Jemal A et al. CA Cancer J Clin. 2011 Mar-Apr;61(2):69-90.
Bianchi F et al. J Clin Invest. 2007 Nov;117(11):3436-3444.

Histopathology
Definitive diagnosis of many complex diseases
Performed by trained pathologists
Defines disease types, but can be subjective
Automated image processing pipelines enable the extraction of objective features
Beck AH et al. Sci Transl Med. 2011 Nov 9;3(108):108ra113.
National Institutes of Health

Extracting Nuclei / Cytoplasm Features
Features: Area, Compactness, Eccentricity, Major/minor axis length, Perimeter, Pixel intensity distribution, Haralick texture features, Nucleus/cytoplasm ratio, etc.
  x
Statistics: Mean, Median, Percentiles, Variance
Total: 9,879 features
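
The "features x statistics" layout above can be illustrated with a minimal sketch: per-object measurements are computed for each segmented nucleus, then collapsed into image-level summary statistics. The function names, the compactness and eccentricity definitions, and the example measurements are assumptions for illustration; the slides do not specify the formulas or the segmentation software.

```python
import math
import statistics

def shape_features(area, perimeter, major_axis, minor_axis):
    """Per-object shape descriptors (definitions are assumptions, not from the slides)."""
    return {
        "area": area,
        "perimeter": perimeter,
        # One common convention: 1.0 for a perfect circle, smaller for irregular shapes
        "compactness": 4 * math.pi * area / perimeter ** 2,
        # Ellipse eccentricity: 0.0 when both axes are equal
        "eccentricity": math.sqrt(1 - (minor_axis / major_axis) ** 2),
    }

def percentile(values, q):
    """Linear-interpolation percentile, q in [0, 100]."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)

def summarize(objects):
    """Collapse per-object features into image-level statistics."""
    summary = {}
    for name in objects[0]:
        vals = [o[name] for o in objects]
        summary[name + "_mean"] = statistics.fmean(vals)
        summary[name + "_median"] = statistics.median(vals)
        summary[name + "_variance"] = statistics.pvariance(vals)
        for q in (25, 75):
            summary[f"{name}_p{q}"] = percentile(vals, q)
    return summary

# Hypothetical nuclei: (area, perimeter, major axis, minor axis) in pixels
measurements = [(100.0, 40.0, 12.0, 10.0), (150.0, 50.0, 16.0, 12.0), (80.0, 36.0, 11.0, 9.0)]
nuclei = [shape_features(*m) for m in measurements]
image_features = summarize(nuclei)
```

Each base feature yields several summary columns, which is how a modest list of measurements can grow into thousands of image-level features.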

Machine Learning Methods
Supervised learning
  Decision trees
  Naïve Bayes (NB) classifiers
  Support vector machines (SVM)
  Ensemble: random forests
  Elastic net-Cox survival models
Feature selection
  Information content measurements
Yu KH et al. American Medical Informatics Association 2013 Annual Symposium.
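
As a rough illustration of feature selection by an information content measurement, the sketch below ranks features by information gain ratio, the criterion named on the later results slides. It assumes continuous features have already been discretized into bins; the function names and the toy data are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of discrete values."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(feature_bins, labels):
    """Information gain of a discretized feature divided by its split information."""
    n = len(labels)
    groups = {}
    for b, y in zip(feature_bins, labels):
        groups.setdefault(b, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature_bins)
    # A feature with a single bin carries no information
    return info_gain / split_info if split_info > 0 else 0.0

def top_k_features(columns, labels, k):
    """Rank feature columns (name -> list of bins) by gain ratio and keep the best k."""
    ranked = sorted(columns, key=lambda name: gain_ratio(columns[name], labels), reverse=True)
    return ranked[:k]

# Toy example with hypothetical binned features
labels = ["LUAD", "LUAD", "LUSC", "LUSC"]
columns = {
    "nuclear_texture": ["lo", "lo", "hi", "hi"],  # separates the classes perfectly
    "cell_area": ["lo", "hi", "lo", "hi"],        # uninformative
}
selected = top_k_features(columns, labels, k=1)
```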

Evaluations
Cross-validation
  Parameter estimation
Evaluate by independent test sets
  Held-out datasets from TCGA
  An external validation set from the Stanford Tissue Microarray (TMA) Database
Yu KH et al. Nat Commun. 2016 Aug 16;7:12474.
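
The evaluation scheme above can be sketched as follows: parameters are chosen by k-fold cross-validation on the training data only, leaving the held-out and external sets untouched until the final evaluation. This is a generic sketch; the fold count, the helper names, and the `score_fn` callback are assumptions, not details taken from the slides.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, validation

def select_parameter(candidates, score_fn, n_samples, k=5):
    """Return the candidate with the best mean validation score.

    score_fn(param, train_idx, val_idx) is supplied by the caller and should
    fit a model on train_idx and return its score on val_idx.
    """
    def mean_score(param):
        scores = [score_fn(param, tr, va) for tr, va in k_fold_indices(n_samples, k)]
        return sum(scores) / len(scores)
    return max(candidates, key=mean_score)
```

Only after the parameter is fixed would the model be refit on all training data and scored once on the independent test sets.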

Examining the Utility of the Extracted Features
Diagnosis classification
  Histopathology features define cancer types
  A useful set of features should be able to recapitulate diagnostic patterns
Pathology evaluation is laborious and subjective
  κ=0.55-0.59 for classifying LUAD and LUSC [1]
[1] Grilley-Olson JE, et al. Archives of Pathology & Laboratory Medicine 137, 32-40 (2013)
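
For readers unfamiliar with the κ statistic quoted above, here is a minimal sketch of Cohen's kappa for two raters. It measures agreement beyond what chance alone would produce. The example readings are invented for illustration and are not data from the cited study.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    # Undefined when expected == 1 (both raters always give the same single label)
    return (observed - expected) / (1 - expected)

# Hypothetical readings from two pathologists on ten slides
pathologist_1 = ["LUAD"] * 5 + ["LUSC"] * 5
pathologist_2 = ["LUAD"] * 4 + ["LUSC"] * 5 + ["LUAD"]
kappa = cohens_kappa(pathologist_1, pathologist_2)
```

In this toy case the raters agree on 8 of 10 slides, yet kappa is only about 0.6 because half of that agreement would be expected by chance.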

Fully Automated Image Features Identified Images with Malignant Cells
With 80 features selected by information gain ratio
[Figure: ROC curves, Sensitivity versus 1 - Specificity]
  (A) LUAD versus Benign: AUC=0.73-0.85
  (B) LUSC versus Benign: AUC=0.77-0.88
  Classifiers: Bagging, Naïve Bayes, Random Forest, Random Forest with CITs, SVMs with Gaussian Kernel, SVMs with Linear Kernel, SVMs with Polynomial Kernel
Top features: radial distribution of nuclei pixels, textures (pixel correlations, intensity variance) of the nuclei
Yu KH et al. Nat Commun. 2016 Aug 16;7:12474.
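
The AUC values reported above summarize each ROC curve as a single number. A minimal sketch of the rank-based definition follows: AUC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as half. The scores and labels are made up for illustration.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (labels: 1 = positive, 0 = negative)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores for four images (1 = malignant, 0 = benign)
auc = roc_auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
```

The pairwise form is quadratic in the number of samples, which is fine for a sketch; production code would typically sort once and use ranks.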

Image Features Distinguished the Two Types of Lung Malignancy
With 240 features selected by information gain ratio
[Figure: ROC curves, Sensitivity versus 1 - Specificity]
  (A) TCGA dataset: AUC 0.7
    Classifiers: Bagging, Naïve Bayes, Random Forest, Random Forest with CITs, SVMs with Gaussian Kernel, SVMs with Linear Kernel, SVMs with Polynomial Kernel
  (B) TMA dataset: AUC=0.73-0.85
    Classifiers: Bagging, Conditional Inference Trees, Naïve Bayes, Random Forest, Random Forest with CITs, SVMs with Gaussian Kernel, SVMs with Linear Kernel, SVMs with Polynomial Kernel, SVMs with Sigmoid Kernel
Top features: intensity distribution in the nuclei, textures of the nuclei
Yu KH et al. Nat Commun. 2016 Aug 16;7:12474.

Stage and Grade are Insufficient to Predict Adenocarcinoma Patient Survival
[Figure: survival curves, Probability of Survival versus Months (0-200)]
  (A) Survival stratified by stage (Stage I, II, III, IV): P<0.01
  (B) Stage I patient survival stratified by grade (Grade 1, 1-2, 2, 2-3, 3): P=0.06
Great diversity in Stage I patient survival
Pathology grade did NOT significantly correlate with survival
Yu KH et al. Nat Commun. 2016 Aug 16;7:12474.
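
Survival curves of the kind shown in these figures are conventionally drawn with the Kaplan-Meier estimator; the slides do not name the estimator, so that is an assumption here. A minimal sketch, with invented follow-up data:

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times: follow-up in months; events: 1 = death observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

# Hypothetical cohort of five patients
curve = kaplan_meier([5, 10, 10, 20, 30], [1, 1, 0, 1, 0])
```

Censored patients leave the risk set without lowering the curve, which is why the estimate only steps down at observed deaths.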

Image Features Predicted Prognosis in Stage I Adenocarcinoma Patients
[Figure: survival curves for predicted prognostic groups (Longer-term Survivors, Shorter-term Survivors), Probability of Survival versus Months]
  (A) Image features predicted the prognosis of TCGA stage I patients (0-200 months): P=23 (sic)
  (B) Validated in TMA (0-150 months): P=0.028
Quantitative features predicted survival, validated in TMA
Top features: Zernike shape features of the nuclei, intensity distribution in the cytoplasm
Yu KH et al. Nat Commun. 2016 Aug 16;7:12474.
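
P values comparing two survival curves are commonly obtained from a log-rank test. The slides do not state which test was used, so the sketch below is an assumption about the general approach rather than a reproduction of the analysis. The follow-up times are invented.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value with 1 d.f.).

    times_* are lists of follow-up times; events_* are 1 (death) or 0 (censored).
    """
    all_times = times_a + times_b
    all_events = events_a + events_b
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(x >= t for x in times_a)  # at risk in group A just before t
        n_b = sum(x >= t for x in times_b)
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with one degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical groups: shorter-term versus longer-term survivors, all deaths observed
chi2, p = logrank_test([1, 2, 3, 4, 5], [1] * 5, [10, 11, 12, 13, 14], [1] * 5)
```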

Stage and Grade are Insufficient to Predict LUSC Patient Survival
[Figure: survival curves, Probability of Survival versus Months (0-150)]
  (A) Survival stratified by stage (Stage I, II, III, IV): P=0.191
  (B) Stage I patient survival stratified by grade (Grade 1, 1-2, 2, 2-3, 3, 3-4, 4): P=0.847
Neither pathology stage nor grade was significantly associated with survival
Yu KH et al. Nat Commun. 2016 Aug 16;7:12474.

Image Features Predicted Prognosis in Squamous Cell Carcinoma Patients
[Figure: survival curves for predicted prognostic groups (Longer-term Survivors, Shorter-term Survivors), Probability of Survival versus Months (0-150)]
  (A) Image features predicted prognosis of TCGA patients: P=0.023
  (B) Validated in TMA: P=0.035
Quantitative features predicted survival, validated in TMA
Top features: Zernike shape features and textures of the nuclei
Yu KH et al. Nat Commun. 2016 Aug 16;7:12474.
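
The methods slide lists elastic net-Cox survival models. As a hedged sketch of what such a model optimizes, the code below evaluates the negative Cox partial log-likelihood plus an elastic net penalty, then splits patients into two prognostic groups. The penalty parameterization (`lam`, `alpha`), the handling of tied event times, and the median cutoff are assumptions; the slides do not give these details.

```python
import math
import statistics

def cox_elastic_net_loss(beta, X, times, events, lam, alpha):
    """Negative Cox partial log-likelihood plus an elastic net penalty.

    beta: coefficients; X: list of feature rows; events: 1 = death, 0 = censored.
    Penalty assumed as lam * (alpha * L1 + (1 - alpha) / 2 * L2 squared).
    """
    risk = [sum(b * x for b, x in zip(beta, row)) for row in X]
    neg_log_lik = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if e_i:
            # Risk set: everyone still under observation at time t_i
            denom = sum(math.exp(risk[j]) for j, t_j in enumerate(times) if t_j >= t_i)
            neg_log_lik -= risk[i] - math.log(denom)
    l1 = sum(abs(b) for b in beta)
    l2 = sum(b * b for b in beta)
    return neg_log_lik + lam * (alpha * l1 + (1 - alpha) / 2 * l2)

def prognostic_groups(risk_scores):
    """Label patients above the median risk score as shorter-term survivors (assumed cutoff)."""
    cutoff = statistics.median(risk_scores)
    return ["shorter-term" if r > cutoff else "longer-term" for r in risk_scores]

# Toy data: one feature, three patients, all deaths observed
X = [[1.0], [2.0], [3.0]]
baseline = cox_elastic_net_loss([0.0], X, [1, 2, 3], [1, 1, 1], lam=0.0, alpha=0.5)
groups = prognostic_groups([0.1, 0.9, 0.5])
```

The L1 term drives many coefficients to exactly zero, which suits a setting with thousands of candidate image features and comparatively few patients.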

Summary
Developed a fully automated algorithm to extract quantitative features from histopathology images
Demonstrated the utility of texture and shape features in prognosis prediction

Acknowledgements
Zak Lab: Isaac S. Kohane, MD, PhD; Nathan Palmer, PhD; Arjun Manrai, PhD; William Yuan, MS; Oren Miron, MS; Sam Finlayson, MS; Vincent Hu, BS; Samantha Lemos, BA
Snyder Lab: Michael Snyder, PhD; Jingjing Li, PhD; Collin Melton, PhD; Konrad Karczewski, PhD
Altman Lab: Russ B. Altman, MD, PhD; Bethany Percha, PhD; Weizhuang Zhou, MS; Emily Mallory, MS
Ré Lab: Christopher Ré, PhD; Ce Zhang, PhD; Feiran Wang, MS
Collaborators: Daniel Rubin, MD, MS; Gerald Berry, MD; Matt van de Rijn, MD, PhD
Funding: Harvard Data Science Fellowship; Howard Hughes Medical Institute (HHMI) International Student Research Fellowship; Stanford Graduate Fellowship (SGF)

Thank you.
khyu@stanford.edu