Deep Learning Analytics for Predicting Prognosis of Acute Myeloid Leukemia with Cytogenetics, Age, and Mutations

Andy Nguyen, M.D., M.S.
Medical Director, Hematopathology, Hematology and Coagulation Laboratory, Memorial Hermann Laboratory
Professor of Pathology and Laboratory Medicine, University of Texas-Houston, Medical School
GeneMed, Feb 2018

Outline of talk
- Define the application of the Deep Learning method, a technological breakthrough, to big-data analytics
- Describe our study using a Deep Learning algorithm to predict prognosis for acute myeloid leukemia (AML) using cytogenetics, age, and mutations
Financial Disclosures: No relevant financial relationships with commercial interests to disclose

Deep learning and genomic medicine
- Big companies are analyzing large volumes of data for business analysis and decisions using Deep Learning technology (Google's search engine, Google Photos, IBM's Watson, and automobile companies' self-driving cars).
- The application of deep learning to genomic medicine has had a promising start; it could impact personalized diagnostics and treatment.
- The genotype-phenotype divide, our inability to connect genetics to disease phenotypes, is preventing genomics from advancing medicine to its potential.
- This gap calls for Deep Learning, a more recent type of machine learning.
- Deep learning can bridge the genotype-phenotype divide by incorporating an exponentially growing amount of data and accounting for the multiple layers of complex biological processes that relate the genotype to the phenotype.

Machine Learning
- Machine learning explores the construction of algorithms that can learn from and make predictions on data, i.e. it gives software the ability to learn without being explicitly programmed.
- Numerous machine learning methods: decision tree, cluster analysis, support vector machine, random forest, Bayesian, regression analysis, neural network.
- Neural network (inspired by biological neural networks): artificial nodes ("neurons") are connected together to form a network for prediction/classification tasks. Each node either fires or does not fire.
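The fire / not-fire behavior of a single node can be illustrated with a minimal Python sketch. The weights and bias below are hypothetical values picked by hand so that the node acts like a logical AND; they are not taken from the talk, and a trained network would learn such values from data.

```python
def neuron_fires(inputs, weights, bias):
    """Return True ("fire") when the weighted sum of the inputs plus the bias is positive."""
    activation = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation > 0


# Hypothetical hand-picked parameters: the node fires only when both inputs are 1.
and_weights = [1.0, 1.0]
and_bias = -1.5

for a in (0, 1):
    for b in (0, 1):
        state = "fire" if neuron_fires([a, b], and_weights, and_bias) else "not fire"
        print(a, b, state)
```

Connecting many such nodes in layers, and replacing the hard threshold with a smooth function, gives the networks discussed in the rest of the talk.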

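Early Generation of Neural Networks with Supervised Training
(the model is trained with known or labeled outcomes)
[Figure: network with positive/negative outputs; calculating connection weights]
3 types of data are used to build the model:
- Training
- Validation
- Testing
The finished model is then applied to new data (new cases).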
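A minimal Python sketch of dividing labeled cases into training, validation, and testing sets. The 70/15/15 proportions, the fixed seed, and the function name are assumptions made for illustration; the slide does not state how the data were divided.

```python
import random


def split_cases(case_ids, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle case ids and split them into training, validation, and testing lists.

    The fractions are illustrative assumptions, not values from the talk.
    """
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * train_frac)
    n_val = round(len(ids) * val_frac)
    train = ids[:n_train]
    validation = ids[n_train:n_train + n_val]
    testing = ids[n_train + n_val:]
    return train, validation, testing


train, validation, testing = split_cases(range(100))
print(len(train), len(validation), len(testing))
```

The training set fits the connection weights, the validation set guides choices such as when to stop training, and the testing set gives an estimate of performance on cases the network has not seen.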

Disadvantages of Early Networks
- They rely only on labeled data (known outcomes) for training. However, labeled data is often limited, so for many problems it is difficult to get enough examples to fit the parameters of a complex model.
- Given the high expressive power of deep networks, training on insufficient data also results in overfitting.
- Training a network with multiple hidden layers using supervised learning alone:
  (1) Parameters often do not converge, i.e. training gets stuck in local minima
  (2) The model does not scale well (diffusion of gradients causes poor learning in the earlier hidden layers)
[Figure: error vs. iterations; convergence to minimum error compared with no convergence at a local minimum]
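The diffusion of gradients can be made concrete with a small Python calculation, assuming sigmoid activations (the slide does not name the activation function). The derivative of the sigmoid is at most 0.25, so the error signal passed back through n sigmoid layers is scaled by at most 0.25 per layer from the activations alone. Weights of magnitude 1 are assumed here to isolate that effect.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_derivative(z):
    s = sigmoid(z)
    return s * (1.0 - s)


# The derivative peaks at z = 0, where it equals 0.25.
peak = sigmoid_derivative(0.0)

# Largest possible scaling of the gradient after passing back through n sigmoid layers
# (assumption: weights of magnitude 1, so only the activations shrink the signal).
for n_layers in (1, 3, 5, 10):
    print(n_layers, peak ** n_layers)
```

With ten layers the factor is already below one in a million, which is one way to see why the earliest hidden layers of a purely supervised deep network learn slowly.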

Deep Learning (3rd Gen Neural Network)
- A major breakthrough in 2006: Hinton (U of Toronto) showed that deep networks can be trained effectively with layer-wise unsupervised pre-training.
- In 2012, Hinton's group won a contest held by Merck to identify molecules that could lead to new drugs. The group used deep learning to zero in on the molecules most likely to bind to their targets.
- Deep Learning algorithms:
  (1) Unsupervised learning allows a network to be fed with raw data (no known outcomes) and to automatically discover the representations needed for detection or classification
  (2) They extract high-level, complex data representations through multiple layers and avoid the problems of last-generation networks (previous slide)
- Supporting hardware: multiple graphics processing units (GPUs)

A Deep Learning Neural Network to Detect Images: Extracting Higher-Level Features with Unsupervised Learning
Feature extraction:
- Each hidden layer applies a nonlinear transformation to its input, producing a higher-level representation as its output.
- Multiple levels of abstraction of the image: from pixels to complex shapes and objects defining a human face
- The deep learning process works similarly for non-visual objects
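A minimal Python sketch of the layer-by-layer transformation described above, assuming sigmoid hidden nodes. The layer sizes, weights, and input values are hypothetical and chosen only to show the data flow; in a real network the weights are learned.

```python
import math


def dense_layer(inputs, weights, biases):
    """One hidden layer: a weighted sum per node followed by a sigmoid nonlinearity."""
    outputs = []
    for node_weights, bias in zip(weights, biases):
        z = sum(x * w for x, w in zip(inputs, node_weights)) + bias
        outputs.append(1.0 / (1.0 + math.exp(-z)))
    return outputs


# Hypothetical weights: 4 inputs -> 3 hidden nodes -> 2 hidden nodes.
layer1_w = [[0.5, -0.2, 0.1, 0.4], [-0.3, 0.8, 0.2, -0.1], [0.6, 0.1, -0.5, 0.3]]
layer1_b = [0.0, 0.1, -0.1]
layer2_w = [[1.0, -1.0, 0.5], [-0.5, 0.7, 0.2]]
layer2_b = [0.0, 0.0]

raw_input = [1.0, 0.0, 1.0, 0.0]                 # e.g. four pixel intensities
h1 = dense_layer(raw_input, layer1_w, layer1_b)  # first-level representation
h2 = dense_layer(h1, layer2_w, layer2_b)         # higher-level representation
print(h1)
print(h2)
```

Each layer consumes the previous layer's output, so later layers can represent combinations of the simpler features found earlier.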

Our Study Objective: AML Prognosis
- The risk stratification of acute myeloid leukemia (AML) based on recurrent chromosome abnormalities has been well established.
- Similarly, some mutations in AML cases with no chromosome abnormalities are known to play a role in risk stratification.
- Multivariate statistical analysis becomes a challenge with the addition of numerous input variables (mutations from next-gen sequencing).
- Risk classification is difficult to assess for a patient with a particular profile.
(N Engl J Med 366;12, Mar 22, 2012)

Our Hypothesis
- Hypothesis for this study: Deep Learning can be used to accurately predict the prognosis of AML from combined data from several sources.
- Specifically, we attempt to determine the correlation between prognosis and cytogenetics, age, and mutations in acute myeloid leukemia.

Our Study Materials
- 94 AML cases from the TCGA (The Cancer Genome Atlas) database.
- Data include cytogenetics, age, mutations, and prognosis PX (days to death, DTD).
- Cytogenetics (10 common abnormalities): t(8;21), inv(16), t(15;17), t(9;11), t(9;22), trisomy 8, del(7), del(5), del(20), complex chromosome abnormalities
- Mutations (23 most common): FLT3, NPM1, DNMT3A, IDH2, IDH1, TET2, RUNX1, TP53, NRAS, CEBPA, WT1, PTPN11, KIT, U2AF1, KRAS, SMC1A, SMC3, PHF6, STAG2, RAD21, FAM5C, EZH2, HNRNPK
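The study inputs amount to 34 attributes per case (10 cytogenetic abnormalities, age, 23 mutations). A minimal Python sketch of one way such a case could be turned into a numeric input vector follows. The encoding is an assumption: the slides do not say how the inputs were coded, so presence/absence flags and age scaled by a hypothetical maximum of 100 years are used here purely for illustration.

```python
CYTOGENETICS = ["t(8;21)", "inv(16)", "t(15;17)", "t(9;11)", "t(9;22)",
                "trisomy 8", "del(7)", "del(5)", "del(20)", "complex"]
MUTATIONS = ["FLT3", "NPM1", "DNMT3A", "IDH2", "IDH1", "TET2", "RUNX1", "TP53",
             "NRAS", "CEBPA", "WT1", "PTPN11", "KIT", "U2AF1", "KRAS", "SMC1A",
             "SMC3", "PHF6", "STAG2", "RAD21", "FAM5C", "EZH2", "HNRNPK"]


def encode_case(cytogenetics, age_years, mutations, max_age=100.0):
    """Encode one case as 34 numbers: 10 flags, scaled age, 23 flags (assumed scheme)."""
    vector = [1.0 if name in cytogenetics else 0.0 for name in CYTOGENETICS]
    vector.append(min(age_years, max_age) / max_age)
    vector += [1.0 if gene in mutations else 0.0 for gene in MUTATIONS]
    return vector


# Hypothetical case, not taken from TCGA.
example = encode_case({"trisomy 8"}, 62, {"FLT3", "NPM1"})
print(len(example), example)
```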

Method: Deep Learning Programming Platform
- We design Deep Learning neural networks with a stacked (multi-layered) autoencoder in the R language.
- R is a programming language for statistical computing and graphics, supported by the R Foundation for Statistical Computing and currently used extensively in machine learning.
- In this study, we use functions from R packages that are available from the Comprehensive R Archive Network (CRAN).
- Stacked autoencoder algorithm: pre-training with unlabeled data (unsupervised), followed by fine-tuning with labeled data (supervised).
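The unsupervised pre-training stage of a stacked autoencoder can be sketched in plain Python. This is not the authors' R code: the slides do not name the R package, layer sizes, or training settings, so everything below (sigmoid units, squared-error loss, stochastic gradient descent, the toy data, and all names) is an assumption for illustration. Each autoencoder learns to reconstruct its own input, and its hidden-layer output becomes the input of the next one. The supervised fine-tuning stage is not shown.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


class Autoencoder:
    """One sigmoid encoder layer plus one sigmoid decoder layer."""

    def __init__(self, n_in, n_hidden, rng):
        self.w_enc = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b_enc = [0.0] * n_hidden
        self.w_dec = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
        self.b_dec = [0.0] * n_in

    def encode(self, x):
        return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w_enc, self.b_enc)]

    def decode(self, h):
        return [sigmoid(sum(w * hi for w, hi in zip(row, h)) + b)
                for row, b in zip(self.w_dec, self.b_dec)]

    def loss(self, data):
        """Mean squared reconstruction error per example."""
        total = 0.0
        for x in data:
            y = self.decode(self.encode(x))
            total += sum((yi - xi) ** 2 for yi, xi in zip(y, x))
        return total / len(data)

    def train_step(self, x, lr):
        h = self.encode(x)
        y = self.decode(h)
        # Gradient of 0.5 * squared error w.r.t. the decoder pre-activations.
        d_out = [(yi - xi) * yi * (1.0 - yi) for yi, xi in zip(y, x)]
        # Back-propagate to the hidden pre-activations before changing any weights.
        d_hid = [sum(d_out[i] * self.w_dec[i][j] for i in range(len(x))) * h[j] * (1.0 - h[j])
                 for j in range(len(h))]
        for i in range(len(x)):
            for j in range(len(h)):
                self.w_dec[i][j] -= lr * d_out[i] * h[j]
            self.b_dec[i] -= lr * d_out[i]
        for j in range(len(h)):
            for k in range(len(x)):
                self.w_enc[j][k] -= lr * d_hid[j] * x[k]
            self.b_enc[j] -= lr * d_hid[j]


def pretrain_stack(data, hidden_sizes, epochs=500, lr=0.5, seed=0):
    """Greedy layer-wise pre-training: train one autoencoder, encode, train the next."""
    rng = random.Random(seed)
    layers, current = [], data
    for n_hidden in hidden_sizes:
        ae = Autoencoder(len(current[0]), n_hidden, rng)
        for _ in range(epochs):
            for x in current:
                ae.train_step(x, lr)
        layers.append(ae)
        current = [ae.encode(x) for x in current]
    return layers


# Toy unlabeled data (hypothetical): four 6-element binary patterns.
toy = [[1.0, 0.0, 1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0, 0.0, 1.0],
       [1.0, 1.0, 0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]]
loss_before = Autoencoder(6, 3, random.Random(0)).loss(toy)  # same initial weights as layer 1
layers = pretrain_stack(toy, hidden_sizes=[3, 2])
loss_after = layers[0].loss(toy)
print(loss_before, loss_after)
```

After pre-training, the encoder weights would serve as the starting point for a supervised network with an output node for prognosis, which is then fine-tuned on the labeled cases.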

AML Prognosis: Results
- Our Deep Learning model, which incorporates unsupervised feature training, finds excellent correlation between prognosis PX (DTD) and the 10 cytogenetic abnormalities, age, and the 23 most common mutations.
- Median DTD is 730 days. Good PX: DTD > 730 days; Poor PX: DTD <= 730 days.
- Ten-fold validation: exclude 10% of the cases at a time, train the network on the rest, and use the resulting network to test the excluded cases.
- Mean accuracy of 81%, sensitivity of 74%, and specificity of 86%.
- Mean accuracy if some input is excluded:
  Exclusion of cytogenetics -> 67%
  Exclusion of mutations -> 74%
  Exclusion of age -> 61%
  This indicates a critical contribution from all input categories.

Cross-validation data set    Accuracy
1                            90%
2                            80%
3                            80%
4                            90%
5                            100%
6                            80%
7                            70%
8                            70%
9                            80%
10                           75%
Mean accuracy                81%
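The labeling rule, the ten-fold partition, and the reported metrics can be sketched in Python as follows. Only the 730-day cutoff, the 94 cases, and the ten per-fold accuracies come from the slide. The round-robin fold assignment, the choice of "poor" as the positive class for sensitivity, and the five-case example are assumptions for illustration.

```python
MEDIAN_DTD = 730  # days to death; cutoff taken from the slide


def prognosis_label(days_to_death):
    """Good prognosis if DTD > 730 days, poor otherwise."""
    return "good" if days_to_death > MEDIAN_DTD else "poor"


def ten_folds(case_ids, k=10):
    """Partition case ids into k disjoint folds (round-robin assignment is an assumption)."""
    ids = list(case_ids)
    return [ids[i::k] for i in range(k)]


def metrics(true_labels, predicted_labels, positive="poor"):
    """Accuracy, sensitivity, and specificity; the positive class is an assumption."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return {"accuracy": (tp + tn) / len(pairs),
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp)}


folds = ten_folds(range(94))

# Per-fold accuracies (percent) as listed on the slide.
fold_accuracy = [90, 80, 80, 90, 100, 80, 70, 70, 80, 75]
mean_accuracy = sum(fold_accuracy) / len(fold_accuracy)
print(mean_accuracy)  # 81.5, which the slide reports as 81%

# Hypothetical five-case example of the metric calculation.
example = metrics(["poor", "poor", "good", "good", "good"],
                  ["poor", "good", "good", "good", "poor"])
print(example)
```

In each round, nine folds would be used to train the network and the remaining fold to test it, so every case is tested exactly once by a network that never saw it during training.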

The Ranking of All Input Data Used in Training
(The ranking of each input is based on the sum of the absolute weights of the connections from its input node to all the nodes in the first hidden layer.)
Using the 14 top-ranked attributes (out of 34):
- 7 chromosome abnormalities: trisomy 8, del(5), del(7), complex, t(8;21), inv(16), t(15;17)
- Age
- 6 mutations: FLT3, NPM1, TP53, DNMT3A, KIT, CEBPA
Accuracy = 83% (slightly better than the 81% obtained with the 34 original attributes, likely due to data redundancy)
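The ranking rule stated above (sum of the absolute weights from an input node to every node in the first hidden layer) can be sketched in Python. The three input names and the weight values below are hypothetical and serve only to show the calculation; they are not weights from the trained model.

```python
def rank_inputs(input_names, first_layer_weights):
    """Rank inputs by the sum of absolute first-layer weights.

    first_layer_weights[j][i] is the weight from input i to hidden node j.
    """
    scores = {name: sum(abs(row[i]) for row in first_layer_weights)
              for i, name in enumerate(input_names)}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical example: 3 inputs feeding 2 hidden nodes.
names = ["Age", "FLT3", "TP53"]
weights = [[0.9, -0.2, 0.4],
           [-0.7, 0.1, 0.5]]

ranking = rank_inputs(names, weights)
for name, score in ranking:
    print(name, round(score, 2))
```

Keeping only the top-ranked inputs and retraining is then a simple form of feature selection, which is how the 14-attribute model on this slide is described.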

SUMMARY
- Deep Learning, a disruptive technology, is predicted to become an integral part of future practice in molecular diagnosis and prognosis prediction using next-gen sequencing data. Our preliminary study demonstrated a practical application in this area.
- Limitation of our preliminary study: the relatively small cohort (94 cases), due to limited data in the TCGA database.
- The study nevertheless provides excellent preliminary results for future studies that include many more cases, more mutation data, and other clinical data such as a co-morbidity index. With more data, the accuracy is expected to be higher than in this preliminary study (> 83%).
- The successful validation of such deep learning software would be of tremendous value to personalized treatment of AML patients, i.e. stratifying treatment for each patient based on predicted prognosis.
- The software's database can be kept continually up to date by adding new patients' data (with new tests, etc.) to preserve its predictive ability.
- Using input ranking techniques, critical parameters that impact prognosis can be detected, which helps to identify sets of important data for predicting prognosis (novel patterns).