Convolutional and LSTM Neural Networks


Convolutional and LSTM Neural Networks Vanessa Jurtz January 12, 2016

Contents: Neural networks and GPUs; Lasagne; Peptide binding to MHC class II molecules; Convolutional Neural Networks (CNN); Recurrent and LSTM Neural Networks; Hyper-parameter optimization; Training CNN and LSTM ensembles

Neural networks and GPUs. A CPU (central processing unit) has a few cores with lots of cache memory; a GPU (graphics processing unit) has hundreds of cores, but each with little memory. GPUs are very good at lots of small operations that can be heavily parallelized, like matrix multiplication, and training a neural network requires lots of matrix multiplications!
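To make the last point concrete, here is a minimal sketch (not from the slides) showing that the forward pass of one dense layer is a single matrix multiplication, exactly the kind of operation a GPU parallelizes well:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((64, 21))   # a batch of 64 inputs with 21 features each
W = rng.standard_normal((21, 50))   # weights of a hidden layer with 50 units
b = np.zeros(50)

# affine transform + ReLU: the whole batch is processed by one matmul
H = np.maximum(X @ W + b, 0.0)
print(H.shape)                      # (64, 50)
```

On a GPU each of the 64 x 50 output entries can be computed concurrently, which is why frameworks like Lasagne push this operation onto the graphics card.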

Lasagne. Why use Lasagne? It is Python; it is highly flexible, as you will see in the afternoon exercise; code can be run on a CPU or a GPU; and my collaborators are using it too. See the Lasagne documentation.

Peptide binding to MHC class II molecules (figure adapted from http://events.embo.org/12-antigen/). MHC class II is only expressed on professional antigen-presenting cells (dendritic cells, macrophages, B cells, etc.). Only a small fraction of peptides bind MHC class II and are potential T cell epitopes. MHC class II is polygenic and highly polymorphic.

Peptides binding to MHC class II molecules (Nielsen et al. 2010, MHC Class II epitope predictive algorithms, Immunology; peptides binding HLA-DR3 adapted from Kenneth Murphy, Janeway's Immunobiology, 8th edition, fig. 4.21, Garland Science). Why is this a hard problem? Binding peptides have different lengths, the binding core can lie anywhere within a peptide, and there is a large number of different MHC class II molecules.

CNN - Convolutional Neural Networks. A convolutional filter has a specified input size and slides over the input sequence, feeding into a different hidden neuron at each position. The weights of the filter are fixed as it slides over the input and are updated during backpropagation. Usually multiple filters of the same and/or different sizes are used.
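The sliding-filter idea can be sketched in a few lines of numpy (an illustration, not the authors' code; the one-hot encoding and the example peptide are assumptions): one filter of size 9 is applied with the same weights at every position of an encoded peptide, yielding one activation per window.

```python
import numpy as np

AA = "ACDEFGHIKLMNPQRSTVWY"                      # the 20 standard amino acids

def one_hot(peptide):
    """Encode a peptide as a (length, 20) one-hot matrix."""
    x = np.zeros((len(peptide), 20))
    for i, aa in enumerate(peptide):
        x[i, AA.index(aa)] = 1.0
    return x

rng = np.random.default_rng(1)
filt = rng.standard_normal((9, 20))              # one filter spanning 9 residues

pep = one_hot("GILGFVFTLTVAAGK")                 # a 15-mer -> 15 - 9 + 1 = 7 windows
activations = np.array([np.sum(pep[i:i + 9] * filt)
                        for i in range(len(pep) - 9 + 1)])
print(activations.shape)                         # (7,)
```

Because the same weights score every window, the filter can detect a 9-mer motif (such as a binding core) wherever it occurs in the peptide.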

CNN - Convolutional Neural Networks. Why could this be useful for predicting peptide binding to MHC class II? Use filters of size 9 AA to focus on the binding core, and integrate information about all 9-mers.

CNN - Convolutional Neural Networks. CNNs are known to work well for image analysis; see the CIFAR example at http://deeplearning.net/tutorial/lenet.html

CNN - Convolutional Neural Networks A biological example: subcellular localization of proteins Sønderby et al. 2015 Convolutional LSTM Networks for Subcellular Localization of Proteins http://arxiv.org/abs/1503.01919

Recurrent Neural Networks have connections between the neurons in the hidden layer, have a memory, and learn what to store and what to ignore (Graves 2015, Supervised Sequence Labelling with Recurrent Neural Networks).

LSTM memory blocks. Instead of a hidden neuron, we use an LSTM memory block (Graves 2015, Supervised Sequence Labelling with Recurrent Neural Networks).
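A hedged numpy sketch of one memory-block time step may help here. This is the standard LSTM formulation (as in Graves 2015), not code from the slides: input, forget and output gates decide what is written to, kept in, and read out of the cell state c.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One time step. W, U, b hold the four gate transforms stacked."""
    z = W @ x + U @ h + b
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)   # gates squashed to (0, 1)
    c_new = f * c + i * np.tanh(g)                 # forget old content, write new
    h_new = o * np.tanh(c_new)                     # read out the cell state
    return h_new, c_new

rng = np.random.default_rng(2)
n_in, n_hid = 20, 16
W = rng.standard_normal((4 * n_hid, n_in)) * 0.1
U = rng.standard_normal((4 * n_hid, n_hid)) * 0.1
b = np.zeros(4 * n_hid)

h = c = np.zeros(n_hid)
for t in range(10):                                # run over a length-10 sequence
    x = rng.standard_normal(n_in)
    h, c = lstm_step(x, h, c, W, U, b)
print(h.shape)                                     # (16,)
```

When the forget gate f stays close to 1, the cell state c is carried forward almost unchanged, which is what lets the block retain information over long stretches of the sequence.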

LSTM Networks. Why make it more complicated? LSTM memory blocks allow the network to store and access information over longer periods of time (Graves 2015, Supervised Sequence Labelling with Recurrent Neural Networks).

LSTM Networks. Input and output can be of variable length! See http://karpathy.github.io/2015/05/21/rnn-effectiveness/. Example, peptide MHC class II binding: many to one.

LSTM Networks. LSTM networks are known to work well for natural language processing; check out Andrej Karpathy's blog to read up (Graves 2015).

LSTM Networks. Another biological application: protein secondary structure prediction (http://chemistry.berea.edu/~biochemistry/Anthony/Aequorin/Aequorin3.php; Sønderby et al. 2015, Convolutional LSTM Networks for Subcellular Localization of Proteins, http://arxiv.org/abs/1503.01919).

Hyper-parameter optimization. Parameters: the weights. Hyper-parameters: weight initialization, weight update rule, activation function, data encoding, dropout, learning rate, number of CNN filters / number of LSTM memory blocks, hidden layer size (Bergstra et al. 2012, Random Search for Hyper-Parameter Optimization, Journal of Machine Learning Research).
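In the spirit of the random search of Bergstra et al. (2012), the procedure can be sketched as follows: sample hyper-parameter settings at random, score each, and keep the best. Here `evaluate` is a hypothetical stand-in; in practice it would train a network and return its validation performance.

```python
import random

random.seed(0)

# assumed search space for illustration, not the slides' exact ranges
search_space = {
    "learning_rate": lambda: 10 ** random.uniform(-5, -1),
    "n_filters":     lambda: random.choice([10, 20, 40, 80]),
    "dropout":       lambda: random.uniform(0.0, 0.5),
}

def evaluate(params):
    # placeholder score; in practice: train a network, return validation AUC
    return 1.0 - abs(params["learning_rate"] - 0.01) - params["dropout"] * 0.1

# one "trial" = one randomly sampled setting
trials = [{name: draw() for name, draw in search_space.items()}
          for _ in range(50)]
best = max(trials, key=evaluate)
print(sorted(best))
```

The appeal of random search over a grid is that it explores many distinct values of each hyper-parameter for the same number of trials, which matters when only a few hyper-parameters dominate performance.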

Hyper-parameter optimization - architectures. [Diagrams: CNN and LSTM architectures taking the peptide and the MHC II pseudo sequence as input, followed by a fully connected layer and the output; the CNN filter size was varied (9 AA, 9+3 AA).]

Hyper-parameter optimization - results. [Plots: best AUC per number of trials for the CNN (1 to 100 trials) and the LSTM (1 to 200 trials), on validation and evaluation sets; bar plot comparing single-network PCC of the 10% best CNN and LSTM nets against the entire NetMHCIIpan-3.0 ensemble.]

Training CNN and LSTM ensembles. CNN: an ensemble of the 10 best hyper-parameter settings + 5 seeds each.

Training CNN and LSTM ensembles. LSTM: an ensemble of the 2 best hyper-parameter settings + 2 seeds each.
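The ensembling step itself is simple: average the predictions of networks trained with different hyper-parameter settings and random seeds. A hedged sketch, where `predict` is a hypothetical stand-in for one trained network (here the networks differ only by seed):

```python
import numpy as np

def predict(features, seed):
    # stand-in for one trained network's binding-score output in (0, 1)
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(features.shape[-1])
    return 1.0 / (1.0 + np.exp(-features @ w))

x = np.full(20, 0.1)                                  # toy encoded input

# e.g. 10 hyper-parameter settings x 5 seeds each = 50 ensemble members
scores = [predict(x, seed) for seed in range(50)]
ensemble_score = float(np.mean(scores))
print(round(ensemble_score, 3))
```

Averaging reduces the variance that comes from random initialization and from the choice of hyper-parameters, which is why the ensemble typically outperforms any single member.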

Future work: combine CNN and LSTM in one network; make an ensemble of different network architectures (CNN, LSTM, feed-forward); try to visualize what the networks learn; try to find a way to extract/visualize the binding core.

Acknowledgements: Casper Sønderby, Søren Sønderby, Ole Winther, Morten Nielsen