Convolutional Neural Networks for Text Classification

Similar documents
COMP9444 Neural Networks and Deep Learning 5. Convolutional Networks

Convolutional Neural Networks (CNN)

CSE Introduction to High-Perfomance Deep Learning ImageNet & VGG. Jihyung Kil

Convolutional and LSTM Neural Networks

Interpreting Deep Neural Networks and their Predictions

Convolutional and LSTM Neural Networks

A convolutional neural network to classify American Sign Language fingerspelling from depth and colour images

Supplementary Online Content

Flexible, High Performance Convolutional Neural Networks for Image Classification

An Artificial Neural Network Architecture Based on Context Transformations in Cortical Minicolumns

Medical Knowledge Attention Enhanced Neural Model. for Named Entity Recognition in Chinese EMR

Efficient Deep Model Selection

Improving the Interpretability of DEMUD on Image Data Sets

c Copyright 2017 Cody Burkard

Deep Learning for Computer Vision

arxiv: v1 [stat.ml] 23 Jan 2017

Learning Convolutional Neural Networks for Graphs

Motivation: Attention: Focusing on specific parts of the input. Inspired by neuroscience.

CS-E Deep Learning Session 4: Convolutional Networks

On Training of Deep Neural Network. Lornechen

Factoid Question Answering

ACUTE LEUKEMIA CLASSIFICATION USING CONVOLUTION NEURAL NETWORK IN CLINICAL DECISION SUPPORT SYSTEM

Automated diagnosis of pneumothorax using an ensemble of convolutional neural networks with multi-sized chest radiography images

Identifying Cognitive Distortion by Convolutional Neural Network based Text Classification

Image Captioning using Reinforcement Learning. Presentation by: Samarth Gupta

PMR5406 Redes Neurais e Lógica Fuzzy. Aula 5 Alguns Exemplos

POC Brain Tumor Segmentation. vlife Use Case

Deep Learning based Information Extraction Framework on Chinese Electronic Health Records

Risk Prediction with Electronic Health Records: A Deep Learning Approach

arxiv: v2 [cs.cv] 19 Dec 2017

AN EFFICIENT DIGITAL SUPPORT SYSTEM FOR DIAGNOSING BRAIN TUMOR

Automatic Classification of Perceived Gender from Facial Images

Differential Attention for Visual Question Answering

A HMM-based Pre-training Approach for Sequential Data

Keywords Artificial Neural Networks (ANN), Echocardiogram, BPNN, RBFNN, Classification, survival Analysis.

arxiv: v1 [cs.ai] 28 Nov 2017

Tumor Cellularity Assessment. Rene Bidart

arxiv: v2 [cs.lg] 1 Jun 2018

A Deep Siamese Neural Network Learns the Human-Perceived Similarity Structure of Facial Expressions Without Explicit Categories

Shu Kong. Department of Computer Science, UC Irvine

Understanding Convolutional Neural

Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering

Character-based Embedding Models and Reranking Strategies for Understanding Natural Language Meal Descriptions

Automatic Quality Assessment of Cardiac MRI

Convolutional Neural Networks for Estimating Left Ventricular Volume

Shu Kong. Department of Computer Science, UC Irvine

Retinopathy Net. Alberto Benavides Robert Dadashi Neel Vadoothker

3D Deep Learning for Multi-modal Imaging-Guided Survival Time Prediction of Brain Tumor Patients

Early Diagnosis of Autism Disease by Multi-channel CNNs

Sleep Staging with Deep Learning: A convolutional model

Deep learning and non-negative matrix factorization in recognition of mammograms

ECG Beat Recognition using Principal Components Analysis and Artificial Neural Network

CPSC81 Final Paper: Facial Expression Recognition Using CNNs

Skin cancer reorganization and classification with deep neural network

Using Deep Convolutional Networks for Gesture Recognition in American Sign Language

Grounding Ontologies in the External World

arxiv: v2 [cs.cv] 22 Mar 2018

arxiv: v2 [cs.cl] 12 Jul 2018

Vector Learning for Cross Domain Representations

arxiv: v1 [cs.cl] 15 Aug 2017

Discovering and Understanding Self-harm Images in Social Media. Neil O Hare, MFSec Bucharest, Romania, June 6 th, 2017

Deep Neural Networks: A New Framework for Modeling Biological Vision and Brain Information Processing

A CONVOLUTION NEURAL NETWORK ALGORITHM FOR BRAIN TUMOR IMAGE SEGMENTATION

Highly Accurate Brain Stroke Diagnostic System and Generative Lesion Model. Junghwan Cho, Ph.D. CAIDE Systems, Inc. Deep Learning R&D Team

Memory-Augmented Active Deep Learning for Identifying Relations Between Distant Medical Concepts in Electroencephalography Reports

Task 1: Machine Learning with Spike-Timing-Dependent Plasticity (STDP)

Satoru Hiwa, 1 Kenya Hanawa, 2 Ryota Tamura, 2 Keisuke Hachisuka, 3 and Tomoyuki Hiroyasu Introduction

ITERATIVELY TRAINING CLASSIFIERS FOR CIRCULATING TUMOR CELL DETECTION

Facial Expression Classification Using Convolutional Neural Network and Support Vector Machine

How can Natural Language Processing help MedDRA coding? April Andrew Winter Ph.D., Senior Life Science Specialist, Linguamatics

Chapter 1. Introduction

Understanding Biological Visual Attention Using Convolutional Neural Networks

Speech Emotion Detection and Analysis

Introduction to Machine Learning. Katherine Heller Deep Learning Summer School 2018

MASTER S THESIS. Ali El Adi. Deep neural networks to forecast cardiac and respiratory deterioration of intensive care patients

The Impact of Visual Saliency Prediction in Image Classification

EECS 433 Statistical Pattern Recognition

Artificial Neural Networks (Ref: Negnevitsky, M. Artificial Intelligence, Chapter 6)

DeepASL: Enabling Ubiquitous and Non-Intrusive Word and Sentence-Level Sign Language Translation

Neuromorphic convolutional recurrent neural network for road safety or safety near the road

Rich feature hierarchies for accurate object detection and semantic segmentation

Y-Net: Joint Segmentation and Classification for Diagnosis of Breast Biopsy Images

IN this paper we examine the role of shape prototypes in

Question 1 Multiple Choice (8 marks)

Hierarchical Convolutional Features for Visual Tracking

UNIVERSITY of PENNSYLVANIA CIS 520: Machine Learning Final, Fall 2014

A Novel Capsule Neural Network Based Model For Drowsiness Detection Using Electroencephalography Signals

A Deep Learning Approach for Subject Independent Emotion Recognition from Facial Expressions

arxiv: v2 [cs.lg] 3 Apr 2019

Interpretable & Transparent Deep Learning

A Vision-based Affective Computing System. Jieyu Zhao Ningbo University, China

Automatic Context-Aware Image Captioning

Do Deep Neural Networks Suffer from Crowding?

Hand Sign to Bangla Speech: A Deep Learning in Vision based system for Recognizing Hand Sign Digits and Generating Bangla Speech

High-Resolution Breast Cancer Screening with Multi-View Deep Convolutional Neural Networks

Rumor Detection on Twitter with Tree-structured Recursive Neural Networks

Patient2Vec: A Personalized Interpretable Deep Representation of the Longitudinal Electronic Health Record

IDENTIFYING STRESS BASED ON COMMUNICATIONS IN SOCIAL NETWORKS

Transcription:

Convolutional Neural Networks for Text Classification Sebastian Sierra MindLab Research Group July 1, 2016 ebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 1 / 32

Outline 1 What is a Convolution? 2 What are Convolutional Neural Networks? 3 CNN for NLP 4 CNN hyperparameters 5 Example: The Model 6 Bibliography Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 2 / 32

Outline 1 What is a Convolution? 2 What are Convolutional Neural Networks? 3 CNN for NLP 4 CNN hyperparameters 5 Example: The Model 6 Bibliography Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 3 / 32

What is a Convolution? Convolutions are great for extracting features from Images. Convolutional Neural Networks (CNN) are biologically-inspired variants of MLPs Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 4 / 32

Green: Input, Yellow: Convolutional Filter, Red: Output ebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 5 / 32

Green: Input, Yellow: Convolutional Filter, Red: Output ebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 5 / 32

Green: Input, Yellow: Convolutional Filter, Red: Output ebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 5 / 32

Green: Input, Yellow: Convolutional Filter, Red: Output ebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 5 / 32

Green: Input, Yellow: Convolutional Filter, Red: Output ebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 5 / 32

Green: Input, Yellow: Convolutional Filter, Red: Output ebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 5 / 32

Green: Input, Yellow: Convolutional Filter, Red: Output ebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 5 / 32

Green: Input, Yellow: Convolutional Filter, Red: Output ebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 5 / 32

Green: Input, Yellow: Convolutional Filter, Red: Output ebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 5 / 32

Outline 1 What is a Convolution? 2 What are Convolutional Neural Networks? 3 CNN for NLP 4 CNN hyperparameters 5 Example: The Model 6 Bibliography Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 6 / 32

Convolutional Neural Network Figure 1: Close up of Convolutional Neural Network Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 7 / 32

Convolutional Neural Network CNNs are networks composed of several layers of convolutions with nonlinear activation functions like ReLU or tanh applied to the results. Traditional Layers are fully connected, instead CNN use local connections. Each layer applies different filters (thousands) like the ones showed above, and combines their results. Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 8 / 32

Properties of Convolutional Neural Networks Local Invariance Compositionality Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 9 / 32

Outline 1 What is a Convolution? 2 What are Convolutional Neural Networks? 3 CNN for NLP 4 CNN hyperparameters 5 Example: The Model 6 Bibliography Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 10 / 32

CNN in NLP Do they make sense in NLP? Perhaps Recurrent Neural Networks would make more sense trying to learn patterns extracted from a text sequence. They are not cognitively or linguistically plausible. Advantage There are fast GPU implementations for CNNs Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 11 / 32

An example of how CNN work Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 12 / 32

An example of how CNN work Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 13 / 32

An example of how CNN work Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 14 / 32

An example of how CNN work Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 15 / 32

An example of how CNN work Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 16 / 32

An example of how CNN work Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 17 / 32

An example of how CNN work Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 18 / 32

An example of how CNN work Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 19 / 32

An example of how CNN work Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 20 / 32

An example of how CNN work Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 21 / 32

Char-CNN Let s try this model Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 22 / 32

Char-CNN Let s try this model C R d l : Matrix representation of a sequence of length l(140, 300,?). H R d w : Convolutional filter matrix where, d : Dimensionality of character embeddings (used 30) w : Width of convolution filter (3, 4, 5) ebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 22 / 32

A simple architecture Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 23 / 32

Steps for applying a CNN 1 Apply a convolution between C and H to obtain a vector f R l w+1 f [i] = C[, i : i + w 1], H A, B is the Frobenius inner product. Tr(AB T ) 2 This vector f is also known as feature map. 3 Take the maximum value over time as the feature that represents filter H. (K-max pooling) f = relu(max{f [i]} + b) i 4 Then we do the same for all m filters. z = [ f i,..., f m ] Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 24 / 32

Intution behind Why ReLU and not tanh? Should I use multiple filter weights H? Should I use variable filter widths w? Can I add another channel as in Computer Vision domain? Is Max-Pooling capturing the most important activation? Would they capture morphological relations? Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 25 / 32

Outline 1 What is a Convolution? 2 What are Convolutional Neural Networks? 3 CNN for NLP 4 CNN hyperparameters 5 Example: The Model 6 Bibliography Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 26 / 32

Regularization tricks Use dropout (Gradients are only backpropagated through certain inputs of z). Constrain L 2 norms of weight vectors of each class(rows in Softmax matrx W (S) ) to a fixed number: If W c (S) > s, the rescale it so W c (S) = s. Early Stopping Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 27 / 32

CNN in NLP - Previous Work Previous works: NLP from scratch (Collobert et al. 2011). Sentence or paragraph modelling using words as input (Kim 2014; Kalchbrenner, Grefenstette, and Blunsom 2014; Johnson and T. Zhang 2015a; Johnson and T. Zhang 2015b). Text classification using characters as input (Kim et al. 2016; X. Zhang, Zhao, and LeCun 2015) Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 28 / 32

Outline 1 What is a Convolution? 2 What are Convolutional Neural Networks? 3 CNN for NLP 4 CNN hyperparameters 5 Example: The Model 6 Bibliography Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 29 / 32

Outline 1 What is a Convolution? 2 What are Convolutional Neural Networks? 3 CNN for NLP 4 CNN hyperparameters 5 Example: The Model 6 Bibliography Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 30 / 32

Collobert, Ronan et al. (2011). Natural language processing (almost) from scratch. In: Journal of Machine Learning Research 12.Aug, pp. 2493 2537. Johnson, Rie and Tong Zhang (2015a). Effective Use of Word Order for Text Categorization with Convolutional Neural Networks. In: Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Denver, Colorado: Association for Computational Linguistics, pp. 103 112. url: http://www.aclweb.org/anthology/n15-1011. (2015b). Semi-supervised convolutional neural networks for text categorization via region embedding. In: Advances in neural information processing systems, pp. 919 927. Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 31 / 32

Kalchbrenner, Nal, Edward Grefenstette, and Phil Blunsom (2014). A Convolutional Neural Network for Modelling Sentences. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Baltimore, Maryland: Association for Computational Linguistics, pp. 655 665. url: http://www.aclweb.org/anthology/p14-1062. Kim, Yoon (2014). Convolutional Neural Networks for Sentence Classification. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Doha, Qatar: Association for Computational Linguistics, pp. 1746 1751. url: http://www.aclweb.org/anthology/d14-1181. Kim, Yoon et al. (2016). Character-Aware Neural Language Models. In: Thirtieth AAAI Conference on Artificial Intelligence. Zhang, Xiang, Junbo Zhao, and Yann LeCun (2015). Character-level Convolutional Networks for Text Classification. In: Advanced in Neural Information Processing Systems (NIPS 2015). Vol. 28. Sebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 32 / 32