Deep Architectures for Neural Machine Translation
Deep Architectures for Neural Machine Translation
Antonio Valerio Miceli Barone, Jindřich Helcl, Rico Sennrich, Barry Haddow, Alexandra Birch
School of Informatics, University of Edinburgh
Faculty of Mathematics and Physics, Charles University
September 8, 2017
Miceli Barone, Helcl, Sennrich, Haddow, Birch, Deep Architectures for NMT, 1 / 16
Deep architectures
What is the depth of a recurrent neural network? [Pascanu et al., 2014]
- Transition depth: used for language modelling [Zilly et al., 2016]
- Stacked depth: Baidu [Zhou et al., 2016], Google [Wu et al., 2016]

This work:
- Provide a systematic comparison of deep architectures for NMT
- Investigate the effects of different kinds of depth
- Propose a combined "BiDeep" architecture
Deep transition encoder
[Figure: bidirectional encoder; at each time step i, transition blocks GRU_1 ... GRU_Ls are applied in sequence over the inputs x_1 ... x_N]

h_{i,1} = GRU_1(x_i, h_{i-1,L_s})
h_{i,k} = GRU_k(0, h_{i,k-1})   for 1 < k <= L_s

- Bidirectional encoder
- The next state is computed by a deep feed-forward network made of multiple GRU transition blocks
- The GRU blocks are not individually recurrent; recurrence exists only at the time-step level
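The recurrence above can be sketched with a toy NumPy GRU. This is an illustrative sketch, not the Nematus implementation: the state size d, transition depth L_s, and random toy parameters are all assumptions made for demonstration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru(x, h, W, U, b):
    # W, U, b stack the update / reset / candidate parameters along axis 0.
    z = sigmoid(W[0] @ x + U[0] @ h + b[0])        # update gate
    r = sigmoid(W[1] @ x + U[1] @ h + b[1])        # reset gate
    c = np.tanh(W[2] @ x + U[2] @ (r * h) + b[2])  # candidate state
    return (1.0 - z) * h + z * c

def deep_transition_step(x, h_prev, params):
    # GRU_1 reads the token embedding and the previous time step's final state;
    # GRU_2 .. GRU_Ls refine the state with a zero input, so the blocks are not
    # individually recurrent: recurrence exists only across time steps.
    h = gru(x, h_prev, *params[0])
    zero = np.zeros_like(x)
    for p in params[1:]:
        h = gru(zero, h, *p)
    return h

rng = np.random.default_rng(0)
d, L_s = 8, 4                          # assumed state size and transition depth
params = [(0.1 * rng.standard_normal((3, d, d)),
           0.1 * rng.standard_normal((3, d, d)),
           np.zeros((3, d))) for _ in range(L_s)]
h = np.zeros(d)
for x_i in rng.standard_normal((5, d)):   # five toy input embeddings
    h = deep_transition_step(x_i, h, params)
print(h.shape)
```

A bidirectional encoder would run this step once left-to-right and once right-to-left and combine the two state sequences.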
Deep transition decoder
[Figure: decoder with transition blocks GRU_1 ... GRU_Lt and attention (ATT) over outputs y_0 ... y_M]

s_{j,1} = GRU_1(y_{j-1}, s_{j-1,L_t})
s_{j,2} = GRU_2(ATT(C, s_{j,1}), s_{j,1})
s_{j,k} = GRU_k(0, s_{j,k-1})   for 2 < k <= L_t

- The attention mechanism sits between the 1st and 2nd transition layers (as in Nematus)
- The minimum transition depth is therefore 2, even in the baseline
Alternating stacked encoder
[Figure: stacked GRU levels with alternating scan directions over inputs x_1 ... x_N]
Baidu [Zhou et al., 2016]
- Multiple levels of individually recurrent GRU cells
- Residual connections between levels
- Bidirectional: two columns with opposing scanning directions
- Each level inverts the scanning direction of the previous one
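A single column of the alternating design can be sketched as follows; the toy weights, state size, and depth are assumptions for illustration (a full model would run a second column with the opposite starting direction).

```python
import numpy as np

def run_gru(xs, W, U, b):
    # Unidirectional GRU over a sequence; returns all hidden states.
    sig = lambda v: 1.0 / (1.0 + np.exp(-v))
    h = np.zeros(W.shape[1])
    out = []
    for x in xs:
        z = sig(W[0] @ x + U[0] @ h + b[0])
        r = sig(W[1] @ x + U[1] @ h + b[1])
        c = np.tanh(W[2] @ x + U[2] @ (r * h) + b[2])
        h = (1.0 - z) * h + z * c
        out.append(h)
    return np.stack(out)

def alternating_encoder(xs, layers):
    # Even levels scan left-to-right, odd levels right-to-left;
    # residual connections add each level's input to its output.
    states = xs
    for k, (W, U, b) in enumerate(layers):
        inp = states[::-1] if k % 2 else states   # alternate scan direction
        out = run_gru(inp, W, U, b)
        out = out[::-1] if k % 2 else out         # restore time order
        states = out + states                     # residual connection
    return states

rng = np.random.default_rng(1)
d, depth, n = 8, 4, 5
layers = [(0.1 * rng.standard_normal((3, d, d)),
           0.1 * rng.standard_normal((3, d, d)),
           np.zeros((3, d))) for _ in range(depth)]
enc = alternating_encoder(rng.standard_normal((n, d)), layers)
print(enc.shape)
```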
Biunidirectional stacked encoder
[Figure: stacked GRU levels over inputs x_1 ... x_N; bidirectional lower levels, unidirectional upper levels]
Google [Wu et al., 2016]
- Multiple levels of individually recurrent GRU cells
- Residual connections between levels
- The first levels are bidirectional; their states are then merged
- Higher levels are unidirectional, left-to-right
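A minimal sketch of this layout is below. Merging the two directions by summation is a simplifying assumption made here so that all levels share one state size; concatenation followed by a projection is another common choice.

```python
import numpy as np

def run_gru(xs, W, U, b):
    # Unidirectional GRU over a sequence; returns all hidden states.
    sig = lambda v: 1.0 / (1.0 + np.exp(-v))
    h = np.zeros(W.shape[1])
    out = []
    for x in xs:
        z = sig(W[0] @ x + U[0] @ h + b[0])
        r = sig(W[1] @ x + U[1] @ h + b[1])
        c = np.tanh(W[2] @ x + U[2] @ (r * h) + b[2])
        h = (1.0 - z) * h + z * c
        out.append(h)
    return np.stack(out)

def biunidirectional_encoder(xs, fwd, bwd, uni_layers):
    # First level is bidirectional; the forward and backward state
    # sequences are merged (here by summation, an assumption).
    states = run_gru(xs, *fwd) + run_gru(xs[::-1], *bwd)[::-1]
    # Higher levels are unidirectional left-to-right, with residuals.
    for layer in uni_layers:
        states = run_gru(states, *layer) + states
    return states

rng = np.random.default_rng(2)
d, n = 8, 5
make = lambda: (0.1 * rng.standard_normal((3, d, d)),
                0.1 * rng.standard_normal((3, d, d)),
                np.zeros((3, d)))
enc = biunidirectional_encoder(rng.standard_normal((n, d)),
                               make(), make(), [make() for _ in range(3)])
print(enc.shape)
```

The unidirectional upper levels make this design easy to parallelize across depth at inference time, which is a stated motivation in the GNMT paper.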
Stacked decoder
[Figure: stacked GRU levels over outputs y_0 ... y_M]
- Multiple levels of individually recurrent GRU cells
- Residual connections between levels
- Different variations depending on how attention is used in the higher layers (details in the paper)
BiDeep architectures
Our proposal:
- Combine the two kinds of depth
- Stacked levels of recurrent cells, each with multiple layers of transition depth
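The combination can be sketched by stacking deep-transition cells. Again the weights, sizes, and the choice to apply residual connections only from the second stacked level upward are assumptions for illustration, not the paper's exact configuration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru(x, h, W, U, b):
    z = sigmoid(W[0] @ x + U[0] @ h + b[0])
    r = sigmoid(W[1] @ x + U[1] @ h + b[1])
    c = np.tanh(W[2] @ x + U[2] @ (r * h) + b[2])
    return (1.0 - z) * h + z * c

def deep_transition_cell(x, h_prev, blocks):
    # One stacked level: a deep-transition GRU cell made of several blocks;
    # only the first block reads the input.
    h = gru(x, h_prev, *blocks[0])
    for p in blocks[1:]:
        h = gru(np.zeros_like(h), h, *p)
    return h

def bideep_encoder(xs, levels):
    # levels[l] holds the transition blocks of stacked level l. Each stacked
    # level is individually recurrent; residual connections between levels
    # (applied from the second level up so that shapes match, an assumption).
    states = xs
    for l, blocks in enumerate(levels):
        h = np.zeros(xs.shape[1])
        out = []
        for x in states:
            h = deep_transition_cell(x, h, blocks)
            out.append(h)
        out = np.stack(out)
        states = out + states if l > 0 else out
    return states

rng = np.random.default_rng(3)
d, n, stacked, trans = 8, 5, 2, 2   # 2 stacked levels x transition depth 2
make = lambda: (0.1 * rng.standard_normal((3, d, d)),
                0.1 * rng.standard_normal((3, d, d)),
                np.zeros((3, d)))
levels = [[make() for _ in range(trans)] for _ in range(stacked)]
enc = bideep_encoder(rng.standard_normal((n, d)), levels)
print(enc.shape)
```

Total depth is the product of the two factors, so a 2x2 BiDeep model matches a depth-4 stacked or deep-transition model in layer count.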
Experimental setup
Data:
- Training: WMT 2017 English-to-German
- Validation: newstest2013
- Test: newstest (we report averages)
System: Nematus [Sennrich et al., 2017b]
- GRU sequence-to-sequence with attention [Bahdanau et al., 2015]
- Layer normalization [Ba et al., 2016]
- Training on a single Titan X (Pascal) GPU
Results: deep encoders
[Chart: test BLEU and training speed (tokens/sec, axis 800 to 3,600) for shallow, biunidirectional, alternating, and deep-transition encoders; exact values not recoverable from this transcription]
- Depth-4 encoders improve translation quality
- Deep transition is both the fastest and the most accurate
Results: deep decoders
[Chart: test BLEU and training speed (tokens/sec) for shallow, rGRU (reuse attention), cGRU (multiple attention), and deep-transition decoders; exact values not recoverable from this transcription]
- Depth-4 decoders improve translation quality
- Deep transition is the fastest and the second most accurate
Results: deep encoders and decoders
[Chart: test BLEU and training speed for deep-transition encoder, alternating + rGRU, biunidirectional + rGRU, deep-transition encoder/decoder, and BiDeep; exact values not recoverable from this transcription]
- Depth-4 on both encoder and decoder yields a small further improvement
- Biunidirectional + rGRU [Wu et al., 2016] performs worst
- Alternating + rGRU [Zhou et al., 2016] and BiDeep are tied at this depth
Results: deep encoders and decoders (depth 8)
[Chart: test BLEU and training speed for shallow, depth-4 and depth-8 alternating + rGRU, and depth-4 and depth-8 BiDeep; exact values not recoverable from this transcription]
- The stacked-only model plateaus at depth 8
- BiDeep keeps improving
Error analysis
- Deep transition decoders have a longer information path
- In principle, this might cause fading memory and vanishing gradients
- Does this affect long-distance dependencies?

Lingeval97 subject-verb agreement [Sennrich, 2017]
[Chart: contrastive-evaluation accuracy as a function of subject-verb distance for shallow, stacked, deep-transition, and BiDeep GRU models]
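Contrastive evaluation, as used in Lingeval97, works by letting the model score a correct target sentence against a minimally different incorrect variant. A hedged sketch of the scoring loop follows; `toy_score` is a purely illustrative stand-in, where a real evaluation would use the NMT model's log-probability of the target given the source.

```python
def contrastive_accuracy(pairs, score_fn):
    """pairs: list of (correct_target, contrastive_target) strings.
    A pair counts as a hit when the correct sentence scores higher."""
    hits = sum(score_fn(good) > score_fn(bad) for good, bad in pairs)
    return hits / len(pairs)

# Toy scorer (illustrative assumption): prefers shorter strings. A real
# scorer would return the model's log P(target | source).
def toy_score(sentence):
    return -len(sentence)

pairs = [("the dogs bark", "the dogs barks"),
         ("she runs", "she run")]
acc = contrastive_accuracy(pairs, toy_score)
print(acc)  # -> 0.5 with this toy scorer
```

Binning the pairs by subject-verb distance, as in the chart above, then shows whether accuracy degrades for long-range dependencies.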
Conclusions
Findings:
- Depth improves translation BLEU, especially in the encoder
- Alternating stacked encoders outperform biunidirectional ones
- Deep transition encoders perform better or equally well
- BiDeep architectures perform best
- We validated these findings on the WMT 2017 news translation task
Recommendations:
- Use deep transition for speed and model size
- Use BiDeep for maximum quality
Conclusions
- Code is in the main Nematus repository
- Scripts and paper:
Thanks for your attention. Questions?
Results: WMT 2017 news translation task
[Chart: BLEU for shallow vs. deep models on ZH-EN, TR-EN, RU-EN, LV-EN, DE-EN, CS-EN and the reverse directions; most values not recoverable from this transcription]
- Deep models use transition depth (Czech: stacked)
- Improvement on all language pairs except Latvian-English