Minimum Risk Training For Neural Machine Translation. Shiqi Shen, Yong Cheng, Zhongjun He, Wei He, Hua Wu, Maosong Sun, and Yang Liu
1 Minimum Risk Training for Neural Machine Translation. Shiqi Shen, Yong Cheng, Zhongjun He, Wei He, Hua Wu, Maosong Sun, and Yang Liu. ACL 2016, Berlin, Germany, August 2016
2 Machine Translation. MT: using computers to translate natural languages. Example output: Bush held a talk with Sharon
3 Our Work: a new training criterion for NMT; eliminating the discrepancy between training and testing; significant improvement on the NIST03 dataset. Table columns: System, Training, BLEU, TER. Table rows: SMT (Moses, trained with MERT); NMT (RNNSearch, trained with MLE); NMT (RNNSearch, trained with MRT). (Koehn and Hoang, 2007; Bahdanau et al., 2015)
4 Outline: Introduction to Neural MT; Maximum Likelihood Estimation; Minimum Risk Training; Experiments; Conclusion
5 Modeling. Key problem: how to model the translation process?
6 Modeling. SMT: describing the translation process via latent structures (Brown et al., 1993)
7 Modeling. NMT: describing the translation process via neural networks (Kalchbrenner and Blunsom, 2013; Sutskever et al., 2014; Bahdanau et al., 2015)
8 Attentional NMT. Example output: Bush held a talk with Sharon (Bahdanau et al., 2015)
9 Outline: Introduction to Neural MT; Maximum Likelihood Estimation; Minimum Risk Training; Experiments; Conclusion
10 Maximum Likelihood Estimation. MLE is the standard training criterion for NMT (Kalchbrenner and Blunsom, 2013; Sutskever et al., 2014; Bahdanau et al., 2015).
training data: $\{\langle x^{(s)}, y^{(s)} \rangle\}_{s=1}^{S}$
objective: $\mathcal{L}(\theta) = \sum_{s=1}^{S} \log P(y^{(s)} \mid x^{(s)}; \theta) = \sum_{s=1}^{S} \sum_{n=1}^{N^{(s)}} \log P(y_n^{(s)} \mid x^{(s)}, y_{<n}^{(s)}; \theta)$
optimization: $\hat{\theta}_{\mathrm{MLE}} = \operatorname{argmax}_{\theta} \{ \mathcal{L}(\theta) \}$
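The MLE objective on slide 10 is a sum of per-word log-probabilities over all training sentences. A minimal sketch of that sum follows, assuming the model's probabilities for the reference words are already available; the function names and the toy numbers are hypothetical and not taken from the talk.

```python
import math


def sentence_log_likelihood(token_probs):
    """Sum of log P(y_n | x, y_<n; theta) over the target words of one sentence."""
    return sum(math.log(p) for p in token_probs)


def mle_objective(corpus_token_probs):
    """L(theta): sum of sentence log-likelihoods over the whole training set."""
    return sum(sentence_log_likelihood(probs) for probs in corpus_token_probs)


# Hypothetical probabilities that a model assigns to the reference words
# of two training sentences (made-up numbers, for illustration only).
toy_corpus = [
    [0.5, 0.25],        # sentence 1: two target words
    [0.5, 0.5, 0.25],   # sentence 2: three target words
]

objective = mle_objective(toy_corpus)
print(objective)
```

Training with MLE raises this value by increasing the probability of each reference word given the reference prefix, which is why the loss is described as word-level on the next slide.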
11 Drawbacks: word-level loss function (Ranzato et al., 2015)
12 Drawbacks: word-level loss function; exposure bias. During training, target words are generated conditioned on the training data; during testing, they are generated conditioned on the model's own predictions (Ranzato et al., 2015)
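Exposure bias can be illustrated with a toy decoder. The sketch below is not the RNNSearch model: it assumes a hypothetical lookup table that predicts the next word from the previous word only, with one deliberate mistake, to show how a single error stays local under training-style decoding but propagates under test-style decoding.

```python
# Hypothetical next-word table standing in for a trained decoder.
# It contains one mistake: after "held" it predicts "the" instead of "a".
NEXT_WORD = {
    "<s>": "bush", "bush": "held", "held": "the", "a": "talk",
    "the": "meeting", "talk": "with", "meeting": "</s>",
    "with": "sharon", "sharon": "</s>",
}

reference = ["bush", "held", "a", "talk", "with", "sharon"]


def decode_teacher_forced(reference):
    """Training-style decoding: each step is conditioned on the gold previous word."""
    inputs = ["<s>"] + reference[:-1]
    return [NEXT_WORD[word] for word in inputs]


def decode_free_running(max_len):
    """Test-style decoding: each step is conditioned on the model's own previous output."""
    output, prev = [], "<s>"
    for _ in range(max_len):
        word = NEXT_WORD.get(prev, "</s>")
        if word == "</s>":
            break
        output.append(word)
        prev = word
    return output


teacher_forced = decode_teacher_forced(reference)
free_running = decode_free_running(max_len=len(reference))
print(teacher_forced)
print(free_running)
```

With gold prefixes the mistake affects one position only; with predicted prefixes the wrong word becomes the context for every later step, a situation the model never encountered during MLE training.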
13 Outline: Introduction to Neural MT; Maximum Likelihood Estimation; Minimum Risk Training; Experiments; Conclusion
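14 Minimum Risk Training. MRT aims to minimize the expected loss on the training data (Och, 2003; Smith and Eisner, 2006; He and Deng, 2012).
training data: $\{\langle x^{(s)}, y^{(s)} \rangle\}_{s=1}^{S}$
objective: $\mathcal{J}(\theta) = \sum_{s=1}^{S} \sum_{y \in \mathcal{Y}(x^{(s)})} P(y \mid x^{(s)}; \theta) \, \Delta(y, y^{(s)}) = \sum_{s=1}^{S} \mathbb{E}_{y \mid x^{(s)}; \theta} \left[ \Delta(y, y^{(s)}) \right]$
optimization: $\hat{\theta}_{\mathrm{MRT}} = \operatorname{argmin}_{\theta} \{ \mathcal{J}(\theta) \}$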
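The risk on slide 14 is a probability-weighted average of the loss over candidate translations. A minimal sketch for a single source sentence follows, assuming a candidate set small enough to enumerate; the probabilities and loss values are hypothetical numbers chosen for illustration.

```python
def expected_risk(candidates):
    """Expected loss for one source sentence: sum over y of P(y|x) * Delta(y, y_ref)."""
    return sum(prob * loss for prob, loss in candidates)


# Each pair is (P(y | x; theta), Delta(y, y_ref)) for one candidate translation.
# Hypothetical values: the same three candidates under two parameter settings.
before_training = [(0.2, 0.0), (0.5, 0.4), (0.3, 0.9)]
after_training = [(0.6, 0.0), (0.3, 0.4), (0.1, 0.9)]

risk_before = expected_risk(before_training)
risk_after = expected_risk(after_training)
print(risk_before, risk_after)
```

Moving probability mass toward the candidate with the lowest loss reduces the risk, which is the behaviour the argmin on the slide asks for. The loss is applied to whole sentences, so it does not need to decompose over words.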
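15 Challenge. It is intractable to calculate the gradient (Och, 2003; Smith and Eisner, 2006; He and Deng, 2012):
$\frac{\partial \mathcal{J}(\theta)}{\partial \theta_i} = \sum_{s=1}^{S} \mathbb{E}_{y \mid x^{(s)}; \theta} \left[ \Delta(y, y^{(s)}) \times \sum_{n=1}^{N^{(s)}} \frac{\partial P(y_n \mid x^{(s)}, y_{<n}; \theta) / \partial \theta_i}{P(y_n \mid x^{(s)}, y_{<n}; \theta)} \right]$
The search space is exponential, and the loss function is usually non-decomposable, so it is hard to design efficient dynamic programming (DP) algorithms.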
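The size of the search space on slide 15 can be made concrete with a short count. The sketch below assumes every sequence of up to `max_len` words over the vocabulary is a candidate translation; the 30K vocabulary matches the size reported later in the experiments, while the length limit of 10 is an arbitrary choice for illustration.

```python
def num_candidates(vocab_size, max_len):
    """Number of word sequences of length 1..max_len over a vocabulary."""
    return sum(vocab_size ** n for n in range(1, max_len + 1))


tiny = num_candidates(vocab_size=2, max_len=3)          # 2 + 4 + 8
realistic = num_candidates(vocab_size=30000, max_len=10)
print(tiny)
print(len(str(realistic)), "digits")
```

Even with sentences capped at ten words the count has dozens of digits, so the expectation in the gradient cannot be computed by enumeration.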
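16 Approximation. We approximate the true distribution with a sampled subspace $\mathcal{S}(x^{(s)}) \subset \mathcal{Y}(x^{(s)})$ (Och, 2003):
$\tilde{\mathcal{J}}(\theta) = \sum_{s=1}^{S} \sum_{y \in \mathcal{S}(x^{(s)})} \frac{P(y \mid x^{(s)}; \theta)^{\alpha}}{\sum_{y' \in \mathcal{S}(x^{(s)})} P(y' \mid x^{(s)}; \theta)^{\alpha}} \, \Delta(y, y^{(s)})$
$= \sum_{s=1}^{S} \sum_{y \in \mathcal{S}(x^{(s)})} Q(y \mid x^{(s)}; \theta, \alpha) \, \Delta(y, y^{(s)})$
$= \sum_{s=1}^{S} \mathbb{E}_{y \mid x^{(s)}; \theta, \alpha} \left[ \Delta(y, y^{(s)}) \right]$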
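The sampled approximation on slide 16 renormalises the model probabilities over a small candidate subset. The sketch below assumes that the hyperparameter α (named on slide 21) enters as an exponent on the model probabilities before renormalisation; the candidate log-probabilities and losses are hypothetical numbers.

```python
import math


def q_distribution(log_probs, alpha):
    """Renormalise P(y|x)^alpha over the sampled subset (computed in log space)."""
    scaled = [alpha * lp for lp in log_probs]
    shift = max(scaled)                      # subtract the max for numerical stability
    weights = [math.exp(s - shift) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def sampled_risk(log_probs, losses, alpha):
    """Expected loss under Q for one source sentence."""
    q = q_distribution(log_probs, alpha)
    return sum(qi * li for qi, li in zip(q, losses))


# Three sampled candidates: model log-probabilities and sentence-level losses.
log_probs = [math.log(0.5), math.log(0.25), math.log(0.25)]
losses = [0.2, 0.5, 0.8]

risk_flat = sampled_risk(log_probs, losses, alpha=0.0)   # Q is uniform
risk_model = sampled_risk(log_probs, losses, alpha=1.0)  # Q equals renormalised P
risk_sharp = sampled_risk(log_probs, losses, alpha=2.0)  # Q is more peaked
print(risk_flat, risk_model, risk_sharp)
```

Under this assumption a larger α concentrates Q on the candidates the model already prefers, and α = 0 ignores the model probabilities entirely.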
17 Advantages of MRT: (1) directly optimizes with respect to evaluation metrics, giving better correlation with the final objective; (2) allows arbitrary loss functions, both decomposable and non-decomposable; (3) allows arbitrary end-to-end architectures, covering any neural MT model and other NLP tasks
18 Outline: Introduction to Neural MT; Maximum Likelihood Estimation; Minimum Risk Training; Experiments; Conclusion
19 Setup. Language pairs: ZH-EN, 2.56M sentence pairs (67.5M + 74.8M words); EN-FR, 12M sentence pairs (348M + 304M words); EN-DE, 4M sentence pairs (91M + 87M words). Evaluation: BLEU, TER, subjective evaluation
20 Effect of Loss Functions. Table: effect of loss functions on the Chinese-English validation set. Columns: criterion, loss, BLEU, TER, NIST. Rows: MLE (loss: N/A); MRT with sBLEU; MRT with sTER; MRT with sNIST
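The sBLEU loss compared on slide 20 is a sentence-level variant of BLEU. The sketch below makes two assumptions that the slides do not state: add-one smoothing of the n-gram precisions for n > 1, and a loss defined as one minus the smoothed score. It shows the general shape of such a loss rather than the exact function used in the experiments.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Smoothed sentence-level BLEU in [0, 1] (add-one smoothing for n > 1)."""
    if not candidate:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand, ref = ngrams(candidate, n), ngrams(reference, n)
        matches = sum(min(count, ref[gram]) for gram, count in cand.items())
        total = sum(cand.values())
        if n > 1:                              # assumed smoothing scheme
            matches, total = matches + 1, total + 1
        if matches == 0:                       # no unigram overlap at all
            return 0.0
        log_precision_sum += math.log(matches / total)
    brevity = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return brevity * math.exp(log_precision_sum / max_n)


def sbleu_loss(candidate, reference):
    """Assumed loss: 1 - smoothed sentence BLEU, so lower is better."""
    return 1.0 - sentence_bleu(candidate, reference)


reference = "bush held a talk with sharon".split()
exact = sbleu_loss(reference, reference)
close = sbleu_loss("bush held a meeting with sharon".split(), reference)
unrelated = sbleu_loss("completely different words".split(), reference)
print(exact, close, unrelated)
```

Because the score depends on n-grams that span several words and on a sentence-level length penalty, it does not decompose into per-word terms, which is the kind of loss MRT is designed to accept.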
21 Effect of α (figure: effect of α on the Chinese-English validation set)
22 Chinese-English Translation (figure: case-insensitive BLEU of Moses, RNNSearch+MLE and RNNSearch+MRT on NIST06 (dev), NIST02, NIST03, NIST04, NIST05 and NIST08). RNNSearch+MRT compared to Moses: up to +8.6 points; compared to MLE: up to +7.2 points
23 Chinese-English Translation (figure: case-insensitive TER of Moses, RNNSearch+MLE and RNNSearch+MRT on NIST06 (dev), NIST02, NIST03, NIST04, NIST05 and NIST08). RNNSearch+MRT compared to Moses: up to [...] points; compared to MLE: up to -8.3 points
24 Subjective Evaluation (figure: MLE vs. MRT, with categories worse, equal, better). The two human evaluators made close judgements: around 54% of MLE translations are worse than MRT, 23% are equal, and 23% are better.
25 Example. Reference: the u.s. delegation includes a china expert from stanford university, two senate foreign policy aides and a former state department official in charge of dealing with pyongyang authorities
26 Example. Moses: the united states to members of the delegation include representatives from the stanford university, a chinese expert, two assistant senate foreign policy and a responsible for dealing with pyongyang before the officials of the state council.
27 Example. RNNSearch (MLE): the us delegation comprises a chinese expert from stanford university, a chinese foreign office assistant policy assistant and a former official who is responsible for dealing with the pyongyang authorities.
28 Example. RNNSearch (MRT): the us delegation included a chinese expert from the stanford university, two senate foreign policy assistants, and a former state department official who had dealings with the pyongyang authorities.
29 English-French Translation
System | Architecture | Training | Vocab | BLEU
Bahdanau et al. (2015) | gated RNN with search | MLE | 30K | 28.5
Jean et al. (2015) | gated RNN with search | MLE | 30K | 30.0
Jean et al. (2015) | gated RNN with search + PosUnk | MLE | 30K | 33.1
Sutskever et al. (2014) | LSTM with 4 layers | MLE | 80K | 30.6
Luong et al. (2015) | LSTM with 4 layers | MLE | 40K | 29.5
Luong et al. (2015) | LSTM with 4 layers + PosUnk | MLE | 40K | 31.8
Luong et al. (2015) | LSTM with 6 layers | MLE | 40K | 30.4
Luong et al. (2015) | LSTM with 6 layers + PosUnk | MLE | 40K | 32.7
this work | gated RNN with search | MLE | 30K | 29.9
this work | gated RNN with search | MRT | 30K | 31.3
this work | gated RNN with search + PosUnk | MRT | 30K | 34.2
Dev set: news-test 2012 & 2013. Test set: news-test 2014. Evaluation: case-sensitive BLEU.
30 English-German Translation
System | Architecture | Training | BLEU
Jean et al. (2015) | gated RNN with search | MLE | 16.5
Jean et al. (2015) | gated RNN with search + PosUnk | MLE | 19.0
Jean et al. (2015) | gated RNN with search + LV + PosUnk | MLE | 19.4
Luong et al. (2015b) | LSTM w/ 4 layers + dropout + local att. + PosUnk | MLE | 20.9
this work | gated RNN with search | MLE | 16.5
this work | gated RNN with search | MRT | 18.0
this work | gated RNN with search + PosUnk | MRT | 20.5
Dev set: news-test 2012 & 2013. Test set: news-test 2014. Evaluation: case-sensitive BLEU.
31 Outline: Introduction to Neural MT; Maximum Likelihood Estimation; Minimum Risk Training; Experiments; Conclusion
32 Conclusion: neural MT has become increasingly popular over the past two years; conventional maximum likelihood estimation (MLE) suffers from several drawbacks; minimum risk training (MRT) significantly improves NMT over MLE; MRT can be applied to other end-to-end architectures for NLP tasks
33 Thank you!