Image Representations and New Domains in Neural Image Captioning. Jack Hessel, Nic Saava, Mike Wilber
|
|
- Phyllis Barker
- 6 years ago
- Views:
Transcription
1 Image Representations and New Domains in Neural Image Captioning Jack Hessel, Nic Saava, Mike Wilber
2 1 The caption generation problem...? A young girl runs through a grassy field.
3 1? A you throug
4 1 Image Model? Language Model A you throug
5 1? Image Model? Language Model A you throug
6 2 The world of neural caption generation models... [Karpathy and Li 2014] [Mao et al. 2014] [Vinyals et al. 2014] [Kiros et al. 2014] [Donahue et al. 2014] [Fang et al. 2014]
7 3 [Vinyals et al. 2014]
8 4 Neural Image Caption (Vinyal s et. al. 2014)? A young girl runs through a grassy field.
9 4 Neural Image Caption (Vinyal s et. al. 2014) A young girl runs through a grassy field.
10 4 Neural Image Caption (Vinyal s et. al. 2014) A young girl runs through a grassy field.
11 4 Neural Image Caption (Vinyal s et. al. 2014) A young girl runs through a grassy field. Convolutional Neural Network
12 4 Neural Image Caption (Vinyal s et. al. 2014) Fixed Representation A young girl runs through a grassy field. Fixed Convolutional Neural Network + Pretrained
13 4 Neural Image Caption (Vinyal s et. al. 2014) Fixed Representation A young girl A young girl runs through a grassy field. * Fixed Convolutional Neural Network + Pretrained A young
14 4 Neural Image Caption (Vinyal s et. al. 2014) Fixed Representation A young girl A young girl runs through a grassy field. * Fixed Convolutional Neural Network + Pretrained A young Long Short Term Memory Recurrent Neural Network [Hochreiter and Schmidhuber 1997]
15 5 Sutskever et al The meaning of life is the tradition of the ancient human reproduction: it is less favorable to the good boy for when to remove her bigger. In the show s agreement unanimously resurfaced. The wild pasteured with consistent street forests were incorporated by the 15th century BE...
16 5 Sutskever et al The meaning of life is the tradition of the ancient human reproduction: it is less favorable to the good boy for when to remove her bigger. In the show s agreement unanimously resurfaced. The wild pasteured with consistent street forests were incorporated by the 15th century BE...
17 5 Sutskever et al The meaning of life is the tradition of the ancient human reproduction: it is less favorable to the good boy for when to remove her bigger. In the show s agreement unanimously resurfaced. The wild pasteured with consistent street forests were incorporated by the 15th century BE...
18 6
19 6
20 6 Higher Quality Representations
21 6 Higher Quality Representations
22 6 Higher Quality Representations
23 6 Higher Quality Representations?? Higher Quality Captions??
24 7 Caption Quality Plausible outcomes... Higher Quality Representations
25 7 Caption Quality Plausible outcomes... Higher Quality Representations
26 7 Caption Quality Plausible outcomes... Higher Quality Representations
27 7 Caption Quality Plausible outcomes... Higher Quality Representations
28 8 Classification Accuracy Training Iterations
29 8 Classification Accuracy Training Iterations
30 8 Classification Accuracy Training Iterations
31 8 Less Blur Classification Accuracy Training Iterations
32 8 Less Blur Classification Accuracy Training Iterations
33 9 Directly learned: Flickr8k [Hodosh et al. 2013]
34 10
35 10
36 10
37 10
38 11 Malaysian Spicy Noodles 66K Image/recipe pairs, courtesy Yummly.com
39 12 Is this really a new language task?
40 12 Is this really a new language task?
41 12 Is this really a new language task? Less variety overall
42 12 Is this really a new language task? malaysian spicy noodles chicken curry with lime cheesy pizza bagels with mushrooms... Less variety overall
43 12 Is this really a new language task? malaysian spicy noodles chicken curry with lime cheesy pizza bagels with mushrooms... Less variety overall Shorter captions
44 13 Fixed Representation Fixed Convolutional Neural Network + Pretrained
45 13 =/=
46 14 Data from [Bossard et al. 2014] 101K Labeled Food Images in 101 Classes
47 Data from [Bossard et al. 2014] K Labeled Food Images in 101 Classes [Krizhevsky et al. 2012] 56.4% Rank-1 Accuracy
48 Data from [Bossard et al. 2014] K Labeled Food Images in 101 Classes Transfer Learning [Krizhevsky et al. 2012] 56.4% Rank-1 Accuracy
49 Data from [Bossard et al. 2014] K Labeled Food Images in 101 Classes Transfer Learning [Krizhevsky et al. 2012] 56.4% Rank-1 Accuracy 66.8% Rank-1 Accuracy
50 Data from [Bossard et al. 2014] K Labeled Food Images in 101 Classes Transfer Learning [Krizhevsky et al. 2012] 56.4% Rank-1 Accuracy [Krizhevsky et al. 2012] 66.8% Rank-1 Accuracy
51 Data from [Bossard et al. 2014] K Labeled Food Images in 101 Classes Transfer Learning [Krizhevsky et al. 2012] 56.4% Rank-1 Accuracy [Bossard et al. 2014] [Krizhevsky et al. 2012] 66.8% Rank-1 Accuracy
52 15 Transfer learned: Yummly
53 16 Caption Quality Plausible outcomes... Higher Quality Representations
54 16 Caption Quality Plausible outcomes... Higher Quality Representations
55 17 Implications? Image Model? Language Model
56 17 Implications Mostly Coarse-Grained Information Image Model? Language Model
57 17 Implications Mostly Coarse-Grained Information Image Model? Language Model [Fang et al. 2014]
58 18 Towards Fine-Grained Captioning
59 18 Towards Fine-Grained Captioning A young girl
60 18 Towards Fine-Grained Captioning A young girl
61 18 Towards Fine-Grained Captioning A young girl [Lin et al. 2014]
62 18 Towards Fine-Grained Captioning A young girl [Lin et al. 2014] Better Loss Functions!
63 19 A few more experiments in the paper...
64 19 A few more experiments in the paper... VGGNet [Simonyan and Zisserman 2014]
65 19 A few more experiments in the paper... VGGNet vs. [Simonyan and Zisserman 2014]
66 20 Acknowledgements Lillian Lee David Mimno Jason Yosinski Serge Belongie Xanda Schofield Gregory Druck Abby Lewis CS 6670 Students Serge David Jason Xanda Gregory
67 Image Representations and New Domains in Neural Image Captioning Jack Hessel, Nic Saava, Mike Wilber
68 New domain advice: one caption per image?
69 AlexNet vs VGG-Net
arxiv: v1 [stat.ml] 23 Jan 2017
Learning what to look in chest X-rays with a recurrent visual attention model arxiv:1701.06452v1 [stat.ml] 23 Jan 2017 Petros-Pavlos Ypsilantis Department of Biomedical Engineering King s College London
More informationAttention Correctness in Neural Image Captioning
Attention Correctness in Neural Image Captioning Chenxi Liu 1 Junhua Mao 2 Fei Sha 2,3 Alan Yuille 1,2 Johns Hopkins University 1 University of California, Los Angeles 2 University of Southern California
More informationTranslating Videos to Natural Language Using Deep Recurrent Neural Networks
Translating Videos to Natural Language Using Deep Recurrent Neural Networks Subhashini Venugopalan UT Austin Huijuan Xu UMass. Lowell Jeff Donahue UC Berkeley Marcus Rohrbach UC Berkeley Subhashini Venugopalan
More informationSequential Predictions Recurrent Neural Networks
CS 2770: Computer Vision Sequential Predictions Recurrent Neural Networks Prof. Adriana Kovashka University of Pittsburgh March 28, 2017 One Motivation: Descriptive Text for Images It was an arresting
More informationMulti-attention Guided Activation Propagation in CNNs
Multi-attention Guided Activation Propagation in CNNs Xiangteng He and Yuxin Peng (B) Institute of Computer Science and Technology, Peking University, Beijing, China pengyuxin@pku.edu.cn Abstract. CNNs
More informationRecurrent Neural Networks
CS 2750: Machine Learning Recurrent Neural Networks Prof. Adriana Kovashka University of Pittsburgh March 14, 2017 One Motivation: Descriptive Text for Images It was an arresting face, pointed of chin,
More informationLOOK, LISTEN, AND DECODE: MULTIMODAL SPEECH RECOGNITION WITH IMAGES. Felix Sun, David Harwath, and James Glass
LOOK, LISTEN, AND DECODE: MULTIMODAL SPEECH RECOGNITION WITH IMAGES Felix Sun, David Harwath, and James Glass MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, USA {felixsun,
More informationQuantifying Radiographic Knee Osteoarthritis Severity using Deep Convolutional Neural Networks
Quantifying Radiographic Knee Osteoarthritis Severity using Deep Convolutional Neural Networks Joseph Antony, Kevin McGuinness, Noel E O Connor, Kieran Moran Insight Centre for Data Analytics, Dublin City
More informationAutomated diagnosis of pneumothorax using an ensemble of convolutional neural networks with multi-sized chest radiography images
Automated diagnosis of pneumothorax using an ensemble of convolutional neural networks with multi-sized chest radiography images Tae Joon Jun, Dohyeun Kim, and Daeyoung Kim School of Computing, KAIST,
More informationarxiv: v2 [cs.cv] 22 Mar 2018
Deep saliency: What is learnt by a deep network about saliency? Sen He 1 Nicolas Pugeault 1 arxiv:1801.04261v2 [cs.cv] 22 Mar 2018 Abstract Deep convolutional neural networks have achieved impressive performance
More informationChittron: An Automatic Bangla Image Captioning System
Chittron: An Automatic Bangla Image Captioning System Motiur Rahman 1, Nabeel Mohammed 2, Nafees Mansoor 3 and Sifat Momen 4 1,3 Department of Computer Science and Engineering, University of Liberal Arts
More informationA convolutional neural network to classify American Sign Language fingerspelling from depth and colour images
A convolutional neural network to classify American Sign Language fingerspelling from depth and colour images Ameen, SA and Vadera, S http://dx.doi.org/10.1111/exsy.12197 Title Authors Type URL A convolutional
More informationEfficient Deep Model Selection
Efficient Deep Model Selection Jose Alvarez Researcher Data61, CSIRO, Australia GTC, May 9 th 2017 www.josemalvarez.net conv1 conv2 conv3 conv4 conv5 conv6 conv7 conv8 softmax prediction???????? Num Classes
More informationVector Learning for Cross Domain Representations
Vector Learning for Cross Domain Representations Shagan Sah, Chi Zhang, Thang Nguyen, Dheeraj Kumar Peri, Ameya Shringi, Raymond Ptucha Rochester Institute of Technology, Rochester, NY 14623, USA arxiv:1809.10312v1
More informationImage Captioning using Reinforcement Learning. Presentation by: Samarth Gupta
Image Captioning using Reinforcement Learning Presentation by: Samarth Gupta 1 Introduction Summary Supervised Models Image captioning as RL problem Actor Critic Architecture Policy Gradient architecture
More informationCSE Introduction to High-Perfomance Deep Learning ImageNet & VGG. Jihyung Kil
CSE 5194.01 - Introduction to High-Perfomance Deep Learning ImageNet & VGG Jihyung Kil ImageNet Classification with Deep Convolutional Neural Networks Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton,
More informationAutomatic Arabic Image Captioning using RNN-LSTM-Based Language Model and CNN
Automatic Arabic Image Captioning using RNN-LSTM-Based Language Model and CNN Huda A. Al-muzaini, Tasniem N. Al-yahya, Hafida Benhidour Dept. of Computer Science College of Computer and Information Sciences
More informationarxiv: v1 [cs.cv] 19 Jan 2018
Describing Semantic Representations of Brain Activity Evoked by Visual Stimuli arxiv:1802.02210v1 [cs.cv] 19 Jan 2018 Eri Matsuo Ichiro Kobayashi Ochanomizu University 2-1-1 Ohtsuka, Bunkyo-ku, Tokyo 112-8610,
More informationVision, Language, Reasoning
CS 2770: Computer Vision Vision, Language, Reasoning Prof. Adriana Kovashka University of Pittsburgh March 5, 2019 Plan for this lecture Image captioning Tool: Recurrent neural networks Captioning for
More informationB657: Final Project Report Holistically-Nested Edge Detection
B657: Final roject Report Holistically-Nested Edge Detection Mingze Xu & Hanfei Mei May 4, 2016 Abstract Holistically-Nested Edge Detection (HED), which is a novel edge detection method based on fully
More informationarxiv: v1 [cs.cv] 21 Nov 2018
Rethinking ImageNet Pre-training Kaiming He Ross Girshick Piotr Dollár Facebook AI Research (FAIR) arxiv:1811.8883v1 [cs.cv] 21 Nov 218 Abstract We report competitive results on object detection and instance
More informationMosquito Larva Classification based on a Convolution Neural Network
320 Int'l Conf. Par. and Dist. Proc. Tech. and Appl. PDPTA'18 Mosquito Larva Classification based on a Convolution Neural Network Alejandra Sanchez Ortiz 1, Mariko Nakano Miyatake 1, Henrik Tünnermann
More informationarxiv: v3 [cs.cv] 11 Aug 2017 Abstract
human G-GAN G-MLE human G-GAN G-MLE Towards Diverse and Natural Image Descriptions via a Conditional GAN Bo Dai 1 Sanja Fidler 23 Raquel Urtasun 234 Dahua Lin 1 1 Department of Information Engineering,
More informationarxiv: v1 [cs.cv] 12 Dec 2016
Text-guided Attention Model for Image Captioning Jonghwan Mun, Minsu Cho, Bohyung Han Department of Computer Science and Engineering, POSTECH, Korea {choco1916, mscho, bhhan}@postech.ac.kr arxiv:1612.03557v1
More informationDeepDiary: Automatically Captioning Lifelogging Image Streams
DeepDiary: Automatically Captioning Lifelogging Image Streams Chenyou Fan David J. Crandall School of Informatics and Computing Indiana University Bloomington, Indiana USA {fan6,djcran}@indiana.edu Abstract.
More informationSmaller, faster, deeper: University of Edinburgh MT submittion to WMT 2017
Smaller, faster, deeper: University of Edinburgh MT submittion to WMT 2017 Rico Sennrich, Alexandra Birch, Anna Currey, Ulrich Germann, Barry Haddow, Kenneth Heafield, Antonio Valerio Miceli Barone, Philip
More informationContext aware decision support in neurosurgical oncology based on an efficient classification of endomicroscopic data
International Journal of Computer Assisted Radiology and Surgery (2018) 13:1187 1199 https://doi.org/10.1007/s11548-018-1806-7 ORIGINAL ARTICLE Context aware decision support in neurosurgical oncology
More informationNetworks and Hierarchical Processing: Object Recognition in Human and Computer Vision
Networks and Hierarchical Processing: Object Recognition in Human and Computer Vision Guest&Lecture:&Marius&Cătălin&Iordan&& CS&131&8&Computer&Vision:&Foundations&and&Applications& 01&December&2014 1.
More informationKeyword-driven Image Captioning via Context-dependent Bilateral LSTM
Keyword-driven Image Captioning via Context-dependent Bilateral LSTM Xiaodan Zhang 1,2, Shuqiang Jiang 2, Qixiang Ye 2, Jianbin Jiao 2, Rynson W.H. Lau 1 1 City University of Hong Kong 2 University of
More informationarxiv: v3 [cs.cv] 21 Mar 2017
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization Ramprasaath R. Selvaraju 1 Michael Cogswell 2 Abhishek Das 2 Ramakrishna Vedantam 1 Devi Parikh 2 Dhruv Batra 2 1 Virginia
More informationarxiv: v1 [cs.cv] 28 Mar 2016
arxiv:1603.08507v1 [cs.cv] 28 Mar 2016 Generating Visual Explanations Lisa Anne Hendricks 1 Zeynep Akata 2 Marcus Rohrbach 1,3 Jeff Donahue 1 Bernt Schiele 2 Trevor Darrell 1 1 UC Berkeley EECS, CA, United
More informationGuided Open Vocabulary Image Captioning with Constrained Beam Search
Guided Open Vocabulary Image Captioning with Constrained Beam Search Peter Anderson 1, Basura Fernando 1, Mark Johnson 2, Stephen Gould 1 1 The Australian National University, Canberra, Australia firstname.lastname@anu.edu.au
More informationMultiple Granularity Descriptors for Fine-grained Categorization
Multiple Granularity Descriptors for Fine-grained Categorization Dequan Wang 1, Zhiqiang Shen 1, Jie Shao 1, Wei Zhang 1, Xiangyang Xue 1 and Zheng Zhang 2 1 Shanghai Key Laboratory of Intelligent Information
More informationarxiv: v1 [cs.cl] 13 Apr 2017
Room for improvement in automatic image description: an error analysis Emiel van Miltenburg Vrije Universiteit Amsterdam emiel.van.miltenburg@vu.nl Desmond Elliott ILLC, Universiteit van Amsterdam d.elliott@uva.nl
More informationGenerating Visual Explanations
Generating Visual Explanations Lisa Anne Hendricks 1 Zeynep Akata 2 Marcus Rohrbach 1,3 Jeff Donahue 1 Bernt Schiele 2 Trevor Darrell 1 1 UC Berkeley EECS, CA, United States 2 Max Planck Institute for
More informationarxiv: v1 [cs.cv] 25 Sep 2017
Fine-grained Discriminative Localization via Saliency-guided Faster R-CNN Xiangteng He, Yuxin Peng* and Junjie Zhao Institute of Computer Science and Technology, Peking University Beijing, China pengyuxin@pku.edu.cn
More informationGuided Open Vocabulary Image Captioning with Constrained Beam Search
Guided Open Vocabulary Image Captioning with Constrained Beam Search Peter Anderson 1, Basura Fernando 1, Mark Johnson 2, Stephen Gould 1 1 The Australian National University, Canberra, Australia firstname.lastname@anu.edu.au
More informationarxiv: v2 [cs.cv] 19 Jul 2017
Guided Open Vocabulary Image Captioning with Constrained Beam Search Peter Anderson 1, Basura Fernando 1, Mark Johnson 2, Stephen Gould 1 1 The Australian National University, Canberra, Australia firstname.lastname@anu.edu.au
More informationarxiv: v2 [cs.cv] 10 Apr 2017
A Hierarchical Approach for Generating Descriptive Image Paragraphs Jonathan Krause Justin Johnson Ranjay Krishna Li Fei-Fei Stanford University {jkrause,jcjohns,ranjaykrishna,feifeili}@cs.stanford.edu
More informationGrad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization Ramprasaath R. Selvaraju 1 Michael Cogswell 1 Abhishek Das 1 Ramakrishna Vedantam 1 Devi Parikh 1,2 Dhruv Batra 1,2 1 Georgia
More informationDiscriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries
Discriminative Bimodal Networks for Visual Localization and Detection with Natural Language Queries Yuting Zhang, Luyao Yuan, Yijie Guo, Zhiyuan He, I-An Huang, Honglak Lee University of Michigan, Ann
More informationExpert identification of visual primitives used by CNNs during mammogram classification
Expert identification of visual primitives used by CNNs during mammogram classification Jimmy Wu a, Diondra Peck b, Scott Hsieh c, Vandana Dialani, MD d, Constance D. Lehman, MD e, Bolei Zhou a, Vasilis
More informationDeep Supervision for Pancreatic Cyst Segmentation in Abdominal CT Scans
Deep Supervision for Pancreatic Cyst Segmentation in Abdominal CT Scans Yuyin Zhou 1, Lingxi Xie 2( ), Elliot K. Fishman 3, Alan L. Yuille 4 1,2,4 The Johns Hopkins University, Baltimore, MD 21218, USA
More informationarxiv: v1 [cs.cv] 13 Mar 2018
RESOURCE AWARE DESIGN OF A DEEP CONVOLUTIONAL-RECURRENT NEURAL NETWORK FOR SPEECH RECOGNITION THROUGH AUDIO-VISUAL SENSOR FUSION Matthijs Van keirsbilck Bert Moons Marian Verhelst MICAS, Department of
More informationTHE human visual system has an exceptional ability of
SUBMITTED MANUSCRIPT TO IEEE TCDS 1 DeepFeat: A Bottom Up and Top Down Saliency Model Based on Deep Features of Convolutional Neural Nets Ali Mahdi, Student Member, IEEE, and Jun Qin, Member, IEEE arxiv:1709.02495v1
More informationObject Detectors Emerge in Deep Scene CNNs
Object Detectors Emerge in Deep Scene CNNs Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, Antonio Torralba Presented By: Collin McCarthy Goal: Understand how objects are represented in CNNs Are
More informationLearning a Discriminative Filter Bank within a CNN for Fine-grained Recognition
Learning a Discriminative Filter Bank within a CNN for Fine-grained Recognition Yaming Wang 1, Vlad I. Morariu 2, Larry S. Davis 1 1 University of Maryland, College Park 2 Adobe Research {wym, lsd}@umiacs.umd.edu
More informationLearned Region Sparsity and Diversity Also Predict Visual Attention
Learned Region Sparsity and Diversity Also Predict Visual Attention Zijun Wei 1, Hossein Adeli 2, Gregory Zelinsky 1,2, Minh Hoai 1, Dimitris Samaras 1 1. Department of Computer Science 2. Department of
More informationDeep Retinal Image Understanding
Deep Retinal Image Understanding Kevis-Kokitsi Maninis, Jordi Pont-Tuset, Pablo Arbeláez 2, and Luc Van Gool,3 ETH Zürich 2 Universidad de los Andes 3 KU Leuven Abstract. This paper presents Deep Retinal
More informationKnowledge-based Recurrent Attentive Neural Network for Small Object Detection
Knowledge-based Recurrent Attentive Neural Network for Small Object Detection Kai Yi,, Zhiqiang Jian,, Shitao Chen, Yuedong Yang, Nanning Zheng Institute of Artificial Intelligence and Robotics, Xi an
More informationLSTD: A Low-Shot Transfer Detector for Object Detection
LSTD: A Low-Shot Transfer Detector for Object Detection Hao Chen 1,2, Yali Wang 1, Guoyou Wang 2, Yu Qiao 1,3 1 Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, China 2 Huazhong
More informationAggregated Sparse Attention for Steering Angle Prediction
Aggregated Sparse Attention for Steering Angle Prediction Sen He, Dmitry Kangin, Yang Mi and Nicolas Pugeault Department of Computer Sciences,University of Exeter, Exeter, EX4 4QF Email: {sh752, D.Kangin,
More informationMulti-Level Net: a Visual Saliency Prediction Model
Multi-Level Net: a Visual Saliency Prediction Model Marcella Cornia, Lorenzo Baraldi, Giuseppe Serra, Rita Cucchiara Department of Engineering Enzo Ferrari, University of Modena and Reggio Emilia {name.surname}@unimore.it
More informationDeep Learning Human Mind for Automated Visual Classification
Deep Learning Human Mind for Automated Visual Classification C. Spampinato, S. Palazzo, I. Kavasidis, D. Giordano Department of Electrical, Electronics and Computer Engineering - PeRCeiVe Lab Viale Andrea
More informationDeep Learning Human Mind for Automated Visual Classification
Deep Learning Human Mind for Automated Visual Classification C. Spampinato, S. Palazzo, I. Kavasidis, D. Giordano Department of Electrical, Electronics and Computer Engineering - PeRCeiVe Lab Viale Andrea
More informationarxiv: v3 [cs.cv] 11 Apr 2015
ATTENTION FOR FINE-GRAINED CATEGORIZATION Pierre Sermanet, Andrea Frome, Esteban Real Google, Inc. {sermanet,afrome,ereal,}@google.com ABSTRACT arxiv:1412.7054v3 [cs.cv] 11 Apr 2015 This paper presents
More informationA Weighted Combination of Text and Image Classifiers for User Gender Inference
A Weighted Combination of Text and Image Classifiers for User Gender Inference Tomoki Taniguchi, Shigeyuki Sakaki, Ryosuke Shigenaka, Yukihiro Tsuboshita and Tomoko Ohkuma Fuji Xerox Co., Ltd., 6-1, Minatomirai,
More informationVisual semantics: image elements. Symbols Objects People Poses
Visible Partisanship Polmeth XXXIII, Rice University, July 22, 2016 Convolutional Neural Networks for the Analysis of Political Images L. Jason Anastasopoulos ljanastas@uga.edu (University of Georgia,
More informationA Deep Multi-Task Learning Approach to Skin Lesion Classification
The AAAI-17 Joint Workshop on Health Intelligence WS-17-09 A Deep Multi-Task Learning Approach to Skin Lesion Classification Haofu Liao, Jiebo Luo Department of Computer Science, University of Rochester
More informationDeep Neural Networks: A New Framework for Modeling Biological Vision and Brain Information Processing
ANNUAL REVIEWS Further Click here to view this article's online features: Download figures as PPT slides Navigate linked references Download citations Explore related articles Search keywords Annu. Rev.
More informationHow Deep is the Feature Analysis underlying Rapid Visual Categorization?
How Deep is the Feature Analysis underlying Rapid Visual Categorization? Sven Eberhardt Jonah Cader Thomas Serre Department of Cognitive Linguistic & Psychological Sciences Brown Institute for Brain Sciences
More informationarxiv: v2 [cs.cv] 23 Oct 2016
Do They All Look the Same? Deciphering Chinese, Japanese and Koreans by Fine-Grained Deep Learning Yu Wang, Haofu Liao, Yang Feng, Xiangyang Xu, Jiebo Luo* arxiv:1610.01854v2 [cs.cv] 23 Oct 2016 Department
More informationDEEP LEARNING BASED VISION-TO-LANGUAGE APPLICATIONS: CAPTIONING OF PHOTO STREAMS, VIDEOS, AND ONLINE POSTS
SEOUL Oct.7, 2016 DEEP LEARNING BASED VISION-TO-LANGUAGE APPLICATIONS: CAPTIONING OF PHOTO STREAMS, VIDEOS, AND ONLINE POSTS Gunhee Kim Computer Science and Engineering Seoul National University October
More informationarxiv: v1 [cs.cv] 1 Aug 2016
A conference version of this paper is accepted at ECCV 16 for oral presentation Top-down Neural Attention by Excitation Backprop arxiv:1608.00507v1 [cs.cv] 1 Aug 2016 Jianming Zhang 1,2, Zhe Lin 2, Jonathan
More informationConvolutional Neural Networks (CNN)
Convolutional Neural Networks (CNN) Algorithm and Some Applications in Computer Vision Luo Hengliang Institute of Automation June 10, 2014 Luo Hengliang (Institute of Automation) Convolutional Neural Networks
More informationOn the Automatic Generation of Medical Imaging Reports
On the Automatic Generation of Medical Imaging Reports Baoyu Jing * Pengtao Xie * Eric P. Xing Petuum Inc, USA * School of Computer Science, Carnegie Mellon University, USA {baoyu.jing, pengtao.xie, eric.xing}@petuum.com
More informationImage Surveillance Assistant
Image Surveillance Assistant Michael Maynord Computer Science Dept. University of Maryland College Park, MD 20742 maynord@umd.edu Sambit Bhattacharya Dept. of Math & Computer Science Fayetteville State
More informationDEEP convolutional neural networks have gained much
Real-time emotion recognition for gaming using deep convolutional network features Sébastien Ouellet arxiv:8.37v [cs.cv] Aug 2 Abstract The goal of the present study is to explore the application of deep
More informationarxiv: v3 [cs.ne] 6 Jun 2016
Synthesizing the preferred inputs for neurons in neural networks via deep generator networks Anh Nguyen anguyen8@uwyo.edu Alexey Dosovitskiy dosovits@cs.uni-freiburg.de arxiv:1605.09304v3 [cs.ne] 6 Jun
More informationDeep Learning for Computer Vision
Deep Learning for Computer Vision Lecture 12: Time Sequence Data, Recurrent Neural Networks (RNNs), Long Short-Term Memories (s), and Image Captioning Peter Belhumeur Computer Science Columbia University
More informationWhere should saliency models look next?
Where should saliency models look next? Zoya Bylinskii 1, Adrià Recasens 1, Ali Borji 2, Aude Oliva 1, Antonio Torralba 1, and Frédo Durand 1 1 Computer Science and Artificial Intelligence Laboratory Massachusetts
More informationHierarchical Convolutional Features for Visual Tracking
Hierarchical Convolutional Features for Visual Tracking Chao Ma Jia-Bin Huang Xiaokang Yang Ming-Husan Yang SJTU UIUC SJTU UC Merced ICCV 2015 Background Given the initial state (position and scale), estimate
More informationReal Time Image Saliency for Black Box Classifiers
Real Time Image Saliency for Black Box Classifiers Piotr Dabkowski pd437@cam.ac.uk University of Cambridge Yarin Gal yarin.gal@eng.cam.ac.uk University of Cambridge and Alan Turing Institute, London Abstract
More informationConvolutional and LSTM Neural Networks
Convolutional and LSTM Neural Networks Vanessa Jurtz January 11, 2017 Contents Neural networks and GPUs Lasagne Peptide binding to MHC class II molecules Convolutional Neural Networks (CNN) Recurrent and
More informationTop-down Neural Attention by Excitation Backprop
Top-down Neural Attention by Excitation Backprop Jianming Zhang 1,2 Zhe Lin 2 Jonathan Brandt 2 Xiaohui Shen 2 Stan Sclaroff 1 1 Boston University 2 Adobe Research {jmzhang,sclaroff}@bu.edu {zlin,jbrandt,xshen}@adobe.com
More informationarxiv: v2 [cs.cv] 24 Jul 2018
Facial Dynamics Interpreter Network: What are the Important Relations between Local Dynamics for Facial Trait Estimation? Seong Tae Kim and Yong Man Ro* arxiv:1711.10688v2 [cs.cv] 24 Jul 2018 School of
More informationWhen Saliency Meets Sentiment: Understanding How Image Content Invokes Emotion and Sentiment
When Saliency Meets Sentiment: Understanding How Image Content Invokes Emotion and Sentiment Honglin Zheng1, Tianlang Chen2, Jiebo Luo3 Department of Computer Science University of Rochester, Rochester,
More informationSemantic Image Retrieval via Active Grounding of Visual Situations
Semantic Image Retrieval via Active Grounding of Visual Situations Max H. Quinn 1, Erik Conser 1, Jordan M. Witte 1, and Melanie Mitchell 1,2 1 Portland State University 2 Santa Fe Institute Email: quinn.max@gmail.com
More informationImage-Based Estimation of Real Food Size for Accurate Food Calorie Estimation
Image-Based Estimation of Real Food Size for Accurate Food Calorie Estimation Takumi Ege, Yoshikazu Ando, Ryosuke Tanno, Wataru Shimoda and Keiji Yanai Department of Informatics, The University of Electro-Communications,
More informationDeep Compression and EIE: Efficient Inference Engine on Compressed Deep Neural Network
Deep Compression and EIE: Efficient Inference Engine on Deep Neural Network Song Han*, Xingyu Liu, Huizi Mao, Jing Pu, Ardavan Pedram, Mark Horowitz, Bill Dally Stanford University Our Prior Work: Deep
More informationarxiv: v1 [cs.cv] 23 May 2018
Excitation Dropout: Encouraging Plasticity in Deep Neural Networks Andrea Zunino 1 Sarah Adel Bargal 2 Pietro Morerio 1 Jianming Zhang 3 arxiv:1805.09092v1 [cs.cv] 23 May 2018 Stan Sclaroff 2 Vittorio
More informationShu Kong. Department of Computer Science, UC Irvine
Ubiquitous Fine-Grained Computer Vision Shu Kong Department of Computer Science, UC Irvine Outline 1. Problem definition 2. Instantiation 3. Challenge 4. Fine-grained classification with holistic representation
More informationHHS Public Access Author manuscript Med Image Comput Comput Assist Interv. Author manuscript; available in PMC 2018 January 04.
Discriminative Localization in CNNs for Weakly-Supervised Segmentation of Pulmonary Nodules Xinyang Feng 1, Jie Yang 1, Andrew F. Laine 1, and Elsa D. Angelini 1,2 1 Department of Biomedical Engineering,
More informationFine-grained Flower and Fungi Classification at CMP
Fine-grained Flower and Fungi Classification at CMP Milan Šulc 1, Lukáš Picek 2, Jiří Matas 1 1 Visual Recognition Group, Center for Machine Perception, Dept. of Cybernetics, FEE Czech Technical University
More informationU.N. IN ACTION. Release Date: May 2007 Programme No Duration: 4 19 Languages: English, French, Spanish and Russian
1 Release Date: May 2007 Programme No. 1074 Duration: 4 19 Languages: English, French, Spanish and Russian U.N. IN ACTION PROTECTING THE OWNERSHIP OF INDIGENOUS KNOWLEDGE IN INDIA VIDEO FOREST/PLANT AUDIO
More informationImage-Based Food Calorie Estimation Using Knowledge on Food Categories, Ingredients and Cooking Directions
Image-Based Food Calorie Estimation Using Knowledge on Food Categories, Ingredients and Cooking Directions Takumi Ege and Keiji Yanai Department of Informatics, The University of Electro-Communications,
More informationFacial Expression Classification Using Convolutional Neural Network and Support Vector Machine
Facial Expression Classification Using Convolutional Neural Network and Support Vector Machine Valfredo Pilla Jr, André Zanellato, Cristian Bortolini, Humberto R. Gamba and Gustavo Benvenutti Borba Graduate
More informationNeural Baby Talk. Jiasen Lu 1 Jianwei Yang 1 Dhruv Batra 1,2 Devi Parikh 1,2 1 Georgia Institute of Technology 2 Facebook AI Research.
Neural Baby Talk Jiasen Lu 1 Jianwei Yang 1 Dhruv Batra 1,2 Devi Parikh 1,2 1 Georgia Institute of Technology 2 Facebook AI Research {jiasenlu, jw2yang, dbatra, parikh}@gatech.edu Abstract We introduce
More informationarxiv: v1 [q-bio.nc] 27 Nov 2016
Predicting the Category and Attributes of Mental Pictures Using Deep Gaze Pooling arxiv:1611.10162v1 [q-bio.nc] 27 Nov 2016 Hosnieh Sattar 1,2 Andreas Bulling 1 Mario Fritz 2 1 Perceptual User Interfaces
More informationarxiv: v1 [cs.cv] 5 Dec 2014
Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images Anh Nguyen University of Wyoming anguyen8@uwyo.edu Jason Yosinski Cornell University yosinski@cs.cornell.edu
More informationShu Kong. Department of Computer Science, UC Irvine
Ubiquitous Fine-Grained Computer Vision Shu Kong Department of Computer Science, UC Irvine Outline 1. Problem definition 2. Instantiation 3. Challenge and philosophy 4. Fine-grained classification with
More informationSocial Image Captioning: Exploring Visual Attention and User Attention
sensors Article Social Image Captioning: Exploring and User Leiquan Wang 1 ID, Xiaoliang Chu 1, Weishan Zhang 1, Yiwei Wei 1, Weichen Sun 2,3 and Chunlei Wu 1, * 1 College of Computer & Communication Engineering,
More informationarxiv: v2 [cs.cv] 8 Sep 2016
arxiv:1608.07068v2 [cs.cv] 8 Sep 2016 Title Generation for User Generated Videos Kuo-Hao Zeng 1, Tseng-Hung Chen 1, Juan Carlos Niebles 2, Min Sun 1 1 Departmant of Electrical Engineering, National Tsing
More informationThe Goods on Nutrition: Eat to Live
The Goods on Nutrition: Eat to Live 2 Lesson 1: The Goods on Nutrition Vocabulary list Reference Notes alternative n. something available in place of another. anemia n. a blood disorder characterized by
More informationRecognizing American Sign Language Gestures from within Continuous Videos
Recognizing American Sign Language Gestures from within Continuous Videos Yuancheng Ye 1, Yingli Tian 1,2,, Matt Huenerfauth 3, and Jingya Liu 2 1 The Graduate Center, City University of New York, NY,
More informationArtificial Neural Networks
Artificial Neural Networks Torsten Reil torsten.reil@zoo.ox.ac.uk Outline What are Neural Networks? Biological Neural Networks ANN The basics Feed forward net Training Example Voice recognition Applications
More informationarxiv: v2 [cs.cv] 10 Aug 2017
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering arxiv:1707.07998v2 [cs.cv] 10 Aug 2017 Peter Anderson 1, Xiaodong He 2, Chris Buehler 2, Damien Teney 3 Mark Johnson
More informationarxiv: v3 [cs.cv] 12 Feb 2017
Published as a conference paper at ICLR 17 PAYING MORE ATTENTION TO ATTENTION: IMPROVING THE PERFORMANCE OF CONVOLUTIONAL NEURAL NETWORKS VIA ATTENTION TRANSFER Sergey Zagoruyko, Nikos Komodakis Université
More information