The University of Tokyo, NVAIL Partner Yoshitaka Ushiku


1 Recognize, Describe, and Generate: Introduction of Recent Work at MIL The University of Tokyo, NVAIL Partner Yoshitaka Ushiku

2 MIL: Machine Intelligence Laboratory
  Beyond Human Intelligence Based on Cyber-Physical Systems
  Members:
    One Professor (Prof. Harada)
    One Lecturer (me)
    One Assistant Professor
    One Postdoc
    Two Office Administrators
    11 Ph.D. students
    23 Master's students
    8 Bachelor's students
    5 Interns
  Varying research topics: ICCV, CVPR, ECCV, ICML, NIPS, ICASSP, SIGdial, ACM Multimedia, ICME, ICRA, IROS, etc.
  The most important thing: We are hiring!

3 Journalist Robot
  Born in 2006
  Objective: publishing news automatically
    Recognize: objects, people, actions
    Describe: what is happening
    Generate: contents as humans do

4 Outline
  Journalist Robot: ancestor of current work in MIL
  Outline: research originates with this robot
    Recognize
      Basic: Framework for DL, Domain Adaptation
      Classification: Single-modality, Multi-modalities
    Describe
      Image Captioning
      Video Captioning
    Generate
      Image Reconstruction
      Video Generation

5 Recognize

6 MILJS: JavaScript Deep Learning [Hidaka+, ICLR Workshop 2017]

7 MILJS: JavaScript Deep Learning [Hidaka+, ICLR Workshop 2017]
  Support for both learning and inference
  Support for nodes with GPGPUs: currently WebCL is utilized; now working on WebGPU
  Support for nodes w/o GPGPUs
  No requirements to install any software
  Even ResNet with 152 layers can be trained
  Let me show you a preliminary demonstration using MNIST!

8 Asymmetric Tri-training for Domain Adaptation [Saito+, submitted to ICML 2017]
  Unsupervised domain adaptation: a model trained on MNIST; does it work on SVHN?
  Ground-truth labels are associated with the source (MNIST)
  However, there are no labels for the target (SVHN)

9 Asymmetric Tri-training for Domain Adaptation [Saito+, submitted to ICML 2017] Asymmetric Tri-training: pseudo labels for target domain

10 Asymmetric Tri-training for Domain Adaptation [Saito+, submitted to ICML 2017]
  1st round: training on MNIST, then add pseudo labels for easy samples (e.g. "eight", "nine")
  2nd round onward: training on MNIST+α, then add more pseudo labels
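The pseudo-labeling step on this slide could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes that an "easy" target sample is one on which two source-trained classifiers agree and at least one of them is confident. The function name `select_pseudo_labels`, the threshold of 0.9, and the toy probabilities are all hypothetical.

```python
def select_pseudo_labels(probs_a, probs_b, threshold=0.9):
    """Return (sample_index, label) pairs for target samples judged 'easy'.

    probs_a, probs_b: per-sample class-probability lists from two classifiers
    trained on the labeled source data (hypothetical inputs).
    A sample is kept when both classifiers predict the same class and at
    least one of them exceeds `threshold` (an assumed value).
    """
    selected = []
    for i, (pa, pb) in enumerate(zip(probs_a, probs_b)):
        label_a = max(range(len(pa)), key=pa.__getitem__)
        label_b = max(range(len(pb)), key=pb.__getitem__)
        if label_a == label_b and max(pa[label_a], pb[label_b]) > threshold:
            selected.append((i, label_a))
    return selected


# Three unlabeled target samples, three classes (made-up numbers).
probs_a = [[0.95, 0.03, 0.02], [0.40, 0.35, 0.25], [0.10, 0.85, 0.05]]
probs_b = [[0.92, 0.05, 0.03], [0.30, 0.45, 0.25], [0.05, 0.93, 0.02]]
pseudo = select_pseudo_labels(probs_a, probs_b)
print(pseudo)  # samples 0 and 2 are kept; sample 1 is dropped (disagreement)
```

In later rounds the selected pairs would be added to the training set ("MNIST+α") and the selection repeated, so the pool of pseudo-labeled target samples grows.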

11 End-to-end learning for environmental sound classification [Tokozume+, ICASSP 2017]
  Existing methods for speech / sound recognition:
    1. Feature extraction: Fourier transform (log-mel features)
    2. Classification: CNN with the extracted feature map
  Log-mel features are suitable for human speech; but for environmental sounds?

12 End-to-end learning for environmental sound classification [Tokozume+, ICASSP 2017]
  Proposed approach (EnvNet): CNN for both (1) feature map extraction and (2) classification
  (Figure: extracted feature map)
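To make the end-to-end idea concrete, the sketch below applies 1-D convolution filters directly to a raw waveform, so that the feature map comes from filters rather than from a fixed log-mel transform. It is only an illustration under stated assumptions: the filter values, stride, and toy sine-wave input are hypothetical, and EnvNet's actual layer configuration is not given in these slides.

```python
import math


def conv1d(signal, kernel, stride=1):
    """Valid 1-D cross-correlation, the operation most DL frameworks call convolution."""
    n = len(signal) - len(kernel) + 1
    return [
        sum(signal[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(0, n, stride)
    ]


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


# Toy raw waveform: 16 samples of a sine wave (stand-in for an audio clip).
waveform = [math.sin(2 * math.pi * t / 8) for t in range(16)]

# Two hand-written filters; in an end-to-end model these would be learned.
filters = [[1.0, -1.0], [0.5, 0.5]]

# One row per filter: a small "feature map" computed from the raw signal.
feature_map = [relu(conv1d(waveform, k, stride=2)) for k in filters]
print(len(feature_map), len(feature_map[0]))
```

A classifier would then consume `feature_map` in the same way a conventional pipeline consumes a log-mel spectrogram.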

13 End-to-end learning for environmental sound classification [Tokozume+, ICASSP 2017]
  Comparison of accuracy [%] on ESC-50 [Piczak, ACM MM 2015]:
    log-mel feature + CNN [Piczak, MLSP 2015]
    End-to-end CNN (Ours)
    End-to-end CNN & log-mel feature + CNN (Ours)
  EnvNet can extract discriminative features for environmental sounds

14 Visual Question Answering (VQA) [Saito+, ICME 2017]
  Question answering system for:
    Associated image
    Question by natural language
  Q: Is it going to rain soon? Ground Truth A: yes
  Q: Why is there snow on one side of the stream and clear grass on the other? Ground Truth A: shade

15 Visual Question Answering (VQA) [Saito+, ICME 2017]
  VQA = multi-class classification
  (Figure: image → image feature; question → question feature; both → integrated vector → answer)
  Example: Question: What objects are found on the bed? Answer: bed sheets, pillow
  After integrating the image feature and the question feature: usual classification

16 Visual Question Answering [Saito+, ICME 2017]
  Current advancement: improving how to integrate the image feature and the question feature
    Concatenation, e.g. [Antol+, ICCV 2015]
    Summation, e.g. image feature (with attention) + question feature [Xu+Saenko, ECCV 2016]
    Multiplication, e.g. bilinear multiplication [Fukui+, EMNLP 2016]
  This work: DualNet, doing sum, multiply and concatenation
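The three integration operations named on this slide can be sketched on toy vectors as below. This is a hedged illustration only: it assumes both features have already been projected to the same dimension, uses element-wise multiplication as a simple stand-in for bilinear multiplication, and `fuse_all` is a hypothetical way of combining the operations, not DualNet's actual architecture.

```python
def concat(image_feature, question_feature):
    """Concatenation: stack the two vectors end to end."""
    return image_feature + question_feature


def add(image_feature, question_feature):
    """Summation: element-wise sum (requires equal dimensions)."""
    return [a + b for a, b in zip(image_feature, question_feature)]


def multiply(image_feature, question_feature):
    """Multiplication: element-wise product (simplified stand-in for bilinear forms)."""
    return [a * b for a, b in zip(image_feature, question_feature)]


def fuse_all(image_feature, question_feature):
    """Hypothetical combination: concatenate the summed and multiplied vectors."""
    return add(image_feature, question_feature) + multiply(image_feature, question_feature)


image_feature = [1.0, 2.0, 3.0]      # made-up values
question_feature = [0.5, -1.0, 2.0]  # made-up values
integrated = fuse_all(image_feature, question_feature)
print(integrated)
```

The integrated vector would then be passed to an ordinary multi-class classifier over candidate answers.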

17 Visual Question Answering (VQA) [Saito+, ICME 2017]
  VQA Challenge 2016 (in CVPR 2016): won 1st place on abstract images w/o attention mechanism
  Q: What fruit is yellow and brown? A: banana
  Q: How many screens are there? A: 2
  Q: What is the boy playing with? A: teddy bear
  Q: Are there any animals swimming in the pond? A: no

18 Describe

19 Automatic Image Captioning [Ushiku+, ACMMM 2011]

20 Training Dataset, Input Image, Nearest Captions (figure)
  Training dataset captions:
    A small white dog wearing a flannel warmer.
    A white van parked in an empty lot.
    A small gray dog on a leash.
    A white cat rests head on a stone.
    A small white dog standing on a leash.
    A black dog standing in a grassy area.
    White and gray kitten lying on its side.
    Silver car parked on side of road.
    A woman posing on a red scooter.
  Nearest captions for the input image:
    A small white dog wearing a flannel warmer.
    A small gray dog on a leash.
    A black dog standing in a grassy area.
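The retrieval of nearest captions shown on this slide can be sketched as a nearest-neighbor search over image features. This is an assumption-laden illustration: the two-dimensional feature vectors are invented, squared Euclidean distance is an assumed metric, and `nearest_captions` is a hypothetical helper. How the retrieved captions are turned into the final sentence is not covered here.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_captions(query_feature, training_set, k=2):
    """Return the captions of the k training images closest to the query."""
    ranked = sorted(
        training_set,
        key=lambda item: squared_distance(query_feature, item[0]),
    )
    return [caption for _, caption in ranked[:k]]


# (image feature, caption) pairs; the feature values are made up.
training_set = [
    ([0.9, 0.1], "A small white dog wearing a flannel warmer."),
    ([0.1, 0.9], "A white van parked in an empty lot."),
    ([0.8, 0.3], "A small gray dog on a leash."),
    ([0.5, 0.5], "A white cat rests head on a stone."),
]

query_feature = [0.9, 0.15]  # hypothetical feature of the input image
retrieved = nearest_captions(query_feature, training_set, k=2)
for caption in retrieved:
    print(caption)
```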

21 Automatic Image Captioning [ACM MM 2012, ICCV 2015] Group of people sitting at a table with a dinner. Tourists are standing on the middle of a flat desert.

22 Image Captioning + Sentiment Terms [Andrew+, BMVC 2016] A confused man in a blue shirt is sitting on a bench. A man in a blue shirt and blue jeans is standing in the overlooked water. A zebra standing in a field with a tree in the dirty background.

23 Image Captioning + Sentiment Terms [Andrew+, BMVC 2016]
  Two steps for adding a sentiment term:
    1. Usual image captioning using CNN+RNN; the most probable noun is memorized

24 Image Captioning + Sentiment Terms [Andrew+, BMVC 2016]
  Two steps for adding a sentiment term:
    1. Usual image captioning using CNN+RNN
    2. The model is forced to predict a sentiment term before the noun
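The effect of the two steps on the token sequence can be imitated with a small sketch. Note the simplification: on the slide the RNN is forced to predict the sentiment term during decoding, whereas this example just inserts a given term before a given noun after the fact. The function `add_sentiment_term` and its arguments are hypothetical.

```python
def add_sentiment_term(tokens, noun, sentiment):
    """Insert `sentiment` immediately before the first occurrence of `noun`.

    tokens: caption from step 1, as a list of words.
    noun: the noun memorized in step 1.
    sentiment: the term to place before that noun (step 2).
    Returns a new list; the caption is unchanged if the noun is absent.
    """
    if noun not in tokens:
        return list(tokens)
    position = tokens.index(noun)
    return tokens[:position] + [sentiment] + tokens[position:]


caption = "a man in a blue shirt is sitting on a bench".split()
with_sentiment = " ".join(add_sentiment_term(caption, "man", "confused"))
print(with_sentiment)
```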

25 Beyond Caption to Narrative [Andrew+, ICIP 2016] A man is holding a box of doughnuts. Then he and a woman are standing next each other. Then she is holding a plate of food.

26 Beyond Caption to Narrative [Andrew+, ICIP 2016] A man is holding a box of doughnuts. he and a woman are standing next each other. she is holding a plate of food. Narrative

27 Beyond Caption to Narrative [Andrew+, ICIP 2016] A boat is floating on the water near a mountain. And a man riding a wave on top of a surfboard. Then he on the surfboard in the water.

28 Generate

29 Image Reconstruction [Kato+, CVPR 2014]
  Traditional pipeline for image classification:
    1. Extracting local descriptors
    2. Collecting descriptors
    3. Calculating global feature
    4. Classifying images
  (Figure: local descriptors d_1, ..., d_N, distribution p(d; θ), example labels Camera and Cat)

30 Image Reconstruction [Kato+, CVPR 2014]
  Inverse problem: image reconstruction from a label (example: Pot)
  (Figure: local descriptors d_1, ..., d_N, distribution p(d; θ), example labels Camera and Cat)

31 Image Reconstruction [Kato+, CVPR 2014]
  Pot: optimized arrangement using global location cost + adjacency cost
  Other examples: cat (bombay), camera, grand piano, gramophone, headphone, pyramid, joshua tree, wheel chair

33 Video Generation [Yamamoto+, ACMMM 2016]
  Image generation is still challenging
    Only successful for controlled settings: human faces, birds, flowers
    Examples: BEGAN [Berthelot+, 2017 Mar.], StackGAN [Zhang+, 2016 Dec.]
  Video generation is extremely challenging [Vondrick+, NIPS 2016]
    Additionally requiring temporal consistency

34 Video Generation [Yamamoto+, ACMMM 2016]
  This work: generating easy videos
    C3D (3D convolutional neural network) for conditional generation with an input label
    tempCAE (temporal convolutional auto-encoder) for regularizing video to improve its naturalness

35 Video Generation [Yamamoto+, ACMMM 2016]
  "Car runs to left": Ours (C3D+tempCAE) vs. only C3D
  "Rocket flies up": Ours (C3D+tempCAE) vs. only C3D

36 Conclusion
  MIL: Machine Intelligence Laboratory
  Beyond Human Intelligence Based on Cyber-Physical Systems
  This talk introduces some of the current research:
    Recognize
      Basic: Framework for DL, Domain Adaptation
      Classification: Single-modality, Multi-modalities
    Describe
      Image Captioning, Video Captioning
    Generate
      Image Reconstruction, Video Generation

Attentional Masking for Pre-trained Deep Networks

Attentional Masking for Pre-trained Deep Networks Attentional Masking for Pre-trained Deep Networks IROS 2017 Marcus Wallenberg and Per-Erik Forssén Computer Vision Laboratory Department of Electrical Engineering Linköping University 2014 2017 Per-Erik

More information

Hierarchical Convolutional Features for Visual Tracking

Hierarchical Convolutional Features for Visual Tracking Hierarchical Convolutional Features for Visual Tracking Chao Ma Jia-Bin Huang Xiaokang Yang Ming-Husan Yang SJTU UIUC SJTU UC Merced ICCV 2015 Background Given the initial state (position and scale), estimate

More information

Learning to Disambiguate by Asking Discriminative Questions Supplementary Material

Learning to Disambiguate by Asking Discriminative Questions Supplementary Material Learning to Disambiguate by Asking Discriminative Questions Supplementary Material Yining Li 1 Chen Huang 2 Xiaoou Tang 1 Chen Change Loy 1 1 Department of Information Engineering, The Chinese University

More information

DEEP LEARNING BASED VISION-TO-LANGUAGE APPLICATIONS: CAPTIONING OF PHOTO STREAMS, VIDEOS, AND ONLINE POSTS

DEEP LEARNING BASED VISION-TO-LANGUAGE APPLICATIONS: CAPTIONING OF PHOTO STREAMS, VIDEOS, AND ONLINE POSTS SEOUL Oct.7, 2016 DEEP LEARNING BASED VISION-TO-LANGUAGE APPLICATIONS: CAPTIONING OF PHOTO STREAMS, VIDEOS, AND ONLINE POSTS Gunhee Kim Computer Science and Engineering Seoul National University October

More information

Video Saliency Detection via Dynamic Consistent Spatio- Temporal Attention Modelling

Video Saliency Detection via Dynamic Consistent Spatio- Temporal Attention Modelling AAAI -13 July 16, 2013 Video Saliency Detection via Dynamic Consistent Spatio- Temporal Attention Modelling Sheng-hua ZHONG 1, Yan LIU 1, Feifei REN 1,2, Jinghuan ZHANG 2, Tongwei REN 3 1 Department of

More information

Medical Image Analysis

Medical Image Analysis Medical Image Analysis 1 Co-trained convolutional neural networks for automated detection of prostate cancer in multiparametric MRI, 2017, Medical Image Analysis 2 Graph-based prostate extraction in t2-weighted

More information

Keyword-driven Image Captioning via Context-dependent Bilateral LSTM

Keyword-driven Image Captioning via Context-dependent Bilateral LSTM Keyword-driven Image Captioning via Context-dependent Bilateral LSTM Xiaodan Zhang 1,2, Shuqiang Jiang 2, Qixiang Ye 2, Jianbin Jiao 2, Rynson W.H. Lau 1 1 City University of Hong Kong 2 University of

More information

Latent Space Based Text Generation Using Attention Models

Latent Space Based Text Generation Using Attention Models Latent Space Based Text Generation Using Attention Models Jules Gagnon-Marchand Prepared for NLP Workshop for MILA Aug. 31, 2018 Introduction: Motivation Text Generation is important: Any AI based task

More information

Putting Context into. Vision. September 15, Derek Hoiem

Putting Context into. Vision. September 15, Derek Hoiem Putting Context into Vision Derek Hoiem September 15, 2004 Questions to Answer What is context? How is context used in human vision? How is context currently used in computer vision? Conclusions Context

More information

Interpreting Deep Neural Networks and their Predictions

Interpreting Deep Neural Networks and their Predictions Fraunhofer Image Processing Heinrich Hertz Institute Interpreting Deep Neural Networks and their Predictions Wojciech Samek ML Group, Fraunhofer HHI (joint work with S. Lapuschkin, A. Binder, G. Montavon,

More information

Action Recognition. Computer Vision Jia-Bin Huang, Virginia Tech. Many slides from D. Hoiem

Action Recognition. Computer Vision Jia-Bin Huang, Virginia Tech. Many slides from D. Hoiem Action Recognition Computer Vision Jia-Bin Huang, Virginia Tech Many slides from D. Hoiem This section: advanced topics Convolutional neural networks in vision Action recognition Vision and Language 3D

More information

Deep Learning for Computer Vision

Deep Learning for Computer Vision Deep Learning for Computer Vision Lecture 12: Time Sequence Data, Recurrent Neural Networks (RNNs), Long Short-Term Memories (s), and Image Captioning Peter Belhumeur Computer Science Columbia University

More information

Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering

Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering SUPPLEMENTARY MATERIALS 1. Implementation Details 1.1. Bottom-Up Attention Model Our bottom-up attention Faster R-CNN

More information

Image Captioning using Reinforcement Learning. Presentation by: Samarth Gupta

Image Captioning using Reinforcement Learning. Presentation by: Samarth Gupta Image Captioning using Reinforcement Learning Presentation by: Samarth Gupta 1 Introduction Summary Supervised Models Image captioning as RL problem Actor Critic Architecture Policy Gradient architecture

More information

Improving the Interpretability of DEMUD on Image Data Sets

Improving the Interpretability of DEMUD on Image Data Sets Improving the Interpretability of DEMUD on Image Data Sets Jake Lee, Jet Propulsion Laboratory, California Institute of Technology & Columbia University, CS 19 Intern under Kiri Wagstaff Summer 2018 Government

More information

Segmentation of Cell Membrane and Nucleus by Improving Pix2pix

Segmentation of Cell Membrane and Nucleus by Improving Pix2pix Segmentation of Membrane and Nucleus by Improving Pix2pix Masaya Sato 1, Kazuhiro Hotta 1, Ayako Imanishi 2, Michiyuki Matsuda 2 and Kenta Terai 2 1 Meijo University, Siogamaguchi, Nagoya, Aichi, Japan

More information

Skin cancer reorganization and classification with deep neural network

Skin cancer reorganization and classification with deep neural network Skin cancer reorganization and classification with deep neural network Hao Chang 1 1. Department of Genetics, Yale University School of Medicine 2. Email: changhao86@gmail.com Abstract As one kind of skin

More information

Recurrent Neural Networks

Recurrent Neural Networks CS 2750: Machine Learning Recurrent Neural Networks Prof. Adriana Kovashka University of Pittsburgh March 14, 2017 One Motivation: Descriptive Text for Images It was an arresting face, pointed of chin,

More information

Translating Videos to Natural Language Using Deep Recurrent Neural Networks

Translating Videos to Natural Language Using Deep Recurrent Neural Networks Translating Videos to Natural Language Using Deep Recurrent Neural Networks Subhashini Venugopalan UT Austin Huijuan Xu UMass. Lowell Jeff Donahue UC Berkeley Marcus Rohrbach UC Berkeley Subhashini Venugopalan

More information

Sequential Predictions Recurrent Neural Networks

Sequential Predictions Recurrent Neural Networks CS 2770: Computer Vision Sequential Predictions Recurrent Neural Networks Prof. Adriana Kovashka University of Pittsburgh March 28, 2017 One Motivation: Descriptive Text for Images It was an arresting

More information

Annotation and Retrieval System Using Confabulation Model for ImageCLEF2011 Photo Annotation

Annotation and Retrieval System Using Confabulation Model for ImageCLEF2011 Photo Annotation Annotation and Retrieval System Using Confabulation Model for ImageCLEF2011 Photo Annotation Ryo Izawa, Naoki Motohashi, and Tomohiro Takagi Department of Computer Science Meiji University 1-1-1 Higashimita,

More information

CS221 / Autumn 2017 / Liang & Ermon. Lecture 19: Conclusion

CS221 / Autumn 2017 / Liang & Ermon. Lecture 19: Conclusion CS221 / Autumn 2017 / Liang & Ermon Lecture 19: Conclusion Outlook AI is everywhere: IT, transportation, manifacturing, etc. AI being used to make decisions for: education, credit, employment, advertising,

More information

Choose the correct answers. Circle the letters, please.

Choose the correct answers. Circle the letters, please. Choose the correct answers. Circle the letters, please. 1. My niece lives uncle s house. A) in B) on C) at D) 2. Mark riding every week. A) go B) is going C) goes D) is go 3... milk in the fridge? A) Are

More information

Shu Kong. Department of Computer Science, UC Irvine

Shu Kong. Department of Computer Science, UC Irvine Ubiquitous Fine-Grained Computer Vision Shu Kong Department of Computer Science, UC Irvine Outline 1. Problem definition 2. Instantiation 3. Challenge and philosophy 4. Fine-grained classification with

More information

arxiv: v2 [cs.cv] 19 Dec 2017

arxiv: v2 [cs.cv] 19 Dec 2017 An Ensemble of Deep Convolutional Neural Networks for Alzheimer s Disease Detection and Classification arxiv:1712.01675v2 [cs.cv] 19 Dec 2017 Jyoti Islam Department of Computer Science Georgia State University

More information

Deep Networks and Beyond. Alan Yuille Bloomberg Distinguished Professor Depts. Cognitive Science and Computer Science Johns Hopkins University

Deep Networks and Beyond. Alan Yuille Bloomberg Distinguished Professor Depts. Cognitive Science and Computer Science Johns Hopkins University Deep Networks and Beyond Alan Yuille Bloomberg Distinguished Professor Depts. Cognitive Science and Computer Science Johns Hopkins University Artificial Intelligence versus Human Intelligence Understanding

More information

Kai-Wei Chang UCLA. What It Takes to Control Societal Bias in Natural Language Processing. References:

Kai-Wei Chang UCLA. What It Takes to Control Societal Bias in Natural Language Processing. References: What It Takes to Control Societal Bias in Natural Language Processing Kai-Wei Chang UCLA References: http://kwchang.net Kai-Wei Chang (kwchang.net/talks/sp.html) 1 A father and son get in a car crash and

More information

FEATURE EXTRACTION USING GAZE OF PARTICIPANTS FOR CLASSIFYING GENDER OF PEDESTRIANS IN IMAGES

FEATURE EXTRACTION USING GAZE OF PARTICIPANTS FOR CLASSIFYING GENDER OF PEDESTRIANS IN IMAGES FEATURE EXTRACTION USING GAZE OF PARTICIPANTS FOR CLASSIFYING GENDER OF PEDESTRIANS IN IMAGES Riku Matsumoto, Hiroki Yoshimura, Masashi Nishiyama, and Yoshio Iwai Department of Information and Electronics,

More information

Efficient Deep Model Selection

Efficient Deep Model Selection Efficient Deep Model Selection Jose Alvarez Researcher Data61, CSIRO, Australia GTC, May 9 th 2017 www.josemalvarez.net conv1 conv2 conv3 conv4 conv5 conv6 conv7 conv8 softmax prediction???????? Num Classes

More information

Shu Kong. Department of Computer Science, UC Irvine

Shu Kong. Department of Computer Science, UC Irvine Ubiquitous Fine-Grained Computer Vision Shu Kong Department of Computer Science, UC Irvine Outline 1. Problem definition 2. Instantiation 3. Challenge 4. Fine-grained classification with holistic representation

More information

Artificial Intelligence to Enhance Radiology Image Interpretation

Artificial Intelligence to Enhance Radiology Image Interpretation Artificial Intelligence to Enhance Radiology Image Interpretation Curtis P. Langlotz, MD, PhD Professor of Radiology and Biomedical Informatics Associate Chair, Information Systems, Department of Radiology

More information

Group Behavior Analysis and Its Applications

Group Behavior Analysis and Its Applications Group Behavior Analysis and Its Applications CVPR 2015 Tutorial Lecturers: Hyun Soo Park (University of Pennsylvania) Wongun Choi (NEC America Laboratory) Schedule 08:30am-08:50am 08:50am-09:50am 09:50am-10:10am

More information

Differential Attention for Visual Question Answering

Differential Attention for Visual Question Answering Differential Attention for Visual Question Answering Badri Patro and Vinay P. Namboodiri IIT Kanpur { badri,vinaypn }@iitk.ac.in Abstract In this paper we aim to answer questions based on images when provided

More information

Beyond R-CNN detection: Learning to Merge Contextual Attribute

Beyond R-CNN detection: Learning to Merge Contextual Attribute Brain Unleashing Series - Beyond R-CNN detection: Learning to Merge Contextual Attribute Shu Kong CS, ICS, UCI 2015-1-29 Outline 1. RCNN is essentially doing classification, without considering contextual

More information

Deep Learning Models for Time Series Data Analysis with Applications to Health Care

Deep Learning Models for Time Series Data Analysis with Applications to Health Care Deep Learning Models for Time Series Data Analysis with Applications to Health Care Yan Liu Computer Science Department University of Southern California Email: yanliu@usc.edu Yan Liu (USC) Deep Health

More information

Noise-Robust Speech Recognition Technologies in Mobile Environments

Noise-Robust Speech Recognition Technologies in Mobile Environments Noise-Robust Speech Recognition echnologies in Mobile Environments Mobile environments are highly influenced by ambient noise, which may cause a significant deterioration of speech recognition performance.

More information

Introduction. Help your students learn how to learn!

Introduction. Help your students learn how to learn! Introduction Help your students learn how to learn! Lay a strong foundation in listening skills, the ability to follow directions, and in the ability to remember what one sees and hears important skills

More information

An Artificial Neural Network Architecture Based on Context Transformations in Cortical Minicolumns

An Artificial Neural Network Architecture Based on Context Transformations in Cortical Minicolumns An Artificial Neural Network Architecture Based on Context Transformations in Cortical Minicolumns 1. Introduction Vasily Morzhakov, Alexey Redozubov morzhakovva@gmail.com, galdrd@gmail.com Abstract Cortical

More information

Start ASL The Fun Way to Learn American Sign Language for free!

Start ASL The Fun Way to Learn American Sign Language for free! Start ASL The Fun Way to Learn American Sign Language for free! ASL 3 WORKBOOK Table of Contents Unit 1... 3 Conversation Review 1.1... 3 Role Shifting Practice 1.2... 3 Unit 2... 4 Role Shifting with

More information

Functional Elements and Networks in fmri

Functional Elements and Networks in fmri Functional Elements and Networks in fmri Jarkko Ylipaavalniemi 1, Eerika Savia 1,2, Ricardo Vigário 1 and Samuel Kaski 1,2 1- Helsinki University of Technology - Adaptive Informatics Research Centre 2-

More information

Comparison of Two Approaches for Direct Food Calorie Estimation

Comparison of Two Approaches for Direct Food Calorie Estimation Comparison of Two Approaches for Direct Food Calorie Estimation Takumi Ege and Keiji Yanai Department of Informatics, The University of Electro-Communications, Tokyo 1-5-1 Chofugaoka, Chofu-shi, Tokyo

More information

CS6501: Deep Learning for Visual Recognition. GenerativeAdversarial Networks (GANs)

CS6501: Deep Learning for Visual Recognition. GenerativeAdversarial Networks (GANs) CS6501: Deep Learning for Visual Recognition GenerativeAdversarial Networks (GANs) Today s Class Adversarial Examples Input Optimization Generative Adversarial Networks (GANs) Conditional GANs Style-Transfer

More information

Object Detectors Emerge in Deep Scene CNNs

Object Detectors Emerge in Deep Scene CNNs Object Detectors Emerge in Deep Scene CNNs Bolei Zhou, Aditya Khosla, Agata Lapedriza, Aude Oliva, Antonio Torralba Presented By: Collin McCarthy Goal: Understand how objects are represented in CNNs Are

More information

Motivation: Attention: Focusing on specific parts of the input. Inspired by neuroscience.

Motivation: Attention: Focusing on specific parts of the input. Inspired by neuroscience. Outline: Motivation. What s the attention mechanism? Soft attention vs. Hard attention. Attention in Machine translation. Attention in Image captioning. State-of-the-art. 1 Motivation: Attention: Focusing

More information

Flexible, High Performance Convolutional Neural Networks for Image Classification

Flexible, High Performance Convolutional Neural Networks for Image Classification Flexible, High Performance Convolutional Neural Networks for Image Classification Dan C. Cireşan, Ueli Meier, Jonathan Masci, Luca M. Gambardella, Jürgen Schmidhuber IDSIA, USI and SUPSI Manno-Lugano,

More information

PathGAN: Visual Scanpath Prediction with Generative Adversarial Networks

PathGAN: Visual Scanpath Prediction with Generative Adversarial Networks PathGAN: Visual Scanpath Prediction with Generative Adversarial Networks Marc Assens 1, Kevin McGuinness 1, Xavier Giro-i-Nieto 2, and Noel E. O Connor 1 1 Insight Centre for Data Analytic, Dublin City

More information

Thales Foundation Cyprus P.O. Box 28959, CY2084 Acropolis, Nicosia, Cyprus. Level 3 4

Thales Foundation Cyprus P.O. Box 28959, CY2084 Acropolis, Nicosia, Cyprus. Level 3 4 Thales Foundation Cyprus P.O. Box 28959, CY2084 Acropolis, Nicosia, Cyprus Kangourou Linguistics English Competition 2017 Level 3 4 Date: 4 February 2017 Time: 10:00 11:15 Questions 1 10 = 3 points Questions

More information

Convolutional Neural Networks for Text Classification

Convolutional Neural Networks for Text Classification Convolutional Neural Networks for Text Classification Sebastian Sierra MindLab Research Group July 1, 2016 ebastian Sierra (MindLab Research Group) NLP Summer Class July 1, 2016 1 / 32 Outline 1 What is

More information

Do Now: Write a detailed account of what happened in the cartoon.

Do Now: Write a detailed account of what happened in the cartoon. Do Now: Write a detailed account of what happened in the cartoon. Tracking Our Mastery We will be tracking how much we learn throughout the year, so we can create goals. 1. Complete the mastery worksheet

More information

Deep Learning-based Detection of Periodic Abnormal Waves in ECG Data

Deep Learning-based Detection of Periodic Abnormal Waves in ECG Data , March 1-16, 2018, Hong Kong Deep Learning-based Detection of Periodic Abnormal Waves in ECG Data Kaiji Sugimoto, Saerom Lee, and Yoshifumi Okada Abstract Automatic detection of abnormal electrocardiogram

More information

arxiv: v1 [cs.cv] 12 Dec 2016

arxiv: v1 [cs.cv] 12 Dec 2016 Text-guided Attention Model for Image Captioning Jonghwan Mun, Minsu Cho, Bohyung Han Department of Computer Science and Engineering, POSTECH, Korea {choco1916, mscho, bhhan}@postech.ac.kr arxiv:1612.03557v1

More information

COMP9444 Neural Networks and Deep Learning 5. Convolutional Networks

COMP9444 Neural Networks and Deep Learning 5. Convolutional Networks COMP9444 Neural Networks and Deep Learning 5. Convolutional Networks Textbook, Sections 6.2.2, 6.3, 7.9, 7.11-7.13, 9.1-9.5 COMP9444 17s2 Convolutional Networks 1 Outline Geometry of Hidden Unit Activations

More information

Artificial Intelligence in Breast Imaging

Artificial Intelligence in Breast Imaging Artificial Intelligence in Breast Imaging Manisha Bahl, MD, MPH Director of Breast Imaging Fellowship Program, Massachusetts General Hospital Assistant Professor of Radiology, Harvard Medical School Outline

More information

Automatic Diagnosis of Ovarian Carcinomas via Sparse Multiresolution Tissue Representation

Automatic Diagnosis of Ovarian Carcinomas via Sparse Multiresolution Tissue Representation Automatic Diagnosis of Ovarian Carcinomas via Sparse Multiresolution Tissue Representation Aïcha BenTaieb, Hector Li-Chang, David Huntsman, Ghassan Hamarneh Medical Image Analysis Lab, Simon Fraser University,

More information

Vision, Language, Reasoning

Vision, Language, Reasoning CS 2770: Computer Vision Vision, Language, Reasoning Prof. Adriana Kovashka University of Pittsburgh March 5, 2019 Plan for this lecture Image captioning Tool: Recurrent neural networks Captioning for

More information

Learning to Rank Authenticity from Facial Activity Descriptors Otto von Guericke University, Magdeburg - Germany

Learning to Rank Authenticity from Facial Activity Descriptors Otto von Guericke University, Magdeburg - Germany Learning to Rank Authenticity from Facial s Otto von Guericke University, Magdeburg - Germany Frerk Saxen, Philipp Werner, Ayoub Al-Hamadi The Task Real or Fake? Dataset statistics Training set 40 Subjects

More information

Introduction to Deep Reinforcement Learning and Control

Introduction to Deep Reinforcement Learning and Control Carnegie Mellon School of Computer Science Deep Reinforcement Learning and Control Introduction to Deep Reinforcement Learning and Control Lecture 1, CMU 10703 Katerina Fragkiadaki Logistics 3 assignments

More information

1. Introduction 1.1. About the content

1. Introduction 1.1. About the content 1. Introduction 1.1. About the content At first, some background ideas are given and what the origins of neurocomputing and artificial neural networks were. Then we start from single neurons or computing

More information

1. Introduction 1.1. About the content. 1.2 On the origin and development of neurocomputing

1. Introduction 1.1. About the content. 1.2 On the origin and development of neurocomputing 1. Introduction 1.1. About the content At first, some background ideas are given and what the origins of neurocomputing and artificial neural networks were. Then we start from single neurons or computing

More information

Deep Learning Analytics for Predicting Prognosis of Acute Myeloid Leukemia with Cytogenetics, Age, and Mutations

Deep Learning Analytics for Predicting Prognosis of Acute Myeloid Leukemia with Cytogenetics, Age, and Mutations Deep Learning Analytics for Predicting Prognosis of Acute Myeloid Leukemia with Cytogenetics, Age, and Mutations Andy Nguyen, M.D., M.S. Medical Director, Hematopathology, Hematology and Coagulation Laboratory,

More information

Analysis of Emotion Recognition using Facial Expressions, Speech and Multimodal Information

Analysis of Emotion Recognition using Facial Expressions, Speech and Multimodal Information Analysis of Emotion Recognition using Facial Expressions, Speech and Multimodal Information C. Busso, Z. Deng, S. Yildirim, M. Bulut, C. M. Lee, A. Kazemzadeh, S. Lee, U. Neumann, S. Narayanan Emotion

More information

Dual Path Network and Its Applications

Dual Path Network and Its Applications Learning and Vision Group (NUS), ILSVRC 2017 - CLS-LOC & DET tasks Dual Path Network and Its Applications National University of Singapore: Yunpeng Chen, Jianan Li, Huaxin Xiao, Jianshu Li, Xuecheng Nie,

More information

Towards The Deep Model: Understanding Visual Recognition Through Computational Models. Panqu Wang Dissertation Defense 03/23/2017

Towards The Deep Model: Understanding Visual Recognition Through Computational Models. Panqu Wang Dissertation Defense 03/23/2017 Towards The Deep Model: Understanding Visual Recognition Through Computational Models Panqu Wang Dissertation Defense 03/23/2017 Summary Human Visual Recognition (face, object, scene) Simulate Explain

More information

Accessorize to a Crime: Real and Stealthy Attacks on State-Of. Face Recognition. Keshav Yerra ( ) Monish Prasad ( )

Accessorize to a Crime: Real and Stealthy Attacks on State-Of. Face Recognition. Keshav Yerra ( ) Monish Prasad ( ) Accessorize to a Crime: Real and Stealthy Attacks on State-Of Of-The The-Art Face Recognition Keshav Yerra (2670843) Monish Prasad (2671587) Machine Learning is Everywhere Cancer Diagnosis Surveillance

More information

Multi-Modality American Sign Language Recognition

Multi-Modality American Sign Language Recognition Rochester Institute of Technology RIT Scholar Works Presentations and other scholarship 9-2016 Multi-Modality American Sign Language Recognition Chenyang Zhang City College of New York Yingli Tian City

More information

DeepASL: Enabling Ubiquitous and Non-Intrusive Word and Sentence-Level Sign Language Translation

DeepASL: Enabling Ubiquitous and Non-Intrusive Word and Sentence-Level Sign Language Translation DeepASL: Enabling Ubiquitous and Non-Intrusive Word and Sentence-Level Sign Language Translation Biyi Fang Michigan State University ACM SenSys 17 Nov 6 th, 2017 Biyi Fang (MSU) Jillian Co (MSU) Mi Zhang

More information
