Dual Path Network and Its Applications

Learning and Vision Group (NUS), ILSVRC 2017 - CLS-LOC & DET tasks
Dual Path Network and Its Applications
National University of Singapore: Yunpeng Chen, Jianan Li, Huaxin Xiao, Jianshu Li, Xuecheng Nie, Xiaojie Jin, Jiashi Feng
Qihoo 360 AI Institute: Jian Dong, Shuicheng Yan
Speaker: Jianshu Li
Thanks to Min Lin and Qiang Chen from Qihoo 360 for the extensive discussions.
Thanks to Xiaoli Liu and Ying Liu from Qihoo 360 for helping collect and annotate "external" data.

Results Overview
Object Localization
a) with "provided" data: 1st place (Loc Error: 6.23%)
b) with "external" data: 1st place (Loc Error: 6.19%)
Object Detection
a) with "provided" data: 2nd place (by mAP: 65.8%)
b) with "external" data: 2nd place (by mAP: 65.8%)
Object Detection from Video (VID)
a) with "provided" data: 2nd place (by mAP: 75.8%)
b) with "external" data: 2nd place (by mAP: 76.0%)

Dual Path Networks (DPN) https://arxiv.org/abs/1707.01629 https://github.com/cypw/dpns

Motivation [1]
* Here, a 1×1 convolutional layer (underlined) is added for consistency with the micro-block design in (a).
[1]: G. Huang, et al. "Densely Connected Convolutional Networks." CVPR 2017.

Analysis
Cross-layer parameter sharing
[Figure label: Share Info]
When green arrows share parameters, they produce exactly the same outputs. Thus, some computations are redundant here. We can add a new path to temporarily save the outputs of the green arrows for reuse, and only execute the operations in the orange arrows.

Analysis
A new path (the residual path) temporarily saves the outputs from the green arrows for reuse. With this path, (c) the Densely Connected Network (with shared connections) simplifies to a residual unit.
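The redundancy argument can be checked numerically. The following is a minimal sketch, not the authors' code: `f` (a "green arrow" whose parameters are shared by all later steps) and `g` (an "orange arrow") are hypothetical scalar stand-ins for learned layers. It compares a densely connected network that recomputes every shared transformation at every step with a version that keeps a running sum on an extra path.

```python
import math

def f(t, h):
    # Hypothetical "green arrow": transforms state t; shared by all later steps.
    return math.tanh(0.3 * (t + 1) * h)

def g(k, r):
    # Hypothetical "orange arrow": step-k transformation of aggregated features.
    return math.sin(r + k)

def dense_shared(h0, steps):
    """Recompute every green arrow at every step (redundant work)."""
    states, calls = [h0], 0
    for k in range(1, steps + 1):
        r = 0.0
        for t, h in enumerate(states):
            r += f(t, h)
            calls += 1
        states.append(g(k, r))
    return states[-1], calls

def residual_path(h0, steps):
    """Keep a running sum r and add only the newest green-arrow output."""
    r, h, calls = 0.0, h0, 0
    for k in range(1, steps + 1):
        r += f(k - 1, h)  # saved outputs of earlier green arrows live in r
        calls += 1
        h = g(k, r)
    return h, calls

dense_out, dense_calls = dense_shared(0.7, 5)
res_out, res_calls = residual_path(0.7, 5)
print(dense_out == res_out, dense_calls, res_calls)
```

Both versions produce the same final state, but the dense version evaluates `f` 15 times over 5 steps while the running-sum version evaluates it 5 times.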

Analysis
Cross-layer parameter sharing
Residual Networks are essentially Densely Connected Networks but with shared connections.
Advantages:
- ResNet: feature refinement (reuse features)
- DenseNet: keep exploring new features

Dual Path Architecture
When managing a company:
- Senior employees need to keep improving their skills (feature refinement).
- The company also needs to hire freshmen (feature exploration).

Dual Path Architecture
(d) Dual Path Architecture
- Explore new features
- Feature refinement (reuse features)
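One way to picture the two paths is the toy sketch below. It is an illustration under stated assumptions, not the released DPN implementation: features are plain Python lists, `toy_micro_block` is a hypothetical stand-in for a learned sub-network, and the split of its output into an added part and an appended part is assumed from the slide's description of refinement versus exploration.

```python
def toy_micro_block(x, res_width, growth):
    """Hypothetical stand-in for a learned sub-network."""
    s = sum(x) / len(x)
    res_part = [0.1 * s] * res_width             # goes to the residual path
    dense_part = [s + i for i in range(growth)]  # goes to the dense path
    return res_part, dense_part

def dual_path_step(res, dense, growth):
    x = res + dense  # both paths feed the micro-block (list concatenation)
    res_part, dense_part = toy_micro_block(x, len(res), growth)
    new_res = [r + d for r, d in zip(res, res_part)]  # refinement: add in place
    new_dense = dense + dense_part                    # exploration: append new features
    return new_res, new_dense

res, dense = [1.0, 2.0], [3.0]
for _ in range(3):
    res, dense = dual_path_step(res, dense, growth=2)

print(len(res), len(dense))  # 2 7
```

The residual path keeps a fixed width because it only accumulates, while the dense path grows by `growth` features per step.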

Dual Path Networks
(e) DPN
The sub-network can be replaced by any micro-structure, not necessarily a bottleneck structure.
Three DPNs are designed: DPN-92 (depth=92), DPN-98 (depth=98), DPN-131 (depth=131).

                          ResNeXt-101 (64x4d) [2]   DPN-98       Change
Model Size                320 MB                    236 MB       -26%
GFLOPs                    15.5                      11.7         -25%
GPU Memory                12.1 GB                   11.1 GB      -8%
Top 1 / Top 5 Error (%)   20.4 / 5.3                20.2 / 5.2

* Testing scale: x224 / Batch Size: 32 per GPU
[2]: S. Xie, et al. "Aggregated residual transformations for deep neural networks." CVPR 2017.
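The percentage savings quoted for DPN-98 follow from the raw figures on this slide. A short check, using only the numbers given above (the dictionary keys are names chosen here for illustration):

```python
# Figures copied from the slide; key names are illustrative.
resnext_101 = {"size_mb": 320, "gflops": 15.5, "gpu_mem_gb": 12.1}
dpn_98 = {"size_mb": 236, "gflops": 11.7, "gpu_mem_gb": 11.1}

def relative_change(new, old):
    """Percentage change of `new` relative to `old`."""
    return 100.0 * (new - old) / old

changes = {
    key: round(relative_change(dpn_98[key], resnext_101[key]))
    for key in resnext_101
}
print(changes)  # {'size_mb': -26, 'gflops': -25, 'gpu_mem_gb': -8}
```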

Performance
Single model, single center-crop, Top-5 val error rate:
- ResNeXt-101 (64x4d) [2]: 4.4%
- Very Deep PolyNet [3]: 4.25%
- DPN-131: 4.16% (best single model reported)
DPN-131 is fast: training = 60 img/sec (per node, 4 x K80 cards).
* Testing scale: x299 / x320
[3]: X. Zhang, et al. "PolyNet: A pursuit of structural diversity in very deep networks." arXiv 2016.

ILSVRC 2017

ILSVRC 2017: Object Localization & Detection
Main Framework:
Classification: Input Image -> Ensembled CLS Models (DPN-92, DPN-98, DPN-131, ResNeXt-101 (64x4d), CRU-Net, DenseNet, ...) with Multi-scale Dense Testing -> Weighted Sum -> Confidence of each Class
Localization / Detection: Input Image -> Ensembled Faster R-CNNs [4] (DPN-92, DPN-98, DPN-131; simplest Faster R-CNN pipeline) -> Weighted Sum -> Loc/Det Results
[4]: S. Ren, et al. "Faster R-CNN: Towards real-time object detection with region proposal networks." NIPS 2015.
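The "Weighted Sum" step for the classification ensemble might look like the sketch below. The confidences, the weights, and the normalization by the weight total are all assumptions made for illustration; the slides do not state how the weights were chosen.

```python
def weighted_sum(confidences, weights):
    """Combine per-model class confidences with a normalized weighted sum."""
    num_classes = len(confidences[0])
    total = sum(weights)
    return [
        sum(w * c[j] for w, c in zip(weights, confidences)) / total
        for j in range(num_classes)
    ]

# Hypothetical confidences from three classifiers over three classes.
model_confidences = [
    [0.7, 0.2, 0.1],
    [0.5, 0.4, 0.1],
    [0.2, 0.6, 0.2],
]
model_weights = [0.5, 0.3, 0.2]  # assumed values, not from the slides

combined = weighted_sum(model_confidences, model_weights)
predicted = max(range(len(combined)), key=combined.__getitem__)
print(predicted)  # 0
```

With these made-up numbers, the third model prefers class 1, but the two more heavily weighted models carry the vote to class 0.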

ILSVRC 2017: Object Localization Visualization:

ILSVRC 2017: Object Detection Visualization:

Thank You!
National University of Singapore: Yunpeng Chen, Jianan Li, Huaxin Xiao, Jianshu Li, Xuecheng Nie, Xiaojie Jin, Jiashi Feng (elefjia@nus.edu.sg)
Qihoo 360 AI Institute: Jian Dong, Shuicheng Yan (yanshuicheng@360.cn)

Q & A