Deep Neural Networks Rival the Representation of Primate IT Cortex for Core Visual Object Recognition

Deep Neural Networks Rival the Representation of Primate IT Cortex for Core Visual Object Recognition
Charles F. Cadieu, Ha Hong, Daniel L. K. Yamins, Nicolas Pinto, Diego Ardila, Ethan A. Solomon, Najib J. Majaj, James J. DiCarlo
Presented by Te-Lin Wu and William Shen

Outline
- Introduction and Motivations
- Definition of Terms
- Methods
- Data
- Experimental Results
- Takeaways and Conclusions

Motivations
- Does the representational performance of deep neural networks (DNNs) match that of IT cortex on object recognition?
- Can we understand primate visual processing through DNNs?

Introduction
- Primate vision performs remarkably well, even under constraints.
- The key: the cortical ventral stream creates the representation.
- The ventral stream transforms the non-linear object recognition problem into a neural representation that separates objects by category (comparable to a ConvNet's fc7 layer?).
- It is built from a series of recapitulated modules of non-linear transformations.
- Bio-inspired models: how close are they to human brains?

Human Visual Cortex

From V1 to IT

4 advantages of this work
- Corrected experimental limitations (noise, number of recorded neural sites)
- Measures the accuracy of a representation as a function of complexity
- Considers variations in the model/neural spaces relevant to object classification
- Larger dataset than previous related work (Rust and DiCarlo, 2010)

References
- Hung CP, Kreiman G, Poggio T, DiCarlo JJ (2005) Fast readout of object identity from macaque inferior temporal cortex. Science 310: 863-866.
- Rust NC, DiCarlo JJ (2010) Selectivity and tolerance ("invariance") both increase as visual information propagates from cortical area V4 to IT. Journal of Neuroscience 30: 12978-12995.
- Yamins DLK, Hong H, Cadieu CF, Solomon EA, Seibert D, et al. (2014) Performance-optimized hierarchical models predict neural responses in higher visual cortex. Proceedings of the National Academy of Sciences 111: 8619-8624.
- Kriegeskorte N, Mur M, Ruff DA, Kiani R, Bodurka J, et al. (2008) Matching Categorical Object Representations in Inferior Temporal Cortex of Man and Monkey. Neuron 60: 1126-1141.
- Kriegeskorte N, Mur M, Bandettini P (2008) Representational Similarity Analysis - Connecting the Branches of Systems Neuroscience. Frontiers in Systems Neuroscience 2.
- Yamins D, Hong H, Cadieu CF, DiCarlo JJ (2013) Hierarchical Modular Optimization of Convolutional Networks Achieves Representations Similar to Macaque IT and Human Ventral Stream. Advances in Neural Information Processing Systems: 3093-3101.

Definitions
- Single-unit recording
- Multi-unit recording
- V1-like: models that attempt to capture a first-order account of primary visual cortex (V1)
- V2-like: corresponds to visual area V2; includes non-linearity and averaging
- HMAX: bio-inspired hierarchical model that uses sparse, localized features
- HMO: Hierarchical Modular Optimization; combines ConvNets with an adaptive boosting procedure for hyperparameter tuning
- AlexNet
- Zeiler and Fergus: Visualizing and Understanding Convolutional Networks

Kernel Analysis
- Goal: measure the accuracy of a representation as a function of complexity.
- What defines a good representation?
- Performance: leave-one-out generalization error $\mathrm{looe}(\lambda)$, where $\lambda$ is the regularization parameter.
- $1/\lambda$ serves as the complexity.
- $1 - \mathrm{looe}(\lambda)$ serves as the precision.

Kernel Analysis (cont'd)
Procedure:
- Start from a learning problem $p(x, y)$ and a set of $n$ data points drawn independently from it.
- Representation: $x \mapsto \phi(x)$, where $x$ are images and $y$ are normalized labels.
- Compute the kernel matrix using a Gaussian kernel.
- Solve the regression problem for a fixed $\sigma$ and $\lambda$.
- Obtain the solution.
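The equations for this procedure are not included in the transcript. As a rough sketch only, assuming the standard kernel regularized least squares formulation (minimize $\|y - K\alpha\|^2 + \lambda\,\alpha^\top K \alpha$, whose solution is $\alpha = (K + \lambda I)^{-1} y$) and the usual closed-form leave-one-out residual for such linear smoothers, the precision-versus-complexity curve could be computed along these lines. The function names, the grid of $\sigma$ values, and the choice to keep the best $\sigma$ for each $\lambda$ are illustrative assumptions, not the authors' code.

```python
import numpy as np

def gaussian_kernel(features, sigma):
    """Kernel matrix K[i, j] = exp(-||f_i - f_j||^2 / (2 * sigma^2))."""
    sq_norms = np.sum(features ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * features @ features.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def looe(K, y, lam):
    """Closed-form leave-one-out squared error for kernel ridge regression."""
    n = K.shape[0]
    # Hat matrix H maps labels to fitted values: y_hat = H @ y.
    H = K @ np.linalg.inv(K + lam * np.eye(n))
    loo_residuals = (y - H @ y) / (1.0 - np.diag(H))
    return float(np.mean(loo_residuals ** 2))

def kernel_analysis_curve(features, y, lambdas, sigmas):
    """Return a list of (complexity, precision) pairs, one per lambda.

    Assumption: for each lambda, keep the sigma with the lowest error.
    """
    y = (y - y.mean()) / y.std()  # normalized labels
    curve = []
    for lam in lambdas:
        best = min(looe(gaussian_kernel(features, s), y, lam) for s in sigmas)
        curve.append((1.0 / lam, 1.0 - best))  # complexity 1/lambda, precision 1 - looe
    return curve
```

A representation is better under this measure when it reaches high precision at low complexity, that is, when its curve rises early.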

Other methods
- Noise matching: add noise to the model representations at a level that matches the noise in the neural representations.
- Linear SVM trained on the representations.
- Representational similarity: compute a representational dissimilarity matrix (RDM) for each representation, then measure the relationship between two RDMs.
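The RDM formulas are not shown in the transcript. A minimal sketch, assuming the common representational similarity analysis convention (each RDM entry is 1 minus the Pearson correlation between the response vectors for two images, and two RDMs are compared by the Spearman rank correlation of their upper triangles), might look like this. The helper names are hypothetical.

```python
import numpy as np

def rdm(responses):
    """responses: (n_images, n_features). Entry [i, j] = 1 - Pearson r of rows i and j."""
    return 1.0 - np.corrcoef(responses)

def average_ranks(values):
    """1-based ranks of a 1-D array, with tied values sharing their average rank."""
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    last_rank = np.cumsum(counts)               # highest rank within each tie group
    mean_rank = last_rank - (counts - 1) / 2.0  # average rank within each tie group
    return mean_rank[inverse]

def rdm_similarity(rdm_a, rdm_b):
    """Spearman rank correlation between the upper triangles of two RDMs."""
    upper = np.triu_indices_from(rdm_a, k=1)    # skip the diagonal and duplicates
    ranks_a = average_ranks(rdm_a[upper])
    ranks_b = average_ranks(rdm_b[upper])
    return float(np.corrcoef(ranks_a, ranks_b)[0, 1])
```

Because the correlation distance ignores per-image offset and scale, a representation and a rescaled copy of it produce the same RDM under this convention.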

Dataset
- 1960 test images
- Image variations:
  - Object exemplar
  - Geometric transformations (position, scale, rotation, pose)
  - Background

Neural Data Collection
- Recorded from V4 and IT using multi-electrode arrays.
- Image presentation: one image at a time, for 100 ms.
- Multi-unit representations:
  - Compute raw firing rates by counting spikes.
  - Subtract the background firing rate.
  - Normalize.
  - Take the mean across the repetitions of each image.
- Single-unit representations:
  - Spike sorting.
  - Isolated 160 single units from IT and 95 single units from V4.
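The multi-unit steps above can be sketched as follows. This is only an illustration: the array layout, the counting window, and the normalization (dividing each site by the standard deviation of its evoked rates) are assumptions, since the slide does not specify them.

```python
import numpy as np

def multi_unit_representation(spike_counts, window_s, background_rate):
    """Turn spike counts into an (n_images, n_sites) representation.

    spike_counts:    (n_sites, n_images, n_repetitions) spike counts (assumed layout)
    window_s:        length of the counting window in seconds (assumed parameter)
    background_rate: (n_sites,) background firing rate in spikes per second
    """
    rates = spike_counts / window_s                  # raw firing rates
    evoked = rates - background_rate[:, None, None]  # subtract background rate
    # Assumed normalization: unit standard deviation per site.
    scale = evoked.std(axis=(1, 2), keepdims=True)
    scale = np.where(scale > 1e-12, scale, 1.0)      # leave silent/constant sites as-is
    normalized = evoked / scale
    return normalized.mean(axis=2).T                 # mean across repetitions
```

Each row of the result is one image's response vector across recording sites, which is the form the decoding and encoding analyses operate on.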

Outline
- Introduction and Motivations
- Definition of Terms
- Methods
- Data
- Experimental Results
  - Decoding analysis
  - Encoding analysis
- Takeaways and Conclusions

Decoding--Kernel Analysis Result
Comparison between machine representations: the DNNs perform significantly better than the other biologically inspired models.

Decoding--Kernel Analysis Result (IT Cortex)
After sub-sampling to the same number of features, the DNN representations perform comparably to IT cortex (for both multi-unit and single-unit recordings).

Decoding--At different feature sample numbers

Decoding--Linear SVM

Encoding--Predicting IT Responses
Interestingly, although the DNN representations have much better decoding ability than V4, the DNNs perform about as well as V4 on the encoding task of predicting IT responses.
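One way to picture the encoding task is as a linear regression from a representation (model features or V4 responses) to the responses of each IT site, scored on held-out images. The sketch below uses ridge regression and an explained-variance score; both choices, the regularization strength, and the function names are assumptions for illustration and may differ from the procedure used in the paper.

```python
import numpy as np

def fit_ridge(X, Y, lam):
    """Fit Y ~ X @ W + b with an L2 penalty on W.

    X: (n_images, n_features) representation
    Y: (n_images, n_sites) IT responses
    """
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - x_mean, Y - y_mean              # centering handles the intercept
    gram = Xc.T @ Xc + lam * np.eye(X.shape[1])
    W = np.linalg.solve(gram, Xc.T @ Yc)
    b = y_mean - x_mean @ W
    return W, b

def explained_variance(Y_true, Y_pred):
    """Per-site fraction of held-out response variance captured by the prediction."""
    ss_res = np.sum((Y_true - Y_pred) ** 2, axis=0)
    ss_tot = np.sum((Y_true - Y_true.mean(axis=0)) ** 2, axis=0)
    return 1.0 - ss_res / ss_tot
```

Fitting on one set of images and scoring on another keeps the comparison between representations from rewarding overfitting.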

Encoding-- Representational Dissimilarity Matrices (RDMs)

Encoding--Similarity to IT Dissimilarity Matrix
"IT-fit" means applying a linear transform to the model representation to predict the IT representation. Without IT-fit, the DNN representation is very different from the IT cortex representation; after the linear transform, it falls within the noise range of IT cortex. This suggests that, although a gap remains between DNN models and the IT cortex representation, the DNN representation has the encoding power to form the IT representation.

Limitations
- Viewing time: 100 ms
- Passive viewing vs. active task performance
- Visual experience and learning (the macaques lack experience with a number of the classes)
- Energy efficiency (model energy requirements are 2 to 3 orders of magnitude higher than those of the primate visual system)
- Natural primate development vs. 15M labeled images

Conclusion
The representational performance of DNNs is comparable to that of IT cortex!

Questions?

Thanks!