Learning Multifunctional Binary Codes for Both Category and Attribute Oriented Retrieval Tasks


Haomiao Liu 1,2, Ruiping Wang 1,2,3, Shiguang Shan 1,2,3, Xilin Chen 1,2,3
1 Key Laboratory of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing, 100190, China
2 University of Chinese Academy of Sciences, Beijing, 100049, China
3 Cooperative Medianet Innovation Center, China
haomiao.liu@vipl.ict.ac.cn, {wangruiping, sgshan, xlchen}@ict.ac.cn

This document gives details of the attributes defined on ImageNet-150K and additional real retrieval results. This material is best viewed in color.

1. Example Images of Attributes

In this section, we provide example images of each of the 25 attributes defined on ImageNet-150K, covering color, texture, shape, material, and structure. The attributes were defined and annotated mainly based on the ImageNet-attribute [S1] and Animals with Attributes (AwA) [S2] datasets. Compared to [S1], our dataset covers many more categories (1,000 vs. 384) and images (50,000 vs. 9,600). For each attribute, three positive samples and three negative samples are shown in Figures 1, 2, and 3 (in each row, the leftmost three are positive samples and the rest are negative samples). In our experiments, the attributes are binary: an image either has or does not have the attribute.

2. Real Retrieval Cases

This section gives more real retrieval cases on the attribute-oriented retrieval tasks described in Sections 4.4 and 4.5 of the main paper (the results were obtained with 256-bit binary codes).

2.1. Results on CFW-60K

The results of task II and task III on CFW-60K are shown in Figures 4 and 5, respectively. In task II, the system is required to retrieve images of subjects with the same gender, race, and age group as the subject in the query image.
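Both tasks rank database images by the Hamming distance between their binary codes and the query's code. The following is a minimal sketch of that ranking step with toy 8-bit codes; all names and data here are illustrative only, and the actual 256-bit codes are produced by the learned hashing network, not by this toy setup.

```python
# Minimal sketch of Hamming-distance ranking over binary codes.
# Illustrative only: real codes are 256-bit outputs of the hashing network.

def hamming_distance(a, b):
    """Hamming distance between two equal-length 0/1 bit lists."""
    return sum(x != y for x, y in zip(a, b))

def hamming_rank(query_code, db_codes, top_k=5):
    """Return indices of the top_k database codes closest to the query."""
    order = sorted(range(len(db_codes)),
                   key=lambda i: hamming_distance(query_code, db_codes[i]))
    return order[:top_k]

# Toy 8-bit database of four images; image 2 matches the query exactly.
db = [
    [0, 1, 1, 0, 0, 1, 0, 1],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 1, 0, 1],
    [1, 0, 1, 0, 1, 1, 1, 0],
]
query = [0, 1, 0, 0, 0, 1, 0, 1]
print(hamming_rank(query, db, top_k=2))  # -> [2, 0]
```

In practice the codes would be packed into machine words and compared with XOR plus popcount for speed; the list-based version above only illustrates the ranking logic.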
As we can see from the failed cases (Figure 4(b)), although the top feedbacks for each query fail to match the ground-truth attributes of the query, they are consistent with one another in gender, race, and age group. Further investigation of the failed cases shows that the main cause is incorrect attribute prediction for the query image. In task III, either inaccurate attribute prediction or the inability of the binary codes to preserve category similarity can produce failed cases. Here we only show successful cases to demonstrate the potential of our method in this challenging realistic retrieval scenario.

2.2. Results on ImageNet-150K

The results on ImageNet-150K are shown in Figures 6 and 7. Since this dataset contains only two images per category in the Test set, for a better qualitative demonstration we used the Test set as queries in this supplemental experiment and retrieved images from both the Train-Both and Train-Attribute sets. Some successful retrieval results on task II and task III are provided, suggesting that our method has the potential to be applied in these two realistic yet very challenging object retrieval scenarios.

References

[S1] O. Russakovsky and L. Fei-Fei. Attribute Learning in Large-scale Datasets. In European Conference on Computer Vision Workshops (ECCVW), pages 1-14, 2010.

Figure 1. Example images of the color attributes on ImageNet-150K (Black, Blue, Brown, Gray, Green, Orange, Pink, Red, Purple, White). For each attribute, three positive samples (the leftmost three) and three negative samples (the rightmost three) are shown.

[S2] C. H. Lampert, H. Nickisch, and S. Harmeling. Learning to Detect Unseen Object Classes by Between-class Attribute Transfer. In Computer Vision and Pattern Recognition (CVPR), pages 951-958, 2009.

Figure 2. Example images of further attributes on ImageNet-150K (Yellow, Colorful, Stripes, Columnar, Rectangular, Round, Sharp, Metal, Wooden). For each attribute, three positive samples (the leftmost three) and three negative samples (the rightmost three) are shown.

Figure 3. Example images of further attributes on ImageNet-150K (Furry, Tail, Horn, 2 Legs/Bipedal, 4 Legs/Quadruped). For each attribute, three positive samples (the leftmost three) and three negative samples (the rightmost three) are shown.

Figure 4. Some real retrieval results of task II on CFW-60K (retrieving images of subjects with the same gender, race, and age group as the subject in the query image); for each query, the Top-5 and Bottom-5 feedbacks are shown. The results were obtained with 256-bit binary codes. The notations are consistent with the main paper (please refer to Figure 1(b) in the main paper for details). (a) Successful cases; (b) failed cases. In the failed cases, the predicted gender, race, and age group of the three query images are: 1) male + white + young (ground truth: male + white + mid-aged), 2) female + Asian + young (ground truth: female + white + young), 3) male + Asian + young (ground truth: male + white + young). Note that, as mentioned in our main paper, in task II the predicted attributes of all images (both query and database images) were used for retrieval, while the evaluation is performed on the ground-truth attribute labels. As a result, wrong predictions for either the query image or the database images can cause a mismatch.

Figure 5. Some real retrieval results of task III on CFW-60K; the query attributes shown include + No Glasses, + Eyeglasses, + Sunglasses, + Smile, and + Neutral. The notations are consistent with the main paper (please refer to Figure 1(b) in the main paper for details).

Figure 6. Some real retrieval results of task II on ImageNet-150K; for each query, the Top-5 and Bottom-5 feedbacks are shown, and the attributes of interest (covering color, texture, shape, material, and structure, e.g. Horn, Green, Rect., Round, Furry, Brown, Wooden, Black, Tail) are listed on the right side of the query images. Images in the Test set were used as queries, and the Train-Both and Train-Attribute sets were used as the database. The results were obtained with 256-bit binary codes. The notations are consistent with the main paper (please refer to Figure 1(b) in the main paper for details).

Figure 7. Some real retrieval results of task III on ImageNet-150K; the query attributes shown include + Red + White + Round, + White + Columnar, and + Tail. Images in the Test set were used as queries, and the Train-Both and Train-Attribute sets were used as the database. The notations are consistent with the main paper (please refer to Figure 1(b) in the main paper for details). Note that for the last query, one of the top-10 feedbacks (bounded by a red box) does not match the query in terms of category: the ground-truth category of the query is pelican, while that of the wrong feedback is crane. The wrong feedback looks very similar to the query image; even humans would have some difficulty performing such a fine-grained categorization, so it is reasonable that the retrieval system made this mistake.