Region Proposals. Jan Hosang, Rodrigo Benenson, Piotr Dollar, Bernt Schiele

Size: px

Start display at page:

Download "Region Proposals. Jan Hosang, Rodrigo Benenson, Piotr Dollar, Bernt Schiele"

Jocelyn Wilcox
5 years ago
Views:

1 Region Proposals Jan Hosang, Rodrigo Benenson, Piotr Dollar, Bernt Schiele

2 Who has read a proposal paper? 2

3 Who has read a proposal paper? Who knows what Average Recall is? 2

4 Who has read a proposal paper? Who knows what Average Recall is? Who has used proposals? 2

5 Who has read a proposal paper? Who knows what Average Recall is? Who has used proposals? Who has written a paper about proposals or a paper using proposals? 2

6 Part I: Motivation & different types of proposal methods 3

7 Sliding Window Approach Slide a window and classify every location 4

8 Sliding Window Approach Slide a window and classify every location 4

9 Sliding Window Approach Slide a window and classify every location Huge search space: position x scale x aspect ratio, about Problem: classification needs to be very fast 10 6 windows 5

10 Conventional detection Image Detections with score Sliding window Classifier/Detector Discard unlikely object locations 6

11 Detection proposals: a cascaded detector Image Image & search space Detections with score Proposal method Classifier/Detector Discard unlikely object locations Discard unlikely object locations 7

12 Motivation: Efficiency Image Propose object locations Image & search space Score all proposals Detections with score Reducing search space allows slower classifiers But it comes with the cost of running a proposal method: about 0.1s 4min per image More side effects later 8

13 Approaches to propose object segments Bottom up: grouping {superpixels, pixels} into object segments Selective Search [ICCV 11, IJCV 13], Randomized Prim s [ICCV 13], [Rantalankila CVPR 14], [Chang ICCV 11], CPMC [CVPR 10, PAMI 12], [Endres ECCV 10, PAMI 14], Rigor [CVPR 14], Geodesic [ECCV 14], MCG [CVPR 14], 9

14 Grouping method example: Selective Search Used information: color similarities, texture similarities, region size, region filling Greedy merging Nothing learned 10

15 Approaches to propose object segments Bottom up: grouping {superpixels, pixels} into object segments Selective Search [ICCV 11, IJCV 13], Randomized Prim s [ICCV 13], [Rantalankila CVPR 14], [Chang ICCV 11], CPMC [CVPR 10, PAMI 12], [Endres ECCV 10, PAMI 14], Rigor [CVPR 14], Geodesic [ECCV 14], MCG [CVPR 14], Top down: hypothesize, then score a window/segment Objectness [CVPR 10, PAMI 12], [Rahtu ICCV 11], Bing [CVPR 14], EdgeBoxes [ECCV 14], [Feng ICCV 11], [Zhang CVPR 11], Randomized Seeds [ICCV 13], 11

16 Window scoring example: Edge Boxes Extract edges Group edges Extract edges Minimize the number of edges that cross the window boundary Sliding window search Refine location, scale, aspect ratio Also nothing learned 12

17 Approaches to propose object segments Bottom up: grouping {superpixels, pixels} into object segments Selective Search [ICCV 11, IJCV 13], Randomized Prim s [ICCV 13], [Rantalankila CVPR 14], [Chang ICCV 11], CPMC [CVPR 10, PAMI 12], [Endres ECCV 10, PAMI 14], Rigor [CVPR 14], Geodesic [ECCV 14], MCG [CVPR 14], Top down: hypothesize, then score a window/segment Objectness [CVPR 10, PAMI 12], [Rahtu ICCV 11], Bing [CVPR 14], EdgeBoxes [ECCV 14], [Feng ICCV 11], [Zhang CVPR 11], Randomized Seeds [ICCV 13], Miscellanea: eg. non-parametric, convnets Shape sharing [ECCV 12], Multibox [CVPR 14], DeepMask [NIPS 15], 13

18 Why are convnets different? Many methods are complexity controlled to generalize to unseen classes Simple features Simple classifier, maybe even no learning Huge VGG convnet with tens of millions of parameters? Possibility to learn n detectors inside Bigger drop in performance when generalizing to new classes But you could also count it as window scoring More on that later 14

19 Part II: Detection proposal evaluation, metrics, analysis 15

20 Detection proposals: a cascaded detector Image Image & search space Detections with score Proposal method Classifier/Detector Discard unlikely object locations Discard unlikely object locations 16

21 Detection proposals: types of mistakes Image Image & search space Detections with score false positives Proposal method Classifier/Detector false negatives 17

22 Detection proposals: types of mistakes Image Image & search space Detections with score false positives Proposal method Classifier/Detector false negatives 17

23 Detection proposals: evaluation Image Propose object locations Image & search space Score all proposals Detections with score Methods need to provide As few FNs as possible: high recall As well localized as possible: good localization As few FPs as possible: few proposals 18

24 Why do these three matter? Image Propose object locations Image & search space Score all proposals Detections with score High recall Missing recall cannot be undone Good localization Detectors give better scores for well localized windows, ie. better AP curve Few proposals This cuts down search space And reduces the number of mistakes the detector can make 19

25 Typical evaluation: fix #proposals, recall vs. IoU recall PASCAL 2007 recall better Bing CPMC EdgeBoxes Endres Geodesic MCG Objectness Rahtu RandomizedPrims Rantalankila Rigor SelectiveSearch Gaussian Sliding window Superpixels Uniform IoU overlap threshold (a) 100 proposals per image IoU overlap threshold (b) 1000proposals per image. Bing CPMC EdgeBoxes Endres Geodesic 20

26 Proposals methods are just bad detectors In the early days authors cared about class generalization In the Objectness paper [CVPR 10] set of training/test classes were disjoint Methods had either simple features or small model capacity to generalize 21

27 Proposals methods are just bad detectors In the early days authors cared about class generalization In the Objectness paper [CVPR 10] set of training/test classes were disjoint Methods had either simple features or small model capacity to generalize But on Pascal/COCO-type benchmarks Who cares about class generalization? You want high recall for few proposals At the end of the day a high recall detector is the best proposal method (e.g. Faster R-CNN) recall SquaresChnFtrs 2.8 EdgeBoxes 3.0 EdgeBoxes EdgeBoxes EdgeBoxes IoU 22

28 Proposals methods are just bad detectors In the early days authors cared about class generalization In the Objectness paper [CVPR 10] set of training/test classes were disjoint Methods had either simple features or small model capacity to generalize But on Pascal/COCO-type benchmarks Who cares about class generalization? You want high recall for few proposals At the end of the day a high recall detector is the best proposal method (e.g. Faster R-CNN) What you want to detect might not be an object (and violate the common appearance assumption ) Parts of objects: shoulder, car door, Groups of objects recall SquaresChnFtrs 2.8 EdgeBoxes 3.0 EdgeBoxes EdgeBoxes EdgeBoxes IoU 23

29 Typical evaluation: fix #proposals, recall vs. IoU recall PASCAL 2007 recall better Bing CPMC EdgeBoxes Endres Geodesic MCG Objectness Rahtu RandomizedPrims Rantalankila Rigor SelectiveSearch Gaussian Sliding window Superpixels Uniform IoU overlap threshold (a) 100 proposals per image IoU overlap threshold (b) 1000proposals per image. Bing CPMC EdgeBoxes Endres Geodesic 24

30 But how important is recall vs. localization? recall IoU overlap threshold 25

31 There is no one threshold, Better localization helps linearly detector score IoU with GT R-CNN LM-LLDA Fast R-CNN 26

32 very different mechanisms for generating proposals. ctivesearch merges superpixels, Rigor computes iple graph cut segmentations, MCG generates hierarchegmentations, Actually, and we re EdgeBoxes interested scores windows map based ge content. ralisation: Critically, we measured no significant drop call when going from the 20 PASCAL categories to the 0.9 magenet categories. Moreover, while MS COCO is subially harder and has very different 0.8 statistics (more and ler objects), relative method ordering remains mostly 1 le Experiments: The best Fast R-CNN IoU overlap results threshold reported 0.9 is paper used the large model L and EdgeBoxesAR oved0.5performance to 71.2 map, and adding an oracle 0.4 erfect non-maximum suppression further improved 0.3 to 78.1 (see 5.6 for details). The remaining gap of recall correlation with map anged. These are encouraging result 0.7 indicating that curethods do indeed generalise to different unseen categories, s such can be considered true objectness methods which is the same as the average b to reach perfect detectionminor, is caused we opted by high to use scoring AR over average the entire best o2[0.5,1.0] spatial range support from 0.5 (BSS) to 0.1 tions 1.0 for simplicity. While the strong correlation does not imply that that the arear perfectly can perfectly local- predict the closest detection proposal performance, to each ann 0on the background and object misclassifications. The ABO and BSS are typica best case analysis IoU overlap threshold for proposals as figure 13 shows, the relationship matchismore surprisingly than one linear. annotatio AR (b) proposals per image. shows that further improvement Tools for Efficient Object candetection only be Region gained Proposals by Jan Hosang 1 mask generated by many prop more important role during det progress has been rapid in this y proposal methods to evolve quic APPENDIX 60 correlation=0.961 A ANALYSIS OF METRICS map 50 Average recall (AR) between 0.5 a 40 by averaging over the overlaps o the closest 30 matched proposal, th axis 20 of the plot instead of the x ax and recall(o) the function shown Let IoU(gt i ) denote the IoU betw average recall the closest detection proposal. W (c) Fast R-CNN without bounding box regression osals, achieving map of 67.8 on PASCAL 2007 test. 0.7 ˆ1 g an oracle to rectify all localisation Figure 13: and Correlation recall errors between detector performance on P correlation between map and recall AR =2 at different recall(o)do IoU threshold = 2 nx n 0.5 = average recall(o) i=1 27

33 Trends across number of proposals average recall # proposals (c) Average recall (between IoU). Bing CPMC EdgeBoxes Endres Geodesic MCG Objectness Rahtu RandomizedPrims Rantalankila Rigor SelectiveSearch Gaussian Sliding window Superpixels Uniform 28

34 Average Recall caveat Correlates well if you don t change the number of proposals AR =2 ˆ1 0.5 recall(o)do = average o2[0.5,1.0] recall(o) But extreme cases don t look as you d expect AR # proposals mean AP EdgeBoxesAR nms oracle 53 ~ gt oracle 100 ~ both oracles 100 ~

35 Take-home messages Proposals should provide high recall and good localization with few proposals In papers present full range of operating points AR is a good summary number, but has limitations when it comparing across different number of proposals Sometimes a detector is the better proposal method If you care about few classes and don t need to generalize If the proposal methods don t work on your targets 30

36 Thanks for your attention! Questions? 31

Active Deformable Part Models Inference

Active Deformable Part Models Inference Menglong Zhu Nikolay Atanasov George J. Pappas Kostas Daniilidis GRASP Laboratory, University of Pennsylvania 3330 Walnut Street, Philadelphia, PA 19104, USA Abstract.