Local Image Structures and Optic Flow Estimation Sinan KALKAN 1, Dirk Calow 2, Florentin Wörgötter 1, Markus Lappe 2 and Norbert Krüger 3 1 Computational Neuroscience, Uni. of Stirling, Scotland; {sinan,worgott}@cn.stir.ac.uk 2 Dept. of Psychology, Uni. of Münster, Germany; {calow,mlappe}@psy.uni-muenster.de 3 Computer Sci., Aalborg University Esbjerg, Denmark; nk@cs.aue.auc.dk Abstract. Different kinds of visual sub-structures (such as homogeneous, edgelike and junction-like patches) can be distinguished by the intrinsic dimensionality of the local signals. The concept of intrinsic dimensionality has been mostly exercised using discrete formulations; however, a recent work [14,7] introduced a continuous definition. The current study analyzes the distribution of local patches in natural images according to this continuous understanding of intrinsic dimensionality. This distribution reveals specific patterns that can be also associated to sub-structures established in computer vision and which can be related to orientation of signals and errors in optic flow estimation. In particular, we link quantitative and qualitative properties of opticflow estimation quality to these patterns. 1. Introduction Natural images are dominated by specific local sub-structures, such as homogeneous patches, edges, corners, or textures. Sub-domains of Computer Vision have extracted and analyzed such sub-structures in edge detection (see, e.g., [4]), junction classification (see, e.g., [20]) and texture interpretation (see, e.g., [25,19]). There exists evidence that also in human vision such sub-structures are processed (see, e.g., [10,23]) and that they play distinguishable roles in the information processing (see, e.g., [13]). The intrinsic dimension ([26,6]) has proven to be a suitable descriptor that distinguishes such sub-structures and assigns intrinsic dimension of zero (i0d), one (i1d) and two (i2d) to homogeneous patches, edge-like structures and junctions (and most textures), respectively. 1 Association of id to a local image structure has been done mostly by a discrete classification [26,6,11]. In the current paper, we will use the continuous definition outlined in [14,7] to investigate the structure of the distribution of signals in natural images according to their id. We argue that the continuous formulation of id allows for a more precise characterization of established sub-structures in terms of their statistical manifestation in natural images. 1 For simplicity, intrinsic dimension of zero, intrinsic dimension of one, intrinsic dimension of two and intrinsic dimensionality (and intrinsic dimension) will be called i0d, i1d, i2d and id, respectively, throughout the paper.
2 Sinan KALKAN et al. About optic flow estimation at different sub-structures, in general, it is acknowledged [17,2] that (A0) optic flow estimates at homogeneous image patches tend to be unreliable; (A1) optic flow at edge-like structures faces the aperture problem; (A2) only for i2d structures, true optic flow is computable. In this paper, we investigate these claims more closely for several optic flow algorithms (Nagel [18], Lucas- Kanade [16] and a phase-based approach [8]). By making use of a continuous understanding of id, these statements can be made quantitatively more specific in terms of (1) characterization of sub-areas for which they hold and (2) their strength. This paper is a summary of an earlier publication [12] where more detailed results can be found. The outline of the paper is as follows: In section 2, we introduce the continuous id used in our analysis. In section 3, we investigate the distribution of local image patches according to their id while in section 4 we look at the quality of optic flow estimation for different kinds of signals. 2. Intrinsic Dimensionality Looking at the spectral representation of a local image patch (figure 1a,b), we see that the energy of an i0d signal is concentrated at the origin; the energy of an i1d signal is concentrated along a line while the energy of an i2d signal varies in more than one dimension. In image processing, the id was introduced by [26] to formalize a discrete distinction between edge-like and junction-like structures. Recently, it has been shown [14,7] that the topological structure of the id must be understood as a triangle that is spanned by two measures: origin variance (deviation from a point at the origin) and lie variance (deviation from a line) (figure 1b,c). The corners of the triangle then represent the 'pure' cases of id, and the surface of the triangle corresponds to signals that carry aspects of the three 'pure' cases and the distance (a) (b) (c) Fig. 1. Illustration of id (basically taken from [7]). (a) Three image patches for three different ids. (b) The spectrum of the local patches in (a), from top to bottom: i0d, i1d, i2d. (c) The topology of id. from the specific corners indicates the similarity to pure i0d, i1d and i2d signals. This is achieved through the 3 confidences assigned to each discrete case (for details see [5]). These three confidences reflect the volume of the 3 areas constituted by the points (figure 1c).
Local Image Structures and Optic Flow Estimation 3 3. Statistics of id We used a set of 7 (outdoor) natural sequences with 10 images each with 1276x1016 resolution. Origin and line variances are computed for each image patch and are plotted on the id triangle (figure 2). The distribution shows two main clusters. The peak close to the origin has low origin variance, corresponding to nearly homogeneous image patches. Since the orientation is almost random for such homogeneous image patches, it causes high line variance. There is also a small peak at the origin that corresponds to saturated/black Fig. 2. Logarithmic plot of id distribution. image patches. The other cluster is for high origin variance signals with low line variance (i.e., edges). Finally, there is a smooth descend while approaching to the i2d area of the triangle, meaning that there does not exist a cluster for corner-like structures. The distribution is shown in a schematic representation in figure 4-top. 4. Analysis of Optic Flow Estimation Quality We now analyze the errors of the optic flow estimation depending on id of the signals. For this, we need to compare the computed flow with a ground truth. For our investigations, we used the Brown Range Image Database (BRID) (see [9]). Using the 3D data, a moving camera is simulated and the corresponding flow field is retrieved. The error of optic flow estimation is displayed in a histogram over the id triangle (see figure 3) using the well-known measure: u. v + 1 1 e( u, v) = cos ( ) ( u. u + 1) ( v. v + 1) For Nagel with 10 iterations, the error (figure 3a) is high for signals close to the i0d corner of the triangle as well as on the horizontal stripe from the i0d corner to the i1d corner. In the other parts, there is a smooth surface descending towards the i2d corner. This is in accordance with the notion that optic flow estimation at corner-like structures is more reliable than other signals (A2). However, in figure 3a, it becomes obvious that the area where more reliable flow vectors can be computed is very broad and covers also i0d and i1d signals. Furthermore, the decrease of error is rather slight which points to the fact that the quality of flow estimation is quite limited in these areas, as well. When the number of iterations is increased 2, the estimation of flow becomes better almost for the whole triangle except for some parts in i1d area (figure 3b). Among all, the phase-based approach [8] produces the best results for quite a 2 Nagel algorithm can be understood as a diffusion process [1]. Therefore the number of iterations determines the size of the area used in the computation of the flow (the more iterations, the larger the size of this area).
4 Sinan KALKAN et al. large area (figure 3f). However, for many i1d signals the estimate is more unreliable than for most i0d and i2d cases. For Lucas-Kanade, we again observe that the area where error is low smoothly extends to some i0d and i1d signals (figure 3e). The figure 3 suggests that this only slight decrease is a general property of the investigated optic flow algorithms. (a) Nagel with 10 iterations. (b) Nagel with 100 iterations. (c) Nagel 10 iterations with normal ground truth. (c) Nagel 100 iterations with normal ground truth. (e) Lucas-Kanade. (f) The phase-based approach. Fig. 3. Schematic representation of the distribution of signals (top) and the quality of optic flow estimation at different signals (bottom). Fig. 4. Qualities of the optic flow algorithms depending on id. (a) Nagel with 10 iterations. (b) Nagel with 100 iterations. (c) Nagel with 10 iterations using normal ground truth. (d) Nagel with 100 iterations using normal ground truth. (e) Lucas-Kanade. (f) The phase-based approach. We have also computed error with the normal ground truth (figure 3c,d). For Nagel with 10 iterations, the error is very low for a horizontal stripe from the middle point between the i0d and i1d corners to the i1d corner. When compared to figure 3a, this figure reflects the effect of the aperture problem when only local information is used. When the number of iterations is increased, we see that using more global information decreases the effect of the aperture problem (figures 3c,d). However, it is visible that
Local Image Structures and Optic Flow Estimation 5 the quality of the estimated error area in the i1d area of the triangle is always significantly lower than the quality of the estimated normal flow with a small number of iterations (i.e., using only very local information). Comparing the results for Nagel with 10 and 100 iterations, we observe that increasing the region of influence means an increase in the overall quality of optic flow estimation. However, it is visible that error in estimation of flow in the i1d area of the triangle using more global information (figure 3b) is always significantly higher than the quality of the estimated normal flow using local information (figure 3a), which can be computed with high reliability (figure 3c). The information for such signals is already of great importance since (1) there exists a large number of local image patches corresponding to such edge-like structures (figure 2, 4-top) and (2) constraints for global motion estimation can be defined on line-correspondences (see, e.g., [21,15]), i.e., correspondences that only require normal flow. 5. Discussions A continuous understanding of id allows for (1) a more precise characterization of established sub-structures in terms of their statistical manifestation in natural images (figure 2 and 4-top) and (2) for a more quantitative investigation and characterization of the quality of optic flow estimation. We could justify and more precisely quantify generally acknowledged assumptions about such estimates (see A0-A2 in the introduction). More specifically, we showed (figure 3, 4-bottom): In general, homogeneous structures lead to low quality optic flow estimation. However, using different optic flow algorithms we showed that for most i0d signals, true flow could be estimated with good accuracy. The optic flow estimates in the stripe-shaped cluster corresponding to edges is in general worse than for i2d signals for all investigated algorithms. However, for the Nagel algorithm with few iterations, the normal flow can be estimated reliably for this cluster. We think that this reliably computed normal flow is important information as such. For example, line-line correspondences that can be derived from the normal flow play an important role in RBM estimation (see, e.g., [24,22,15]). We also showed the effect of using more global information for flow estimation of i1d signals. The quality of optic flow estimation is higher for i2d signals. However, in analogy to the lack of a cluster for i2d signals, there exists a continuous signal domain (covering also sub-areas of i0d and i1d signals) for which this high quality can be achieved. The increase of the quality, however, is only slight. This suggests that the role of i2d structures for motion estimation might not be as important as suggested by some authors (see, e.g., [17]) or that processing of corner/like structure in humans is more complex. Using different optic flow algorithms, we show that this is a general property of optic flow algorithms. We want to stress that our aim was not to find the best algorithm but to make general properties of optic flow algorithms explicit. The choice of the best algorithm depends a lot on the context determined by time constraints or hardware constraints.
6 Sinan KALKAN et al. Furthermore, we could show that even different parameter settings leading to qualitatively different estimates that might be plausible for different tasks. References [1] L.Alvarez;J.Weickert;J.Sanchez. Reliable estimation of dense optical flow fields with large displacements. IJCV, 2000. [2] J.L.Barron;D.J.Fleet;S.S.Beauchemin. Performance of optical flow techniques. IJCV, 1994. [3] T.Brox;A.Bruhn;N.Papenberg;J.Weickert. High accuracy optical flow estimation based on a theory for warping. Proc. of ECCV 2004, 2004. [4] J.Canny. A computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1986. [5] H.S.M.Coxeter. Introduction to Geometry (2 nd ed.). Wiley & Sons, 1969. [6] M.Felsberg. Low-Level Image Processing with the Structure Multivector. PhD thesis, Uni. of Kiel, 2002. [7] M.Felsberg;N.Krueger. A probabilistic definition of intrinsic dimensionality for images. Pattern Recognition, 24 th DAGM Symposium, 2003. [8] T.Gautama;M.M.Van Hulle. A phase-based approach to the estimation of the optical flow field using spatial filtering. IEEE Transactions on Neural Networks, 2002. [9] J.Huang;A.B.Lee;D.Mumford. Statistics of range images. CVPR, 2000. [10] D.H.Hubel;T.N.Wiesel. Anatomical demonstration of columns in the monkey striate cortex. Nature, 1969. [11] B.Jahne. Digital Image Processing - Concepts, Algorithms, and Scientific Applications. Springer, 1997. [12] S.Kalkan;D.Calow;M.Felsberg;F.Worgotter;M.Lappe;N.Krüger. Optic flow statistics and intrinsic dimensionality. Brain Inspired Cognitive Systems, 2004. [13] J.J.Koenderink;A.J.Dorn. The shape of smooth objects and the way contours end. Perception, 1982. [14] N.Krüger;M.Felsberg. A continuous formulation of intrinsic dimension. Proc. of the BMVC, 2003. [15] N.Krüger;F.Wörgötter. Statistical and deterministic regularities: Utilisation of motion and grouping in biological and artificial visual systems. Advances in Imaging and Electron Physics, 2004. [16] B.Lucas;T.Kanade. An iterative image registration technique with an application to stereo vision. Proc. DARPA Image Understanding Workshop, 1981. [17] C.Mota;E.Barth. On the uniqueness of curvature features. Proc. in Artificial Intelligence, 2000. [18] H.-H.Nagel;W.Enkelmann. An investigation of smoothness constraints for the estimation of displacement vector fields from image sequences. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1986. [19] E.Ribeiro;E.R.Hancock. Shape from periodic texture using the eigenvectors of local affine distortion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001. [20] K.Rohr. Recognizing corners by fitting parametric models. IJCV, 1992. [21] B.Rosenhahn. Pose Estimation Revisited (PhD Thesis). Uni. of Kiel, 2003. [22] B.Rosenhahn;G.Sommer. Adaptive pose estimation for different corresponding entities. Pattern Recognition, 24th DAGM Symposium, 2002. [23] I.A.Shevelev;N.A.Lazareva;A.S.Tikhomirov;G.A.Sharev. Sensitivity to cross-like figures in the cat striate neurons. Neuroscience, 1995. [24] F.Shevlin. Analysis of orientation problems using Plücker lines. Int. Conf. on Pattern Recognition, 1998. [25] N.Sochen;R.Kimmel;R.Malladi. A general framework for low level vision. IEEE Transactions on Image Processing, 1998. [26] C.Zetzsche;E.Barth. Fundamental limits of linear filters in the visual processing of two-dimensional signals. Vision Research, 1990.