Perceptual Organization

by Johan Wagemans
Laboratory of Experimental Psychology
Department of Brain & Cognition
University of Leuven (KU Leuven)
Leuven, Belgium

Chapter for Volume 2 (Sensation, Perception & Attention) of The Stevens Handbook of Experimental Psychology and Cognitive Neuroscience, Fourth Edition

Editor-in-Chief: John T. Wixted
Volume Editors:
Volume 1 (Learning & Memory): Liz Phelps & Lila Davachi
Volume 2 (Sensation, Perception & Attention): John Serences
Volume 3 (Language & Thought): Sharon L. Thompson-Schill
Volume 4 (Developmental & Social Psychology): Simona Ghetti
Volume 5 (Methodology): E. J. Wagenmakers
Publisher: John Wiley & Sons, Inc.

Main text = 29,289 words, 289 references, 14 figures and 6 footnotes
Current version dated 13/12/2016

Cite as: Wagemans, J. (in press). Perceptual Organization. In J. T. Wixted (Series Ed.) & J. Serences (Vol. Ed.), The Stevens Handbook of Experimental Psychology and Cognitive Neuroscience: Vol. 2. Sensation, Perception & Attention. Hoboken, NJ: John Wiley & Sons, Inc.

Abstract

Perceptual organization is the process of giving structure to our experience of objects, scenes, and events in the world. The literature on this complex and fascinating topic is huge, dealing with a wide range of phenomena, processes, neural mechanisms, models, and theories. The present overview starts from the seminal work by the Gestalt psychologists, covering the classic laws of perceptual grouping and figure-ground organization. Then follows a review of the main lines of psychophysical and neural research carried out on these and related topics in the last fifty years. The final section covers new directions of research. They refine the conceptual framework and sketch theoretical approaches that could offer interesting alternatives to the mainstream view of information processing in the cortical hierarchy. There is hope for a synthesis to emerge from this scattered field of research, leading to a better understanding of perceptual organization.

Key Terms

contextual modulations; cortical hierarchy; dynamics; figure-ground organization; Gestalt theory; part-whole relationships; perceptual grouping; phenomenology; Prägnanz (goodness); Structural Information Theory

Acknowledgments

I have been lucky enough to have received long-term structural funding from the Flemish Government for my GestaltReVision program, a research program aimed at reintegrating the Gestalt approach into contemporary approaches to vision, brain, and art (METH/08/02 for and METH/14/02 for ). I would like to thank my collaborators who provided feedback on a previous draft of this chapter: Charlotte Boeykens, Vebjörn Ekroll, Kathleen Vancleef, Sander Van de Cruys, Peter van der Helm, Andrea van Doorn, Raymond van Ee, Cees van Leeuwen, and Steven Vanmarcke. I am also grateful for the administrative and technical support by Agna Marien and Rudy Dekeerschieter.

Introduction

"I stand at the window and see a house, trees, sky. For theoretical purposes, I could now try to count and say: There are (...) 327 brightnesses (and color tones). Do I have 327? No; I have sky, house, trees. Having the 327 as such is something no one can actually do. If (...) there happen to be 120 shades of brightness in the house and 90 in the trees and 117 in the sky, then at any rate I have that grouping, that segregation, and not, say, 127 and 100 and 100; nor 150 and 177. I see it in this particular grouping, this particular segregation; and what nature of grouping and segregation I see is not simply a matter of my whim. I can by no means just get any other nature of coherency I like at will." (Wertheimer, 1923/2012, p. 301/p. 127)

What did Max Wertheimer, one of the founding fathers of Gestalt psychology, mean when he wrote this? There are three key observations here. First, perceptual experience is organized in a particular way. It consists of objects (e.g., house, trees) and background (e.g., sky). Our experience does not consist of the brightness and color values that might be taken as the raw data constituting our sensations. Second, the organization depends on a particular grouping, which seems to go hand in hand with a segregation from the rest (i.e., the grouped values and perceived regions are set apart from others). Third, this perceptual organization appears definite and lawful, not arbitrary. Moreover, the lawfulness does not result from an act of will. Each of these three key points refers to a deep insight into perceptual organization and thus deserves to be unpacked here, as an introduction to this review chapter on perceptual organization. The first point is a really fundamental one because it was at the heart of the Gestalt revolution in the 1910s and 1920s.
The Gestalt psychologists of the Berlin school (with Max Wertheimer, Wolfgang Köhler and Kurt Koffka as the most famous representatives) argued for the primacy of objects as units of experience, instead of sensations, which were the building blocks of mental life for the proponents of structuralism (mainly Wilhelm Wundt and Edward Titchener), the dominant school of thought in psychology at that time. In addition, Gestalt psychologists also defended a particular epistemological and methodological approach to understanding perception, namely von oben nach unten (top-down, i.e., starting from organized percepts and experienced objects), instead of von unten nach oben (bottom-up, i.e., starting from so-called raw sensations, a simple mapping of physically specified stimulus properties). As discussed further below, subsequent research on perceptual organization has abandoned this central proposition of Gestalt psychology [1]. The second point establishes a close link between perceptual grouping and figure-ground organization, two major processes of perceptual organization. What is grouped together becomes the object of experience, and by that very same process also becomes segregated from the rest. Wertheimer used the nouns Zusammengehörigkeit ("togetherness") and Getrenntheit ("separation") to describe these two sides of the same coin. This description also highlights how intimately connected grouping and segregation are.

[1] On the contrary, mainstream vision science and visual neuroscience generally attempt to understand perceptual grouping and figure-ground organization as starting from the retinal mosaic of stimulation, and gradually extracting more structure and meaning. I will demonstrate at the end of this chapter that it does not have to be this way. An alternative view is possible, which is more akin to the Gestalt spirit but does justice to what is known about the visual cortex.

When elements are grouped, they often form some kind of object (or proto-object, in case it is not fully developed or semantically categorized), which stands apart from the rest, much like a figure against a background. In principle, one can have grouping into a larger unit and segregation between such higher-order units (like a house and trees against the sky) without necessarily having a segmentation of a figure against a background that runs behind it (with the figure owning the borderline between the regions and some kind of filling-in behind it), let alone a specific separation in depth. However, in most situations with multiple regions with partial occlusion (e.g., a house behind a tree, trees that cover some bits of the ground surface and some bits of the sky), one often has this additional level of organization. So, this part of the quote establishes a tight coupling between grouping and (texture) segregation, as well as between grouping and figure-ground organization (or segmentation), which is one of the reasons why the laws of grouping and the laws of figure-ground organization are often treated together as laws of perceptual organization (see below). However, conceptually, it makes a lot of sense to consider these grouping and segregation processes separately, and to maintain a distinction between mere segregation between two regions and a slightly more involved figure-ground organization. In this way, one could uncover different factors playing a role in each of these processes or reveal distinct neural mechanisms underlying each of these processes (see further discussion below). The third and final point emphasizes the lawful and mandatory character of the organization into grouped and segregated larger units of experience.
The quote at the start of this chapter originally formed the introduction to Wertheimer's Investigations on Gestalt principles, in which he outlined the factors that determine whether an arrangement of three or more elements, say A, B, and C, is generally organized as A versus B-C or as A-B versus C. We will discuss these principles (sometimes described as factors or, more ambitiously, as laws) in more detail below, but at this point it is important to underline that the organization is driven by intrinsic forces within the perceptual system that are not under voluntary control. In later works (especially by Köhler), the Gestalt theorists have discussed these principles as reflecting field forces, equally in experience and in the brain, establishing some kind of isomorphism between them. Although this speculative theoretical layer has been stripped off from these principles over the many decades since they were discovered, the principles themselves are still included in most textbooks on perception and they are probably the best-known heritage of the Gestalt tradition.

Wertheimer's Investigations of Gestalt Principles

The classic principles of perceptual grouping

In his landmark paper, Wertheimer (1923/2012) discussed not only the general role of grouping in the organization of our perceptual experience of the world but also the principles underlying perceptual grouping and segmentation, which are now known as the classic principles of perceptual grouping. In this paper, Wertheimer illustrated the most essential factors in a series of simple, well-chosen examples, but he also reported schematic overviews of experimental designs to identify and isolate the effective factors in a variety of stimulus arrangements. Some of these pertain to simple, linear dot configurations, others to arrays of dots, pairs of line segments, geometric patterns, surfaces, shapes, and even handwritten letters.
The paper is much richer in content and much deeper in theoretical insights than one may assume if one has not read it. Because of its foundational character and its continued importance for current research on perceptual organization, I will discuss it more extensively than is usually done.

A row of dots at equal distance from one another is just perceived as a row of dots, without any particular grouping or segmentation (see Figure 1A). When some of the inter-dot distances are increased significantly relative to the others, one immediately perceives a grouping of some dots in pairs, which become segregated from others (see Figure 1B). Apparently, elements that are relatively closer together become grouped together, while elements that are relatively further apart are segregated from one another, based on the principle of grouping by proximity. When dots are differentiated from one another by size, color or another feature, the dots become spontaneously grouped again, even with equal distances. In Figure 1C, for instance, the smaller filled dots are grouped in pairs and so are the larger open ones. Apparently, elements that are similar to one another are grouped together, while dissimilar elements are segregated from one another, based on the principle of grouping by similarity. Of course, with unequal distances and differentiated attributes, the two principles can cooperate or compete. When both proximity and similarity are working together (see Figure 1D), grouping is enhanced compared to the conditions where only one principle can play a role (proximity in Figure 1B, similarity in Figure 1C). When they are competing (see Figure 1E), one might perceive pairs of dissimilar dots (when grouping by proximity wins) or pairs of similar dots at larger distances (when grouping by similarity wins), with these two tendencies possibly differing in strength between individuals and switching over time within a single individual. Even with equal distances, elements become grouped again when some undergo a particular change together (e.g., an upward movement), while others do not change or change differently (e.g., move downward; see Figure 1F). This principle is grouping by common fate.

Figure 1. Perceptual grouping in simple rows of dots (adapted from Wertheimer, 1923/2012). (A) No grouping, because there are no differences in distances between the dots and all dots have the same features. (B) As soon as the inter-dot distances are no longer the same, grouping by proximity occurs: the closer dots group together in pairs. (C) As soon as the dots are no longer the same, grouping by similarity occurs: similar dots group together in pairs. With unequal distances and features (e.g., size, color), grouping by proximity and grouping by similarity can facilitate one another (D) or compete against each other (E). (F) When subsets of dots undergo the same change together (e.g., start to move in the same direction and speed together), they group together because of their common fate.
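
The contrast between Figures 1A and 1B can be made concrete with a toy computation. The sketch below is only an illustration under stated assumptions, not a model proposed in this chapter or by Wertheimer: the function name `group_by_proximity` and the `ratio` parameter are hypothetical, and the rule (start a new group wherever a gap clearly exceeds the smallest gap in the row) is just one simple way to express relative proximity.

```python
def group_by_proximity(positions, ratio=1.5):
    """Toy illustration: split a row of dots wherever a gap clearly
    exceeds the smallest inter-dot gap (hypothetical rule, assumed ratio)."""
    pts = sorted(positions)
    if len(pts) < 2:
        return [pts] if pts else []
    # Distances between neighbouring dots along the row.
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    # A gap counts as "relatively large" if it exceeds ratio * smallest gap.
    threshold = ratio * min(gaps)
    groups = [[pts[0]]]
    for gap, p in zip(gaps, pts[1:]):
        if gap > threshold:
            groups.append([p])    # relatively far apart: start a new group
        else:
            groups[-1].append(p)  # relatively close: join the current group
    return groups


equal_row = [0, 1, 2, 3, 4, 5]    # equal spacing, as in Figure 1A
paired_row = [0, 1, 3, 4, 6, 7]   # alternating small and large gaps, as in Figure 1B

print(group_by_proximity(equal_row))   # [[0, 1, 2, 3, 4, 5]]
print(group_by_proximity(paired_row))  # [[0, 1], [3, 4], [6, 7]]
```

Because the rule compares each gap with the other gaps in the same row rather than with a fixed distance, it captures the point that relative, not absolute, distance matters. It says nothing about similarity, common fate, or competition between principles.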

With slightly more complicated arrangements, additional factors come into play. In the left arrangement of Figure 2, the dots in the horizontal group A are closer to the dots in the vertical group B than to the dots in the horizontal group C. However, perceptually the dots in A are grouped with the dots in C as a longer horizontal linear arrangement, with the dots in B being perceived as a vertical arrangement standing out from it in the middle (hence, A-C versus B). Similarly, in the middle arrangement of Figure 2, the dots in group C are all closer to the dots in group B, yet they are perceived to be grouped with the dots in group A (hence, A-C versus B). Finally, with a similar arrangement of groups B and C, the perceptual grouping changes in the arrangement at the right, where group A is now aligned with group B instead of C (hence, A-B versus C), with C being perceived as a horizontal segment sticking out from a diagonal line. In sum, what seems to determine the perceptual grouping here is not just the set of all pairwise inter-dot distances but their relative configuration or arrangement. Specifically, the alignment of the groups or good continuation of the linear arrangements of the groups of dots appears to matter. This principle of good continuation also governs the grouping of curved line segments. In Figure 3, for instance, one perceives a continuous line from A to C and from B to D, not A-B and C-D, nor A-D and B-C (which would be favored based on symmetry).

Figure 2. The principle of good continuation in arrangements of dots (adapted from Wertheimer, 1923/2012). Left: A-C versus B. Middle: A-C versus B. Right: A-B versus C.

Figure 3. The principle of good continuation in arrangements of curved line segments (adapted from Wertheimer, 1923/2012): A-C versus B-D.

As soon as line segments form closed patterns or shapes, another principle can be shown to have an effect (see Figure 4). With an X-junction similar to that in Figure 3, A is no longer grouped with C and B with D (A-C versus B-D) on the basis of good continuation from curve A to C and from curve B to D. Instead, curves A and B are grouped together into one closed shape and curves C and D into another one (A-B versus C-D). Hence, the principle of closed form is a separate one from good continuation. With form, other factors become important as well. In the left arrangement of Figure 5, one perceives two identical six-sided shapes, in different orientations and slightly overlapping, with line patterns a and c grouped together as forming one figure, and b and d as another (a-c and b-d). In the right arrangement of Figure 5, with the same line patterns in different positions and relative orientations, one clearly perceives something different, namely one elongated six-sided shape with a smaller diamond in the middle; so, now patterns a and d are grouped as one form, and b and c as another (a-d and b-c). Here, the principle of a good Gestalt is at stake: When parts form larger wholes, the wholes with a higher degree of regularity are better Gestalts and they tend to dominate our perception. In this context, symmetry, parallelism, and inner equilibrium are mentioned as factors contributing to the goodness of the whole.

Figure 4. The principle of closed form (adapted from Wertheimer, 1923/2012): A-B versus C-D.

Figure 5. The principle of good Gestalt (adapted from Wertheimer, 1923/2012): Left: line patterns a-c form one figure and line patterns b-d another one (two slightly overlapping copies of a similar six-sided shape). Right: line patterns a-d form one larger figure and line patterns b-c a smaller one (one elongated six-sided shape with a small diamond inside).

In Figure 6, something else happens when good Gestalts are perceived. When presented together, line segments a to f form a good six-sided shape with closure and symmetry (see Figure 6, left). When presented in a larger arrangement (see Figure 6, middle), the same line segments group differently, with the upper line segments a to c now being part of a larger trapezoid, and the lower line segments d to f now being part of a smaller octagon, meaning that these shapes are better Gestalts than the original six-sided shape a-f. In another example of a larger arrangement (see Figure 6, right), line segments a, b and f are now seen to belong to a large parallelogram, while line segments c, d and e are now seen to belong to an elongated quadrilateral, with its long sides joined by a pointed angle sticking out from the parallelogram. This set of examples illustrates that the same line segments can be grouped differently, depending on the context, which usually allows for different larger shapes with different levels of goodness. As a result, certain line segments can be part of a stronger whole and may no longer be available as part of a weaker whole, which lies at the heart of so-called embedded figures (Gottschaldt, 1926; Witkin, 1950).

Figure 6. Context effects demonstrating that the same line segments can group differently depending on how their context gives rise to different configurations (Gestalts) with variable levels of perceptual goodness (adapted from Wertheimer, 1923/2012): Left: Line segments a to f group into a nice six-sided shape. Middle: Line segments a to c become part of a larger trapezoid, while line segments d to f become part of a smaller octagon. Right: Line segments a, b and f become part of a larger parallelogram, while line segments c, d and e become part of an elongated quadrilateral.

Many contemporary discussions of these classic principles of perceptual grouping are limited to the above.
However, Wertheimer also discussed three additional factors, which do not fit the simplified image of Gestalt psychology one sometimes finds. First, Wertheimer sketched a series of parametric experiments, in which one of the above parameters with a clear impact on grouping (e.g., pairwise distances, relative angles between triplets of dots or short line fragments) was manipulated systematically, in equal steps, and he pointed out that these would not correspond to equal steps in the corresponding perceptual experiences. Instead, some of these possible perceptual organizations act as categorical or prototypical percepts, while others are perceived as weaker or somewhat distorted versions of these special percepts. Wertheimer referred to this effect as the tendency towards the salient ("prägnant") shape. The Prägnanz principle is a difficult one, with some deep theoretical connotations (also discussed below). Second, if one presents the parametrically different conditions as sequential trials in a single experiment rather than as a series of separate experiments, one would observe another interesting phenomenon. The change from one discrete percept to another would depend on the context of the preceding conditions: the transition points from percept A to B would be delayed if the preceding stimuli were all giving rise to percept A, and ambiguous conditions where A and B are equally strong based on the parametric stimulus differences would yield percepts that go along with the organizations that were prevailing in previous trials. Hence, in addition to isolated stimulus factors, the set of trials within which a stimulus is presented also plays a role. So, the effect of the larger spatial (simultaneous) context on the grouping of identical elements or smaller arrangements that was illustrated before (Figures 2-6) can now be generalized to temporal (sequential) context as well. This second additional factor is called the factor of set. The German word that Wertheimer used for this ("Einstellung") is better, because it makes it more immediately clear that the perceiver's perceptual state (or mind set) plays a role, in addition to the stimulus factors as such. A third extension, in line with the second, is the role of past experience, not only in the immediately preceding trials but based on a perceiver's life-long familiarity with certain configurations. This principle of past experience asserts that an arrangement of three components A B C will be organized as A-B versus C when A-B is familiar and C is familiar, but B-C is not, or when A-B versus C is familiar but A versus B-C is not. To illustrate this principle, Wertheimer used some patterns which result in familiar letters and digits when combined. However, at the same time, Wertheimer also warned that one cannot reduce all the previous principles to cases of familiarity and that familiarity does not always override other factors.
To make this point, he also showed cases where highly familiar shapes like letters become difficult to perceive when embedded in arrangements that give rise to alternative configurations which are stronger, better Gestalts based on other principles (usually good continuation and symmetry). The limited role of familiarity compared to the strength of embedding based on the principles of good continuation and good Gestalt has subsequently been investigated more systematically by Gottschaldt (1926). Yet, this third additional principle, when understood correctly, implies that past experience does play a role in perceptual organization, albeit a limited one. Wertheimer pointed out that this is just one of several factors. When different factors come into play, it is not easy to predict which of the possible organizations will have the strongest overall Gestalt qualities (the highest "goodness"). That does not mean, however, that everything is merely arbitrary and subjective. On the contrary, even the set effects described above can be induced experimentally and measured objectively. Wertheimer frequently emphasized the question of objectively regular tendencies (e.g., Wertheimer, 1923/2012, p. 333/p. 163) and he argued: "Systematic experimentation (...) shows that the configurations are sensitive in a very characteristic way: always in conformity with the strength and salience (Prägnanz) of the pertinent objective factors" (Wertheimer, 1923/2012, p. 335/p. 165). Wertheimer, therefore, presented his work more as the start of a systematic research program than as a set of definitive conclusions from a finished series of experiments.
From specific principles of perceptual grouping to a general Gestalt theory

Together with Wertheimer's 1912 paper on phi motion (i.e., a special kind of apparent motion), Wertheimer's 1923 paper on grouping principles has been the empirical foundation for a more general discussion of some deep insights, which could be considered as the key components of Gestalt theory [2]. What are these central ideas? A general theme, recurring in many of the situations described above, is that the particular organization depends on the specific arrangement of the elements (e.g., dots, line segments, groups of dots and lines): The same elements can be grouped differently, depending on the context. What determines the resulting organization is the goodness of the overall configuration or Gestalt. In Wertheimer's (1923/2012, pp /p. 177) own words: "Everything seems to point to this: We are not dealing here with a principle that makes its appeal primarily to distances and relationships between the individual pieces. Rather, it is primarily a matter of the resulting of whole forms and of articulation into sub-wholes. It works not from the bottom up, not from the individual pieces step by step to higher forms, but the other way around." Just to be clear, with "the other way around," Wertheimer means from the whole forms to the individual pieces. This is the so-called primacy of the whole, which is probably the most central tenet of Gestalt theory. Again, this is a difficult idea that needs to be unpacked. At least three different readings of the Gestalt notion of the primacy of the whole exist. First, it could be a mere description of an important general observation, namely that what seems to matter is the larger context within which the to-be-grouped elements are embedded, and more specifically, the goodness of the overall Gestalt resulting from the particular configuration. This is definitely a valid statement that can hardly be contested because it is basically theory-free, except for the use of concepts such as configuration, Gestalt and goodness, which need to be specified further. The second reading is a methodological one: To understand (or predict) how people (will) organize a particular arrangement, one needs to examine the properties of the resulting configurations.
How exactly one needs to do this in an objective, principled way, rather than just relying on one's own perceptual experience [3], is not immediately obvious. But what is clear is that the perceived organization cannot be predicted simply from the elements themselves, their relative distances and orientations, and their combinations. The resulting Gestalt is not simply an aggregation or concatenation of these primitive components. In other words, Gestalts are not and-summations or Und-Summe as Wertheimer called them. The third reading is the strongest, and theoretically most contested one: The level of Gestalts is what enables our experience. The perceived organization cannot merely be explained in terms of mechanisms operating on primitive units residing in lower levels of the system. In neural terms: Gestalt processes do not work from neural activities that code for a fixed set of attributes. In particular, if we consider perceptual processing to involve early detection of local features in small receptive fields, further processing of these features will not be able to explain the Gestalt. What the alternative holistic explanations could be has always remained a bit mysterious. In the initial years of the Gestalt revolution, it was described somewhat fuzzily as "whole-processes with their whole-properties and laws, characteristic whole-tendencies and whole-determinations of parts" (Wertheimer, 1922/1938, p. 14).

[2] In this context, it is useful to know that the work described in the 1923 paper was actually carried out in the period , when Wertheimer, Köhler and Koffka were working together in Frankfurt, and actively building their theoretical framework, in opposition to the mainstream structuralist and empiricist tradition (the so-called "Gestalt revolution"; see Wagemans, 2015a).

[3] Relying on one's own experience to know from the first-person perspective what it is like to perceive something like this is phenomenologically alright, but scientifically (from the objective, third-person perspective) this leads to a circular logic: The resulting Gestalt is better because it is the one that is perceived. Sure, but how can we predict which of the possible configurations has the highest goodness and will therefore be perceived? This is a fundamental problem for the Gestaltist research program.

Köhler (1920/1938) attempted to give this notion a physical grounding by relating it to physical Gestalt processes as described by Maxwell and Planck. He conceived of Gestalts as resulting from integrated processes in the entire optical sector, including retina, optical tract and cortical areas, as well as transverse functional connections among conducting nerve fibers (i.e., what we would describe as feedback connections or recurrent neural networks nowadays). Specifically, Köhler proposed an electrical field theory, in which the lines of flow are free to follow different paths within the homogeneous conducting system, and the place where a given line of flow will end in the central field is determined in every case by the conditions in the system as a whole (Köhler, 1920/1938, p. 50). Empirically, this theory was claimed to be refuted by famous experiments by Lashley et al. (1951) and Sperry et al. (1955), but this has been contested by Köhler (1965) himself. Theoretically, the notion of the visual system as a nonlinear self-organizing system is still very much alive today (see Wagemans, 2014; see also below). Leaving aside these descriptive, methodological and theoretical interpretations of the primacy of the whole, the central overarching principle has also been labeled the Prägnanz or simplicity principle. As an extension to the specific tendency towards salient or prägnant shapes described above, the general Prägnanz principle states that the perceptual field and objects within it will take on the simplest and most outstanding ("ausgezeichnet") structure permitted by the given conditions [4]. In a little-known study, Wohlfahrt (1925/1932) showed that the percepts elicited by small, low-contrast, blurred and peripheral stimuli were generally more regular and perceptually better balanced than the stimuli themselves, which is another source of evidence for the tendency towards better Gestalts.
For Köhler (1920/1938), the Prägnanz or simplicity or minimum principle of visual Gestalts was just another case of the universal physical law that all processes in physical systems, left to themselves, show a tendency to achieve the maximal level of stability (homogeneity, simplicity, symmetry) with the minimum expenditure of energy allowed by the prevailing conditions. A corollary of the primacy of the whole is the so-called ceteris paribus principle. When formulating the grouping principles, Wertheimer and his associates were always careful to add statements like provided that all other things are equal. So, a rule or principle can only be regarded to hold within the constraints of the given conditions, and not necessarily in other conditions when other (stronger) principles could come into play. Because a larger context could always change the conditions for possibly other higher-order Gestalts with higher goodness to emerge, one needs this ceteris paribus principle to delineate the boundary conditions regarding stimulus and perceiver within which the particular principle manifests itself. This raises the following question: How universal are the grouping principles? Implicit in Gestalt theory is the idea that the simplicity or minimum principle is indeed universal in the sense of being the overarching principle, under which all other principles can be subsumed as special cases, and in the sense of being a universal law, applying to all cases of perceptual organization for all observers. As a consequence, the strongest principles are pretty robust, not in the sense of being the same in 4 The German word Prägnanz is derived from the verb prägen to mint a coin. 
Hence, by describing the principle of Prägnanz, a connection is made to the notion of Gestalt as the appearance, the characteristic shape of a person or object, or the likeness of a depiction to the original (which was the colloquial German meaning of Gestalt before it received its more technical meaning as we know it today). For this reason, Prägnanz has often been translated as goodness. This connection was stronger in Wertheimer's term "ausgezeichnet" (distinctive, outstanding, remarkable) than in the description of Gestalts as being as regular, simple, and symmetric as possible given the conditions.

different contexts (see above) but in the sense of being stable under different viewing conditions. According to Wertheimer (1923/2012, p. 339/p. 170), it is generally unnecessary to take conditions of fixation point, eye movements, placement of attention, or distribution of attention into special account. On the other hand, it is quite striking that Wertheimer often stated that the observations he was making could be verified easily by people with a very strongly visual disposition, as if he were suggesting that the observations held only for people with good visual skills or perhaps considerable experience as observers in phenomenological experiments. If robust and universal, where do the grouping principles come from? Here, Wertheimer (1923/2012, pp /p. 167) referred to the role of experience in our particular biological environment: The nervous system has developed under the conditions of the biological environment. It is no wonder that the Gestalt tendencies developed thereby correspond with the regular conditions of the environment. In contrast to what is sometimes argued, the Gestaltists did not deny the connection between the autonomous, intrinsic organization principles and the environment, which is not to say that the simplicity principle can be reduced to familiarity (see above) or likelihood (see below). Before discussing new findings and ideas regarding the Gestalt principles, I need to make two remarks about issues that may have escaped notice. First, when describing perceptual organization, Wertheimer (1923/2012, pp /p. 177) referred to whole forms and to articulation into subwholes. This introduces the idea of hierarchical Gestalts, with integrated wholes ("Ganze") and subwholes ("Teilganze", "Unterganze"). This idea also emphasizes that a part is not a given piece of the stimulus (from which the overall organization is created) but a particular component of the organized percept.
The exact nature of the relationships between parts and wholes is a complicated one (see below), but it is clear that parts result from an organization and not the other way around. Second, it is not so clear perhaps whether the Gestalt principles are grouping principles or principles of figure-ground organization or both. The reason is that Wertheimer started off his paper with simple linear arrangements of dots, which clearly entail grouping, and ended with complicated configurations of abutting, overlapping or nested shapes, which already seem to imply figure-ground organization. For now, we could say that proximity, similarity, common fate, and good continuation are grouping principles, while symmetry, parallelism, closure, convexity and other characteristics of good forms, are principles determining figure-ground organization. The remaining ones Prägnanz, set, and experience are general principles applying to all forms of perceptual organization. Below, we will develop a somewhat richer account of different types of Gestalts. A Century of Research on the Gestalt Principles Wertheimer s (1923/2012) paper is truly foundational in providing the empirical, methodological, and conceptual basis for a research program on perceptual grouping and figure-ground organization, and in sketching the outlines of Gestalt theory to explain the phenomena discovered in this program (see also Vezzani, Marino, & Giora, 2012). In this chapter, I cannot do justice to the complete literature on perceptual organization, which is quite extensive. In 2012, we summarized a century of empirical research on perceptual grouping and figure-ground organization (Wagemans et al., 2012a), as well as theoretical discussions and progress (Wagemans et al., 2012b). Each of these papers is already longer than this chapter. 
In 2015, The Oxford Handbook of Perceptual Organization was published (Wagemans, 2015b), a volume of more than 1,000 pages, with more than 50 state-of-the-art reviews of various topics in this domain, divided into 10 sections, some more theoretical and

general, and others on specific phenomena of perceptual organization. It goes without saying, therefore, that a detailed summary of this huge literature within the space of a single chapter would be impossible. The present chapter aims for a complementary contribution, with some self-contained bold summaries of some core topics of perceptual organization and some new discussions of theoretical insights and ideas that often remain implicit in the empirical literature. My hope is that such a contribution will be helpful both for novices in the field, who want to know more about perceptual organization than what they find in introductory textbooks on sensation and perception, and for more experienced scholars, who want to gain a deeper understanding of the phenomena or look for inspiration for new lines of research. In this section, I will summarize some important empirical, methodological, and theoretical advances in our understanding of the classic principles of grouping and figure-ground organization, and some new principles discovered in the last two decades or so. In the next section, I will then try to point to some new directions for research on perceptual organization.

Advances on the classic principles of grouping and figure-ground organization

For a long time, the Gestalt principles had a bad reputation. Palmer (2003, p. 19), for instance, classified them among the best known, yet least understood, phenomena of visual perception. There are probably two main reasons for this (Spillmann, 2012). First, although the Gestalt principles are intuitively convincing (everyone can see this for themselves), they are not easily defined and even less easily quantified (see also Jäkel, Singh, Wichmann, & Herzog, 2016). Second, the neurophysiological mechanisms underlying these principles have been unknown for a very long time. Important progress has been made on both of these fronts.
In addition, the ecological basis of the principles has been substantiated by statistical analysis of natural images, which provides a nice bridge between the quantitative and neurophysiological strands of research. Measurement and quantification: 1. Direct and indirect methods Traditional research on perceptual grouping and figure-ground organization has mostly relied on experimental phenomenology as the preferred method (see Albertazzi, 2013; Koenderink, 2015a). This method consists simply of presenting observers with a stimulus (or multiple well-controlled versions of it) and asking them to report what they see (in all the corresponding conditions). It is an extension of the method of demonstration ( compelling visual proof ) that Wertheimer and his colleagues used, and it relies on shared subjectivity as an approach to try to achieve objectivity regarding visual appearance, which is essentially a first-person, subjective entity. Modern research on perceptual organization has supplemented this basic approach with a series of more objective, indirect methods that do not ask observers for direct reports of their subjective experience but give them a task in which performance and reaction time can be measured. One example of this is the Repetition Discrimination Task or RDT (Palmer & Beck, 2007), which has been used to measure the effects of grouping. For example, participants are presented with a row of circles and squares, which are alternated except for one pair in which the same element is repeated. Participants have to detect the repetition as quickly as possible and report the shape of the repeated element. Perceptual grouping within the row of elements was made possible by manipulating proximity or similarity between the elements. It turns out that participants are faster at this when the repeat occurs within 13

the same group than when it appears between two different groups. Because task performance is modulated by grouping, this task can be used to quantify grouping effects indirectly. This method has helped to corroborate the findings obtained with direct subjective report tasks and it has provided good empirical support for several new principles of grouping (see below). Other performance-based methods that are used in the context of perceptual organization are matching, priming, cueing, primed matching, search, speeded classification, and so forth. Clever applications of these methods can be found in the research programs by Pomerantz and colleagues on emergent features and the configural superiority effect, using the odd-quadrant task (a particular kind of search task, see below), Garner interference, and Stroop interference (for a recent review, see Pomerantz & Cragin, 2015), and by Kimchi and colleagues on global precedence, hierarchical structure, and holistic properties, using priming, visual search, primed matching, and speeded classification (for a recent review, see Kimchi, 2015).

Measurement and quantification: 2. Grouping by proximity and similarity

Although these indirect methods have important advantages, this does not mean that one cannot achieve quantification when starting from direct reports. This can be illustrated clearly with research on grouping by proximity, and its interaction with similarity. Wertheimer (1923/2012) convincingly demonstrated the role of proximity in grouping but he did not provide a quantitative account of its influence. In an early attempt to do so, Oyama (1961) used simple, rectangular 4 × 4 dot lattices in which the distance was constant along one dimension but varied (across trials) along the other dimension. During an observation period of 2 min, participants continuously reported whether they saw the lattice organized in rows or columns at any given time (by holding down one of two buttons).
As the distance in one dimension changed relative to the other one, proximity grouping quickly favored the dimension with the shorter inter-dot distance, according to a power function. Essentially, when inter-dot distances along the two dimensions are similar, a small change in inter-dot distance along one dimension can strongly shift perceived grouping. However, the effect of such a change in inter-dot distance falls off as the initial difference in inter-dot distance along the two dimensions grows larger. This relationship, however, only captures the relative contributions of two of the many possible organizations within the lattice, and the square and rectangular lattices used by Oyama (1961) are only a subset of the space of all possible 2-D lattices. Hence, the particular power-law relationship may not generalize beyond these cases. For this reason, Kubovy and Wagemans (1995) and Kubovy, Holcombe, and Wagemans (1998) generated a set of stimuli that spanned a large space of dot lattices by varying two basic features: the relative length of their shortest inter-dot distances, vectors a and b (i.e., |b|/|a|), and the angle γ between them. They then presented these stimuli to participants for 300 ms and asked them to indicate the perceived orientation. The frequencies of the perceived orientations over a large number of trials could then be used as estimates of the probabilities of the different perceptual organizations, and the relative frequencies could then be plotted as a function of relative distance. Remarkably, all the values of the log-odds fell on the same line, called the attraction function, with the slope being a person-dependent measure of sensitivity to proximity. The fact that the attraction function in log-space is linear means that the (relative) strength of grouping decays as an exponential function of (relative) distance.
Moreover, the fact that all data points (obtained with all pairs of distances and all relative orientations) could be fitted well by a single straight line indicates that grouping by proximity depends only on the relative distance between dots in competing organizations, not on the overall configuration in which the competition

occurs (i.e., the lattice type, each with its own symmetry properties). Hence, in the case of grouping by proximity, the whole (i.e., the perceived orientation) is not more than the sum of the parts (i.e., the relative distances). For this reason, this quantitative relationship was called the Pure Distance law. Once it has been established how grouping varies as a function of relative distance, it is possible to investigate what happens when grouping by proximity and grouping by similarity are concurrently applied to the same pattern. Are these two principles combined additively or not? Kubovy and van den Berg (2008) presented participants with rectangular lattices of dots of different contrasts. Dots with the same contrast were either arranged along the shorter axis of each rectangle of dots within the lattice (similarity and proximity in concert) or arranged along the longer axis (similarity and proximity pitted against each other). Dot lattices varied across two dimensions: the ratio between the short and long axis of each rectangle of dots within the lattice and the contrast difference between the different arrays of dots. As in the previous studies, each lattice was presented for 300 ms, and participants were asked to indicate which of the four orientations best matched the perceived arrangement of the dots in the lattice. Remarkably, the conjoined effects of proximity and similarity turned out to be additive, which is another case of the whole (i.e., the combined grouping strength) not being more than the sum of the parts (i.e., the two separate grouping principles). Using lattices in which dots were replaced by Gabor elements, Claessens and Wagemans (2005) came to similar conclusions regarding proximity and collinearity.
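The linear attraction function and the additive combination of proximity and similarity described above can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the cited studies: the competition is reduced to two organizations (a and b), the line is assumed to pass through zero log-odds when both distances are equal, additivity is assumed to hold on the log-odds scale, and the names `sensitivity` and `similarity_shift` are hypothetical.

```python
import math


def log_odds(relative_distance, sensitivity, similarity_shift=0.0):
    """Log-odds of grouping along b rather than a.

    relative_distance: |b| / |a|, with |a| the shortest inter-dot distance.
    sensitivity: hypothetical positive slope of the attraction function.
    similarity_shift: hypothetical additive term for a similarity cue
        (positive if similarity favors b, negative if it favors a).
    """
    # Linear in relative distance; zero when both distances are equal.
    proximity_term = -sensitivity * (relative_distance - 1.0)
    # Additivity assumption: the similarity cue only shifts the log-odds.
    return proximity_term + similarity_shift


def probability_b(relative_distance, sensitivity, similarity_shift=0.0):
    """Probability of organization b in a two-alternative competition."""
    value = log_odds(relative_distance, sensitivity, similarity_shift)
    # Convert log-odds to a probability with the logistic function.
    return 1.0 / (1.0 + math.exp(-value))
```

In this sketch only the ratio |b|/|a| enters the computation, so the lattice type plays no role, and a similarity cue moves the whole line up or down without changing its slope.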
Other research using similar direct-report methods has allowed to quantify grouping by proximity in patterns with zigzagged (Claessens & Wagemans, 2008) and curved parallel lines (Strother & Kubovy, 2006, 2012), with some striking findings (power law instead of exponential law and preference for the most strongly curved lines instead of the least strongly curved lines, respectively). Likewise, such methods have allowed to measure the strength of spatial and temporal grouping in moving dot lattices (Gepshtein & Kubovy, 2000), leading to an important generalization of the previously obtained contradictory findings of space-time tradeoff versus coupling (Gepshtein & Kubovy, 2007; Gepshtein, Tyukin, & Kubovy, 2007; for further discussion of the computational principles and the deeper theoretical implications of this work, see Gepshtein, 2010 and Jurica, Gepshtein, Tyukin, & van Leeuwen, 2013). Measurement and quantification: 3. From natural image statistics to computational models Studies such as those reviewed above, in which grouping factors are isolated to quantify their strength, are useful but understanding their role in everyday perception requires a different approach. An important task of natural vision is to identify and group together the portions of the 2- D retinal image that project from an object. In the simple case in which the object boundary projects onto a single closed curve, the problem reduces to a problem of contour grouping or contour integration. Fifty years of computer vision research, however, has shown that this is a computationally difficult problem because of occlusion, clutter, and other sources of image degradation, which imply that for any given contour fragment multiple other fragments could be the correct continuation of the contour (Elder, Krupnik, & Johnston, 2003). 
Although computer vision algorithms for image grouping are much worse than what human perceivers seem to be capable of, computational work on the statistics of natural images has been important to provide quantitative support for the ecological basis of the grouping principles. 15

For instance, the principle of proximity states that the strength of grouping between two elements decreases as these elements are separated further, but there has been some debate on the exact shape of the function relating grouping strength to distance. As reviewed above, Oyama (1961) found that this relationship could be described as a power law, whereas Kubovy and Wagemans (1995) employed an exponential model. However, Kubovy et al. (1998) also noted that a power law model could fit their data equally well and found that proximity grouping was approximately scale-invariant: Scaling all distances by the same factor did not affect results. Since the power law is the only perfectly scale-invariant distribution, this result provides further support to the power-law model of proximity, which has been used in subsequent studies (e.g., Claessens & Wagemans, 2008). Perceptual scale invariance is a reasonable choice if the proximity of elements along real contours in natural images is also scale invariant, that is, if the ecological distribution follows a power law. In support of this idea, Sigman, Cecchi, Gilbert, and Magnasco (2001) reported that the spatial correlation in the response of collinearly oriented filters to natural images indeed follows a power law. Elder and Goldberg (2002) asked human observers to label the sequence of elements forming the contours of natural images, with the aid of an interactive image editing tool, which allowed them to restrict the measurements to successive elements along the same contour. In contrast to earlier estimates by Sigman et al. (who did not apply this restriction), this method yielded a clear power law with exponent α = 2.92, very close to the estimate of the perceptual power law exponent α = 2.89 in Oyama's experiment. Thus, we have a strong indication that the human perceptual system is optimally tuned to the ecological statistics of proximity cues in natural scenes.
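The scale-invariance argument above can be made explicit with a short worked comparison. This is a sketch under stated assumptions: the symbols f, g, c, k, and λ are introduced here for illustration and do not come from the studies cited, and the exponential is assumed to be expressed in absolute (not relative) distance.

```latex
% Power law: grouping strength as a function of inter-element distance d
f(d) = c\, d^{-\alpha}, \qquad
\frac{f(k d_1)}{f(k d_2)}
  = \frac{c\,(k d_1)^{-\alpha}}{c\,(k d_2)^{-\alpha}}
  = \left(\frac{d_1}{d_2}\right)^{-\alpha}

% Exponential in absolute distance, with length constant \lambda
g(d) = c\, e^{-d/\lambda}, \qquad
\frac{g(k d_1)}{g(k d_2)} = e^{-k\,(d_1 - d_2)/\lambda}
```

Under the power law, the relative strength of two competing groupings depends only on the ratio of the two distances, so rescaling all distances by the factor k leaves it unchanged. Under an exponential in absolute distance, k remains in the exponent, so the relative strength would change with the overall scale of the pattern.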
Ecological data on good continuation have also emerged over the last decades. Kruger (1998) and Sigman et al. (2001) found evidence for collinearity, co-circularity and parallelism in the statistics of natural images. Geisler, Perry, Super, and Gallogly (2001) obtained similar results using both labeled and unlabeled natural image data, in fairly close correspondence with the tuning of human perception to the good continuation cue. Geisler et al. (2001) treated contours as unordered sets of oriented elements, measuring the statistics for pairs of contour elements on a common object boundary, regardless of whether these element pairs were close together or far apart on the object contour. In contrast, Elder and Goldberg (2002) modeled contours as ordered sequences of oriented elements, restricting measurements to adjacent pairs of oriented elements along the contours. The likelihood ratios for two oriented elements to be neighboring elements on the same object boundary are much larger for the sequential statistics, reflecting a stronger statistical association between neighboring contour elements. Elder and Goldberg also explored the ecological statistics of similarity in edge grouping, coding similarity in terms of the difference in brightness and in contrast between the edges, and found that the brightness cue carries useful information for grouping but the contrast cue is relatively weak. In sum, these studies have provided strong evidence for the ecological foundation of isolated grouping principles in natural images. However, an additional advantage in natural scenes is that disparate weak cues can often combine synergistically to yield strong evidence for a particular grouping. To explore this issue, Geisler et al. 
(2001) used a nonparametric statistical approach, jointly modeling the ecological statistics of proximity and good continuation cues as a 3-D histogram, to show that human observers combine these two classic Gestalt principles in a roughly optimal way. Elder and Goldberg (2002) demonstrated that the ecological statistics of proximity, good continuation, and similarity are roughly uncorrelated, so that the likelihood of a particular combined grouping can be computed as the product of the likelihoods of each separate grouping factor. Elder

and Goldberg's approach also allowed quantification of the statistical power of each Gestalt cue, as the reduction in the entropy of the grouping decision based on each individual cue. They found that proximity was by far the most powerful, reducing the entropy by roughly 75%, whereas good continuation and similarity reduced entropy by roughly 10% each. The most accurate grouping decisions could therefore be made by combining all of the cues optimally according to the probabilistic model, trained on the ecological statistics of natural images. Such a statistically optimal combination of grouping cues has also received some psychophysical support (Claessens & Wagemans, 2008). Although the Gestalt principles of grouping were largely based on the analysis of figures in the 2-D image plane, more recent work derives these principles from the geometric laws of 3-D projection, within the theoretical framework of minimal viewpoint invariants (Jacobs, 2003; Lowe, 1985). Briefly, the theory is based upon the assumption that the observer takes a general viewpoint position with respect to scenes. This assumption implies that certain so-called non-accidental properties in the 2-D proximal stimulus are most likely properties of the 3-D distal stimulus as well. Examples of such non-accidental properties are proximity, good continuation, closure, convexity, parallelism, and symmetry (Lowe, 1985). This notion is closely related to Rock's (1983) ideas about the visual system's tendency to avoid interpretations of arrangements as coincidental (the so-called coincidence avoidance principle). It provides the statistical foundations for Minimal Model Theory of perceptual organization and objectness (see Feldman, 2003a, 2003b). The principle has also been applied in computer vision algorithms for perceptual grouping based on the so-called a contrario approach (e.g., Lezama et al., 2016).
Last but not least, the principle has bridged the Gestalt grouping principles and object representation and recognition (Biederman, 1987).

Measurement and quantification: 4. Structural Information Theory (SIT)

A very substantial quantitative contribution to perceptual organization is made by Structural Information Theory or SIT, which is considered to be the best-defined and most successful extension of Gestalt ideas (Palmer, 1999, p. 406). SIT provides a quantitative approach to one of the core principles of Gestalt thinking: the simplicity principle or law of Prägnanz, conjectured to underlie all specific laws of perceptual organization (see above). From the very beginning, the simplicity (or minimum) principle (the idea that the visual system organizes its visual world to be as simple as possible based on the available information) was formulated in opposition to Helmholtz's likelihood principle (the idea that the visual system interprets the incoming proximal stimuli in terms of their most likely distal source). In order to be able to derive quantitative predictions from the simplicity principle, one needs a formal theory to compute the complexity (or cost) of alternative perceptual descriptions. Based on ideas from information theory, which became popular in psychology in the 1950s and 1960s, Leeuwenberg (1969, 1971) proposed SIT as a coding model to describe visual (and other) patterns as sequences of symbols, which can then be reduced to their simplest descriptive codes, that is, codes which specify stimulus organizations by capturing a maximum of regularity. The idea is then, simply, that the visual system selects the organization with the shortest expression in the coding language that describes the possible organizations of the stimulus (i.e., the shortest code or the smallest information load in the terminology of SIT).
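The idea of selecting the description with the smallest load can be illustrated with a toy sketch. It is not SIT's formal coding language: the three candidate rules (a literal listing, a repeated chunk, a mirror symmetry) and the cost assigned to each are hypothetical simplifications chosen only to show how competing descriptions of the same symbol string can be compared by length.

```python
def candidate_codes(pattern):
    """Return a dict mapping each applicable toy rule to its load."""
    n = len(pattern)
    # Literal description: one unit per symbol.
    codes = {"literal": n}

    # Iteration: the pattern is a shorter chunk repeated several times.
    # Toy load = symbols in the chunk + 1 unit for the repeat count.
    for size in range(1, n // 2 + 1):
        if n % size == 0 and pattern[:size] * (n // size) == pattern:
            codes["iteration"] = size + 1
            break  # the shortest repeating chunk gives the lowest load

    # Symmetry: the pattern reads the same in both directions.
    # Toy load = symbols in the first half (incl. pivot) + 1 unit for the rule.
    if n > 1 and pattern == pattern[::-1]:
        codes["symmetry"] = (n + 1) // 2 + 1

    return codes


def simplest_code(pattern):
    """Pick the rule with the smallest toy load (ties favor the literal code)."""
    codes = candidate_codes(pattern)
    rule = min(codes, key=codes.get)
    return rule, codes[rule]
```

For the string `"ababab"`, for example, the iteration rule needs only the chunk `"ab"` plus a repeat count, which is shorter than listing all six symbols, so the sketch selects the iteration description.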
So, the problem with the specification of the goodness of different patterns or shapes discussed above could now be solved by considering the length of the description of the patterns or shapes. After several decades of development, refinement, and expansion, two recent books have provided the current state of the

art regarding this theoretical approach. While the first book (Leeuwenberg & van der Helm, 2013) is focused on its application to form, the second one (van der Helm, 2014) is much broader, discussing the role of simplicity in vision at a much deeper theoretical level and expanding the realm of SIT as a multidisciplinary account of perceptual organization. The first book stays relatively close to Leeuwenberg's work. The first half provides the building blocks towards a theory of visual form, discussing its constraints and attributes, models and principles, assumptions and foundations. It also discusses the role of process versus representation in perception theories, one of the central issues of debate surrounding SIT. Specifically, when SIT proposes that the perceived organization is the simplest one, it refers to the preferred pattern representation amongst all possible pattern representations. Hence, it employs a representation criterion, not a process criterion. It is theoretically silent (i.e., agnostic) about the underlying processes. Opponents have always argued that this reduces it to a mere methodological tool to generate and test predictions, excluding it from the realm of perceptual theories. Leeuwenberg and van der Helm defend their position by arguing for the primacy of representations. In their view, objects are the output of perception, not the input: "The goal of perception is not to establish properties of given objects but to establish objects from properties of the given retinal image" (Leeuwenberg & van der Helm, 2013, p. 1). With this view, they position themselves in the phenomenological tradition that characterizes the Gestalt approach (e.g., Albertazzi, 2015), which has been suppressed almost completely in contemporary work on perceptual organization within mainstream vision science (dominated by the computational tradition à la Marr) and visual neuroscience (dominated by the linear systems approach).
One can regret that Leeuwenberg has never made a serious attempt to relate his ideas to the mainstream approach, because the impact of SIT could have been much stronger if he had, but the philosophical position he takes is one that can be defended. In addition, from a researcher s point of view, one can also justify this choice as a valid methodological stance: It is better to start perception research from what is most accessible (perceptual interpretations, phenomenal experiences) rather than from raw sensations (unstructured patches of stimulation at the retina, which are probably fundamentally inaccessible to conscious experience) or from perceptual processes (which are too rapid and effortless to leave traces in experience; see also Gilchrist, 2015, for further discussion). The second half of the book discusses the formal coding model and how it can be turned into a perceptual coding manual for line drawings, surfaces, and objects. It then discusses specific applications to explain (describe, predict) preference effects (occlusion, transparency, rivalry), time effects (perceived temporal order, perceived simultaneity), and hierarchy effects (superstructure dominance, mental rotation, reference frame effects). This part clearly illustrates how SIT works in practice and discusses a large number of empirical studies in which SIT has been applied successfully (e.g., form, perceptual ambiguity, amodal completion, neon color spreading, serial pattern production and completion, priming and masking of part-whole relationships, symmetry detection). Hence, this part also provides the necessary evidence for SIT s empirical scope, predictive success, and explanatory value. 
In the second book, van der Helm (2014) discusses the theoretical controversy between the simplicity and likelihood principles, and proposes a new kind of synthesis, not in the form of a unification but in the form of a juxtaposition between maximizing certainty and minimizing information (formulating one in terms of the other). This part deepens the earlier debate between Feldman (2009, 2015) and van der Helm (2011, 2015a). In another part, van der Helm provides a theoretical foundation for the coding rules applied by SIT. By relating them to the nature of visual

regularities, the so-called ISA-coding rules (Iteration, Symmetry, and two kinds of Alternation) become instantiations of the fundamental principles of holographic regularity and transparent hierarchy. This theory is then applied to symmetry perception, discussing how the holographic approach is able to explain why some visual regularities are more salient (easier to detect, more robust to noise) than others (see also van der Helm, 2015b). In the final part, the author also presents a process model of perceptual organization, which computes the simplest hierarchical organizations of strings (as prescribed by the theory of holographic regularity) with an actual algorithm that allows for transparallel processing (see footnote 5) by hyper-strings, namely, the coding algorithm PISA, for Parameter load plus ISA-rules (see also van der Helm, 2004). Van der Helm then explains how this algorithm can be regarded as a neurally plausible combination of feedforward feature extraction, horizontal feature binding, and recurrent feature selection, and speculates that neuronal synchronization (see below) can be regarded as a manifestation of transparallel processing. In sum, these two books summarize an extensive literature with empirical support on the role of simplicity in pattern and form perception, and they provide a coherent and deep underpinning of the role of simplicity in vision more generally. Future researchers interested in strengthening the quantitative approach to principles of perceptual organization will undoubtedly gain much inspiration from these works.

Neurophysiological mechanisms: 1. Contextual modulations of single-cell responses

The literature on the neurophysiological mechanisms of perceptual organization arose somewhat later than that on the quantification of the organization principles, but it is now clearly more extensive.
Such an expansion (if not explosion) of perceptual organization studies from a neural perspective is somewhat surprising, if one considers the predominantly elementarist, even reductionist view which has long dominated visual neuroscience after the demise of electrical field theory by Lashley and Sperry (see above) and the rise in popularity of single-unit recording after Hubel and Wiesel's Nobel Prize-winning work. Specifically, Hubel and Wiesel (1959) showed that single neurons in the Lateral Geniculate Nucleus (LGN) and in the striate cortex (Brodmann area 17, also called primary visual cortex or V1) of the cat were tuned to simple stimulus attributes (e.g., orientation, motion direction) and they were therefore interpreted as feature detectors (e.g., blob, line, edge detectors). These results led to what Barlow (1972) called the single neuron doctrine of perceptual psychology and visual neuroscience, consisting of two basic principles (if not dogmas): (1) our perceptions are caused by the activity of a small number of neurons and (2) the activity of a single neuron can be related to our subjective experience in a straightforward way. Based on this research program, which was subsequently carried out mainly in macaque monkey (including areas beyond V1, labelled V2, V3, V4, etc.), the predominant view of the visual system became one of functional specialization and hierarchical organization, with successive stages of processing with gradually decreasing retinotopy, increasing receptive field size, increasing selectivity of neurons, and increasing complexity of the features, across the sequence of cortical areas in the ventral stream (e.g., Maunsell & Newsome, 1987). In the 1990s, this view was further corroborated by functional imaging studies in humans, leading up to the notion of specialized modules and cortical maps (e.g., Grill-Spector & Malach, 2004; Op de Beeck, Haushofer, & Kanwisher, 2008). All of this empirical work led to what I would call the mainstream view of visual neuroscience (see further discussion below), also shared by computational modelers proposing a predominantly feedforward architecture with successive feature extraction stages (e.g., Kubilius, Wagemans, & Op de Beeck, 2014; Riesenhuber & Poggio, 2002).

Footnote 5: This is a form of processing in which multiple items are processed simultaneously by one processor. One can think of a hand holding a bundle of pencils with the points on a table top to verify whether all of them are equal.

In spite of the overwhelming dominance of this standard view, several single-cell studies in the 1980s and 1990s obtained results which seemed to be more compatible with the more global and interactive view of the Gestaltists. I briefly discuss three of these studies as somewhat canonical examples of this emerging new trend. For a more extensive review of this literature, see Albright and Stoner (2002). First, Allman, Miezin, and McGuinness (1985) recorded from the middle temporal (MT) area in the owl monkey, an area known to be selective for motion. When they stimulated the classical receptive field (crf) with motion of a random dot pattern in the preferred direction, they obtained the normal inverted U-shaped function for response strength plotted against motion direction (centered on the optimal direction). When they simultaneously stimulated a large surround region with a neighboring random dot pattern, the responses from these target cells in MT strongly depended on the surround stimulation, far beyond their crf. When the random dots in the surround moved in directions opposite to that of the crf, responses were enhanced by more than 50% and when they were in the same direction, responses were reduced by more than 50%. Apparently, the strength of the figure-ground segregation affected the neuron's response profile, indicating that a neuron is not just signaling an absolute (elemental) stimulus property within its receptive field, but rather a relational (Gestalt) property.
A single neuron's activity is thus strongly influenced by its neighbors, raising doubts about the very notion of a receptive field (since it appears to stretch far beyond the area in the visual field from which the neuron can normally be stimulated with isolated stimuli). Second, in a seminal paper, von der Heydt, Peterhans, and Baumgartner (1984) showed that cells in area V2 in the visual cortex of the macaque monkey responded not only to physical line, bar or edge stimuli (as shown by Hubel and Wiesel) but also to so-called illusory (or subjective) contours. When two line segments were arranged in such a way that they were perceived as one single continuous line (with the gap between them being filled-in perceptually), the firing of typical V2 cells was almost half as strong as with an actual physical line, even though their receptive field was not actually stimulated by any physical discontinuity (no actual line or edge present). When the line segments were slightly repositioned, such that an illusory contour was not perceived, the firing rates dropped to the baseline level of spontaneous activity. A similar finding was obtained with an illusory edge of a white rectangle. When a vertical illusory contour was created by two abutting patterns consisting of horizontal line segments, the response strength even doubled at the target (vertical) orientation compared to an actual line, and the response function degraded smoothly away from vertical, as with normal orientation tuning functions. Although such a correspondence between a perceptual attribute and a cell's tuning properties may nowadays feel quite natural, it was found striking at the time that this tuning concerned a purely phenomenal, purely subjective characteristic, which arises from the processing of the configuration of the whole stimulus array, and not just a small, local, physical stimulus attribute.

Third, the influence of context on responses of V1 neurons was convincingly demonstrated in a pioneering study by Lamme (1995), using a texture-segmentation task. In the original version of this paradigm, a macaque monkey was required to fixate a central fixation dot, when a full-screen texture composed of thousands of oriented line segments was presented. In the main conditions, the texture also contained a small square region made out of line segments of the orthogonal orientation (Figure 7, left), which is perceived as a figure segregated from the background. The monkey's task was to make an eye-movement towards the figure after the presentation of a go-cue (a task the monkey had learned to do quite well, >90% correct). With this paradigm it is possible to vary the position of the figure relative to the receptive fields of the neurons, while keeping their bottom-up activation constant. If the receptive field (RF) was on the figure boundary, the neurons responded much more strongly than when the RF was on the background, and this difference emerged around ms after stimulus onset (Figure 7, middle panel on the right). This difference, referred to as figure-ground modulation, occurs later than the difference in activation levels between optimal and nonoptimal texture orientations, which occurs around ms (Figure 7, upper panel on the right). If the RF was inside the figure, the neurons also responded more strongly than when the RF was on the background, but this difference emerged only around ms (Figure 7, lower panel on the right). A follow-up study showed that figures defined by other cues (color, motion, luminance, depth) produced similar patterns of results in V1 (Zipser, Lamme, & Schiller, 1996). This line of work has inspired additional empirical and theoretical work on the neural mechanisms involved in figure-ground segregation (discussed in a later section below).

Figure 7. Neural activity in V1 cells as a function of optimal versus non-optimal orientations, and figure border and figure center versus background in texture displays (adapted from Lamme, 1995).

Neurophysiological mechanisms: 2. Association fields as a mechanism for contour integration

As discussed in an earlier section above, when a neuron is excited by a stimulus presented inside its crf, stimulation of the surrounds of the crf can modulate the response of this neuron. This neural manifestation of contextual influences, which is called the extra-classical receptive field effect, has repeatedly been observed in V1. It seems to depend strongly on the contrast of the stimulus presented inside the crf (e.g., Levitt & Lund, 1997; Polat, Mizobe, Pettet, Kasamatsu, & Norcia, 1998). When the center stimulus has high contrast, the presence of the surround stimulus leads to a suppression of the spiking responses to the stimulus in the crf. The contextual suppression is generally highest when the center and surround stimuli have similar properties such as orientation, spatial frequency, and speed (e.g., Sillito, Grieve, Jones, Cudeiro, & Davis, 1995). However, when the surround stimuli are orthogonal to the high-contrast center stimulus, surround facilitation has been reported (e.g., Levitt & Lund, 1997; Sillito et al., 1995). Moreover, responses to a low-contrast stimulus inside the crf are also facilitated by a surrounding stimulus. Here, maximal facilitation has been found when the center and surround stimuli (e.g., bars or Gabor patches) are collinear, but this excitatory effect decreases with increasing distance between the center and surround stimuli (e.g., Kapadia, Ito, Gilbert, & Westheimer, 1995; W. Li, Piëch, & Gilbert, 2006; Polat & Sagi, 1993). Field, Hayes, and Hess (1993) have proposed that these collinear facilitation effects can be characterized by an association field mechanism: a network of interactions between neighboring oriented line segments depending on their relative orientations and spatial positions.
In such a network, excitatory interactions are strengthened with decreasing distance, curvature and deviation from co-circularity between the line segments. The shape of the association field has been found to closely resemble the co-occurrence statistics of edges in natural images (Elder & Goldberg, 2002; Geisler et al., 2001), providing support for the idea that the association field function is a mechanism evolved to optimize the interaction with the natural environment (see above). A network of long-range excitatory V1 connections described by an association field has been proposed to form the neural basis of contour integration, although it is likely that these connections are supported by top-down feedback from higher-level areas involved in shape detection, perceptual learning and attention (W. Li, Piëch, & Gilbert, 2008). Contour integration is the process by which elongated contours consisting of co-aligned elements are extracted from cluttered images, which is a crucial step for further scene segmentation and object recognition. Psychophysical experiments on contour integration have typically used a stimulus in which a contour consisting of collinear elements (often Gabor patches) is embedded in a background of randomly oriented but otherwise similar elements. Local density cues in the image are absent so that observers can only detect the contour based on the orientation relationships between the elements. Observers' performance in detecting such a contour is highly dependent on the average curvature or path angle of the contour, which is the average change in orientation between adjacent contour elements. In particular, observers' ability to detect a contour is best when the contour forms a straight line and decreases as the degree of curvature of the contour increases (Field et al., 1993; Hess & Dakin, 1997; Watt, Ledgeway, & Dakin, 2008).
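The qualitative dependence of association strength on distance, orientation change and deviation from co-circularity described above can be made concrete with a small sketch. It is only an illustration under stated assumptions and not the model of Field et al. (1993) or of any other study cited here: the function names, the exponential and Gaussian fall-offs, and all parameter values are hypothetical choices.

```python
import math

def wrap(angle):
    """Wrap an orientation (defined modulo pi) to the interval [-pi/2, pi/2)."""
    return (angle + math.pi / 2) % math.pi - math.pi / 2

def association_strength(p1, theta1, p2, theta2,
                         sigma_dist=2.0, sigma_turn=0.5, sigma_cocirc=0.3):
    """Hypothetical link strength between two oriented elements.

    p1, p2 are (x, y) positions; theta1, theta2 are orientations in radians.
    The three fall-off constants are illustrative assumptions.
    """
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    dist = math.hypot(dx, dy)
    phi = math.atan2(dy, dx)                 # direction of the joining line
    a1 = wrap(theta1 - phi)                  # element 1 relative to that line
    a2 = wrap(theta2 - phi)                  # element 2 relative to that line
    turn = abs(wrap(theta2 - theta1))        # orientation change (path angle)
    cocirc_dev = abs(wrap(a1 + a2))          # zero when tangent to one circle
    return (math.exp(-dist / sigma_dist)
            * math.exp(-turn ** 2 / (2 * sigma_turn ** 2))
            * math.exp(-cocirc_dev ** 2 / (2 * sigma_cocirc ** 2)))

def support(elements):
    """Sum of link strengths each element receives from all other elements."""
    totals = []
    for i, (p, th) in enumerate(elements):
        totals.append(sum(association_strength(p, th, q, tq)
                          for j, (q, tq) in enumerate(elements) if j != i))
    return totals

# Four collinear horizontal elements plus one orthogonal distractor.
contour = [((float(x), 0.0), 0.0) for x in range(4)]
distractor = [((1.5, 1.0), math.pi / 2)]
scores = support(contour + distractor)
print([round(s, 3) for s in scores])
```

In this toy display the collinear elements support each other strongly, whereas the orthogonal distractor receives almost no support, which is the kind of difference an association field account relies on to separate contour from background.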
Several computational models have implemented an association field concept to explain how the visual system extracts collinear contours from images (Ernst et al., 2012; Z. Li, 1998; Yen & Finkel, 1998). Typically, an association field model computes an association strength value for each oriented element in the image which determines how likely it is that the element belongs to a contour (Watt et al., 2008). This has been found to explain the online dynamical properties of the eye-movement behavior during difficult snake detection tasks (Van Humbeeck et al., 2013). In sum, although association field models are definitely not the only possible models to capture the essence of this psychophysical and neurophysiological literature on contour integration, they are quite appealing because they allow for an intuitive blend with the co-occurrence statistics of edges in natural images. As a result, this has become a flourishing area of research (for a recent review, see Hess, May, & Dumoulin, 2015).

Neurophysiological mechanisms: 3. Temporally correlated neural activity as a general mechanism of grouping

In neural terms, the problem of perceptual grouping can be defined as the problem of identifying the neurons responding to the elements in the visual field that are grouped together (e.g., the features of a particular object) and, as the other side of the same coin, segregating them from the neurons responding to other elements in the visual field (e.g., the features of other objects or the background). This problem, which is known as the binding problem in the cognitive neuroscience literature, is a very general one and the proposed solutions are therefore also very general. One general solution is binding by convergence, basically the implementation of units that receive converging inputs from cells whose responses require integration. The anatomy of the visual cortical hierarchy (e.g., increasing receptive field size, increasing selectivity and tuning to larger features, parts, shapes or objects) seems to be in line with that. However, this solution suffers from essential limitations such as the combinatorial problem and the inherently limited flexibility.
To address these problems, an alternative general mechanism has been proposed, namely, to represent grouped features and objects through the joint activity of neuronal populations or cell assemblies (an idea that goes back to Hebb, 1949), and in particular by temporal binding, basically the selection of neural responses from a distributed population of neurons by the synchronization of their activity (an idea proposed by Milner, 1974 and by von der Malsburg, 1981). Indeed, supposedly binding-related neural oscillations have been observed in the range of Hz, within as well as between local brain regions (Eckhorn et al., 1988; Gray, König, Engel, & Singer, 1989). Temporally correlated activity (incl. synchronization and oscillations) has been observed on all spatial and temporal scales in the mammalian brain (for reviews, see Buzsáki & Draguhn, 2004; Singer & Gray, 1995). As discussed further below, temporal correlation has become an important component in general theories of neuronal communication and cortical dynamics (Fries, 2005; Salinas & Sejnowski, 2001; Siegel, Donner, & Engel, 2012; van Leeuwen, 2015b; Varela, Lachaux, Rodriguez, & Martinerie, 2001). However, whether temporal correlation really solves the binding problem has remained controversial (e.g., Shadlen & Movshon, 1999).

Neurophysiological mechanisms: 4. Figure-ground segregation

In a recent review chapter, Self and Roelfsema (2015) summarized the evidence from studies that support the role of two key processes, boundary detection and region growing, in figure-ground segregation, and outlined a neural theory of figure-ground segregation through an interplay of feedforward, horizontal and feedback connections within the visual system. The empirical line of work started with the findings obtained by Lamme (1995), summarized above. In line with psychophysical studies (e.g., Mumford, Kosslyn, Hillger, & Herrnstein, 1987; Wolfson & Landy, 1998), this evidence suggests that there are two complementary mechanisms at work in figure-ground segregation, each with their own connection schemes and timing. The first is boundary detection, the enhancement of the borders of the figure, which is achieved through a mixture of center-surround interactions mediated by feedforward anatomical connections and mutual inhibition between neurons tuned for similar features mediated by horizontal connections. In theory, orientation-defined texture boundaries could be detected by orientation-opponent cells driven by one orientation in their center and the orthogonal orientation in their surround, but such cells have not been found yet. Therefore, it has been proposed that these edges are detected through mutual inhibition between neurons tuned for the same orientation. In such an iso-orientation inhibition scheme, the activity of neurons that code image regions with a homogeneous orientation is suppressed, and the amount of inhibition is smaller for neurons with RFs near a boundary, resulting in a higher firing rate. There is a good deal of evidence that iso-orientation suppression exists in visual cortex. For instance, cells in V1 that are well-driven by a line element of their preferred orientation are suppressed by placing line elements with a similar orientation in the nearby surround (Knierim & Van Essen, 1992). By themselves, these surrounding elements do not drive the cell to fire, and thus they are outside the crf of the V1 cells, but they strongly suppress the response of the cell to the center element. Importantly, this suppression is greatly reduced if the line elements outside the RF are rotated so that they are orthogonal to the preferred orientation of the cell (Sillito et al., 1995). This result supports the idea that V1 neurons receive an orientation-tuned form of suppression coming from regions surrounding the RF, and additional results support the rapid time-course of this suppression.
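The iso-orientation inhibition scheme described above can be illustrated with a toy one-dimensional texture. The following is a minimal sketch under simplifying assumptions, not an implementation of any cited model: the function name, the neighborhood radius, and the inhibition weight are hypothetical, and each unit is simply inhibited in proportion to the fraction of nearby elements that share its orientation.

```python
def iso_orientation_response(orientations, radius=3, weight=0.8):
    """Toy boundary detection by iso-orientation inhibition.

    orientations: list of orientation labels, one per texture position.
    Each unit starts from a unit feedforward drive and is suppressed in
    proportion to the fraction of neighbors (within `radius`) that have the
    same orientation. `radius` and `weight` are illustrative assumptions.
    """
    n = len(orientations)
    responses = []
    for i, own in enumerate(orientations):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        neighbors = [orientations[j] for j in range(lo, hi) if j != i]
        same = sum(1 for other in neighbors if other == own)
        inhibition = weight * same / len(neighbors)
        responses.append(1.0 - inhibition)
    return responses

# Background (orientation 0) with an embedded figure (orientation 1).
texture = [0] * 10 + [1] * 8 + [0] * 10
resp = iso_orientation_response(texture)
print([round(r, 2) for r in resp])
```

Units next to the texture boundary receive less inhibition and therefore respond more strongly, whereas units in the figure center respond no more strongly than units on the background. In this sketch, boundary detection alone therefore leaves the interior of the figure unenhanced.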
The second process is region growing, which groups together regions of the image with similar features (e.g., line orientation). Computational models of texture segmentation stipulated that region-growing requires an entirely different connection scheme than boundary-detection (Poort et al., 2012; Roelfsema, Lamme, Spekreijse, & Bosch, 2002). Whereas boundary-detection requires iso-orientation inhibition (as discussed above), region-growing requires iso-orientation excitation, which means that cells that represent similar features enhance each other's activity. Whereas boundary-detection algorithms use feedforward and horizontal connections, region-growing processes use feedback from higher to lower visual areas. This division was implemented in a computational model of texture-segmentation (Poort et al., 2012; Roelfsema et al., 2002), with feature-maps at multiple spatial scales in a multi-layer visual hierarchy. At each level of the hierarchy there was iso-orientation inhibition for the detection of edges, which implies that for any given figure-size there will be a level in the model hierarchy at which the figure pops out amongst distractors. Neurons at the higher level where pop-out occurred then send a feature-specific feedback signal back to earlier visual areas to enhance the response of neurons encoding the same feature and suppress the responses of neurons encoding the opposite feature. To restrict the enhanced activity to the interior of the figure, the feedback connections have to be gated by feedforward activity, so that only those cells that are activated by the feedforward sweep of activity are modulated by the feedback signal. An essential characteristic of this computational model is that the enhanced activity observed at the boundaries of the figure relies on mechanisms that differ from those for figure-ground modulation (FGM) at the center of the figure.
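The gating of feature-specific feedback by feedforward activity can also be sketched in a few lines. This is a deliberately reduced illustration and not the multi-scale model of Roelfsema et al. (2002) or Poort et al. (2012): the function name and the gain value are hypothetical, and the pop-out stage is replaced by the simplifying assumption that the less frequent orientation in the display is the figure feature.

```python
def region_growing(orientations, gain=0.5):
    """Toy region growing by feedforward-gated, feature-specific feedback.

    Returns the assumed figure feature and, for each orientation channel,
    the modulated response at every texture position.
    """
    features = sorted(set(orientations))
    # Stand-in for pop-out at a higher level (assumption): the orientation
    # that occurs least often in the display is treated as the figure.
    counts = {f: orientations.count(f) for f in features}
    figure_feature = min(counts, key=counts.get)

    responses = {}
    for f in features:
        # Feedforward sweep: a unit tuned to f is driven only where f is shown.
        feedforward = [1.0 if o == f else 0.0 for o in orientations]
        # Feedback enhances the figure feature and suppresses the other one.
        feedback = gain if f == figure_feature else -gain
        # Multiplicative gating: feedback modulates only units that are
        # already active, so silent units stay silent.
        responses[f] = [ff * (1.0 + feedback) for ff in feedforward]
    return figure_feature, responses

texture = [0] * 10 + [1] * 8 + [0] * 10
figure_feature, responses = region_growing(texture)
print(figure_feature, responses[1][8:20])
```

Because the modulation is multiplicative, units tuned to the figure orientation are enhanced across the whole figure interior, while units with the same tuning whose receptive fields fall on the background receive the same feedback but remain silent.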
In contrast, other research groups have suggested that FGM is strongly related to the mechanisms that underlie boundary-detection. Zhaoping Li (1999), for instance, presented a model where FGM arises exclusively through iso-orientation inhibition.

Another group (Rossi, Desimone, & Ungerleider, 2001) suggested that FGM could only be observed with very small figures (up to 2° in diameter), not in the center of larger figures. They suggested that FGM is in fact a boundary-detection signal that becomes greatly reduced as one moves away from the boundary (for more discussion, see Corthout & Supèr, 2004). Both of these viewpoints suggest that there is no region-growing signal present in V1 and that neural activity in V1 does not reflect surface perception, but rather the presence of nearby boundaries. Poort et al. (2012) reconciled these apparently conflicting findings by showing that region growing is only pronounced for behaviorally relevant objects, and is therefore essentially under top-down control of attention. In addition, they compared the onset latency of the FGM of neurons in V1 and V4, and showed that it is significantly shorter in V4, supporting the idea of a top-down influence.

In four subsequent studies with novel recording and manipulation techniques, Roelfsema and his group have studied in more detail how these different processes are implemented in the laminar micro-circuitry of the visual cortex. First, Self et al. (2013) recorded simultaneously from all layers of V1 while monkeys performed the figure-ground segregation task introduced above. They found that the visual response started ms after stimulus presentation in layers 4 and 6, which are targets of feedforward connections from the LGN and distribute activity to the other layers. In addition, figure boundaries induced synaptic currents and stronger neuronal responses in upper layer 4 and the superficial layers around 70 ms after stimulus onset, consistent with the hypothesis that they are detected by horizontal connections. Another 30 ms later, synaptic inputs arrived in layers 1, 2, and 5 that receive feedback from higher visual areas, which caused the filling-in of the representation of the entire figure with enhanced neuronal activity.
All of these results confirm the computational mechanisms proposed before (Poort et al., 2012; Roelfsema et al., 2002) and the temporal dynamics observed in Lamme's (1995) original study. In a second study, Self et al. (2012) addressed the question why feedback only modulates neural activity whereas feedforward projections drive neural responses, and in particular the possibility that feedforward and feedback projections utilize different glutamate receptors. One important glutamate receptor in cortex is the AMPA receptor (AMPA-R), which is a rapidly activated channel, well-suited to drive a neuron's membrane potential above threshold. The other principal glutamate receptor is the NMDA receptor (NMDA-R) with a more slowly opening channel, which is only active if the neuron is depolarized by AMPA-R activation. NMDA-Rs would therefore be well-placed to mediate the gating of feedback-based modulatory signals, as these receptors are unable to activate neurons that are not receiving synaptic input from other sources. Self et al. investigated the role that these different glutamate receptors play in the texture-segmentation task described above. Their hypothesis was that FGM would predominantly rely on NMDA-R activation and would be blocked by the application of NMDA-R antagonists. In contrast, they suggested that feedforward processing of the signal would rely on AMPA-R activation, but that these receptors would play no role in producing FGM. To address these hypotheses, they again made laminar recordings from V1 but the laminar electrodes now contained a fluid-line that allowed them to inject pharmacological substances into different layers of cortex. They used CNQX, an AMPA-R antagonist, and two NMDA-R antagonists, APV and ifenprodil, with different subunit specificity. APV is a broad-spectrum NMDA-R antagonist which blocks all NMDA-Rs, whereas ifenprodil is much more specific for NMDA receptors containing the NR2B subunit.
In the texture-segregation task, the effects of the AMPA-R antagonist differed markedly from those of the NMDA-R antagonists. CNQX strongly reduced responses in an early response window ( ms after stimulus onset), which is mostly related to feedforward activation. Remarkably though, CNQX had little or no effect on the level of FGM itself. In contrast, both NMDA-R antagonists strongly reduced FGM, whilst having opposing effects on the initial visually driven neural responses. APV reduced responses during the early time window, while ifenprodil actually increased responses in this period. However, both NMDA-blockers reduced FGM by similar amounts. These results provide support for the hypothesis that feedforward processing relies predominantly on AMPA-R activity, whereas FGM is carried mostly by NMDA-Rs.

A third study investigated the role of low-frequency (alpha) and high-frequency (gamma) oscillations in relation to the different directions of information flow in monkey visual cortex. Van Kerkoerle et al. (2014) again recorded from all layers of V1 and found that gamma-waves are initiated in input layer 4 and propagate to the deep and superficial layers of cortex, whereas alpha-waves propagate in the opposite direction. In addition, simultaneous recordings from V1 and downstream area V4 confirmed that gamma- and alpha-waves propagate in the feedforward and feedback direction, respectively. Micro-stimulation in V1 elicited gamma-oscillations in V4, whereas micro-stimulation in V4 elicited alpha-oscillations in V1, thus providing causal evidence for the opposite propagation of these rhythms (for a more extensive discussion of the role of different cortical rhythms of activity, see below). Furthermore, blocking NMDA receptors, thought to be involved in feedback processing, suppressed alpha while boosting gamma.

In a fourth study, Poort et al. (2016) focused more specifically on the question whether FGM in early and mid-level visual cortex is caused by an enhanced response to the figure, a suppressed response to the background, or both. Again, they studied neuronal activity in areas V1 and V4 in monkeys performing the same texture segregation task.
They compared texture-defined figures with homogeneous textures and found an early enhancement of the figure representation, and a later suppression of the background. Importantly, across neurons, the strength of figure enhancement was independent of the strength of background suppression. As in the previous studies, they also examined activity in the different V1 layers and found that both figure enhancement and ground suppression were strongest in superficial and deep layers and weaker in layer 4. Furthermore, they examined the current-source density profiles, showing that figure enhancement was caused by stronger synaptic inputs in feedback-recipient layers 1, 2, and 5 and ground suppression by weaker inputs in these layers, which again confirms the important role for feedback connections from higher level areas.

New grouping principles

Traditional Gestalt psychology has sometimes been criticized for proposing a new law of perceptual grouping for every factor shown to have an influence. Already 10 years after Wertheimer's seminal grouping paper, Helson (1933) listed a set of 114 propositions (or "laws"), which were argued to reflect the fundamental assumptions, claims and main findings of Gestalt psychology. Nevertheless, in the last few decades vision scientists have discovered even more principles (see footnote 6). Of course, some of these could be considered as extensions of earlier ones, while others are truly novel. While some are mere demonstrations of nice effects, others may have deep implications for the nature of perceptual grouping and its role in perceptual organization more generally. We will focus on the latter type here.

Footnote 6: These new principles are no longer called "laws" but rather "factors" or "cues". This reflects a more descriptive approach than the theoretical ambitions of traditional Gestalt psychology. The term "cues" also fits better with the mainstream view of Bayesian information processing (e.g., Feldman, 2015; Lee & Mumford, 2003; Yuille & Kersten, 2006), in which the goal is to infer so-called real-world properties from retinal-image properties (i.e., the inverse optics approach; see further discussion below).

The principle of generalized common fate

As argued before, Wertheimer's seminal paper did not report all of the experiments he and his collaborators had carried out to test variations of the basic principles. One clear case of this concerns variations on the principle of common fate, where he wrote: "This principle too has a very wide area of application; how wide will not be dealt with here" (Wertheimer, 1923/2012, p. 316/p. 144). It is in this spirit that Sekuler and Bennett (2001) presented an extension of common fate to grouping by common luminance changes. When elements with different luminance values become brighter or darker simultaneously, observers have a strong tendency to group those elements perceptually. It is as if the principle of common fate not only operates for the common motion of elements through physical space, but through luminance space as well. For this reason, Sekuler and Bennett have called this the principle of generalized common fate. In a sense, it is a variation on the theme of grouping by similarity, based on similarity of changes in feature values, such as luminance or position, rather than on the similarity of the feature values themselves. The ecological basis of it might lie in the simultaneous brightening or darkening across an extended spatial area due to changes in the level of illumination (e.g., sunlight or shadows; see also van den Berg, Kubovy, & Schirillo, 2011).

The principle of grouping by synchrony

The principles of common fate and generalized common fate capture how commonalities in the direction of motion or luminance (or any other feature value for that matter) can determine grouping. The changes do not have to be consistently in the same direction, however.
A random field of black and white dots changing to white and black randomly over time, for instance, will segregate into two distinct regions if the dots in one area change synchronously rather than randomly (Alais, Blake & Lee, 1998; Lee & Blake, 1999). This principle of grouping by synchrony (the tendency for elements that change simultaneously to be grouped together) can be considered as an even more general form of common fate in which the simultaneous changes do not have to involve either motion, as in classic common fate, or common direction of change, as in generalized common fate. Apparently, the simultaneous occurrence of visible changes of the elements constitutes a sufficient basis for grouping them perceptually. Simultaneity of change turns out to be a strong temporal regularity in the stimulus event. Synchrony grouping may therefore reflect a very general mechanism of non-accidentalness detection, possibly connected to the perception of causality (e.g., Michotte, 1946/1963). Temporal coincidence of multiple changes is unlikely to be a matter of chance, and so it must have some common underlying cause. However, objects in the natural environment rarely change their properties in different directions or along different dimensions in temporal synchrony. Hence, there is a remaining controversy whether there is such a thing as pure synchrony grouping, which cannot be computed on the basis of static image differences at any single moment in time but must instead be computed on the basis of higher-order statistics of image changes over time (e.g., Farid, 2002; Farid & Adelson, 2001; Guttman, Gilroy, & Blake, 2007; Leonards, Singer, & Fahle, 1996; Usher & Donnelly, 1998). Another controversial issue is to what extent temporal synchrony of changes drives grouping because synchrony of neural firings is the physiological mechanism by which the brain codes all forms of grouping (see above).

The principle of common region

The principle of common region is the tendency for distinct elements to be grouped together when they lie within the same bounded region (Palmer, 1992). An illustration is provided in Figure 8, where black dots that lie at equal distance (Figure 8A) become grouped in pairs that lie inside of a rectangle (Figure 8B). It is a rather strong principle, as it seems to overrule grouping by similarity (Figure 8C) and proximity (Figure 8D). In a sense, it is a variation on the theme of grouping by similarity, namely similarity of containment, but it has an intuitive structural and ecological basis. The structural basis for grouping by common region is that all elements within a given region share the topological property of being inside of or contained by some surrounding borderline (for more work on the important role of topological properties in vision, see Chen, 1982, 2005). Common region also appears to make good ecological sense: When a boundary encloses some elements, they are likely to belong to the surface of a single object (e.g., spots on a leopard's skin, features that are parts of a face), rather than independent objects that just accidentally lie within the same bounding contour. In a sense, the principle of common region reflects the primacy of surfaces in the way we organize our visual world (e.g., Nakayama, He, & Shimojo, 1995).

Figure 8. The principle of common region (adapted from Palmer, 1992). (A) No grouping occurs when all dots are equal. (B) Grouping by common region occurs when dots lie inside some boundaries that delineate specific regions of space. (C) Grouping by common region overrules grouping by similarity. (D) Grouping by common region overrules grouping by proximity.

The principle of element connectedness

The principle of element connectedness is the tendency for distinct elements to be grouped together when they are somehow connected (Palmer & Rock, 1994).
An illustration is provided in Figure 9, where black dots that lie at equal distance (Figure 9A) become grouped in pairs when connected by a line segment (Figure 9B). Again, it is a strong principle, as it easily overrules grouping by proximity (compare Figure 9D to 9C). It is easy to show that element connectedness does not require the elements to have the same luminance, color or texture. So, the structural basis of this grouping principle is the topological property of connectedness (i.e., sharing a common border), which is 28

29 rooted in ecological reality: Pieces of matter that are physically connected to each other are likely to be parts of the same object, because they tend to behave as a single unit. In a sense, connectedness can be considered as the limiting case of the classic factor of proximity, but Palmer and Rock (1994) argued that connectedness is more primary: It is only by breaking the connectedness that one obtains distinct units, separated by a certain distance, which then causes grouping strength to decay exponentially as distance increases linearly. Figure 9. The principle of element connectedness (adapted from Palmer & Rock, 1994). Connecting equal dots at equal inter-dot distances (A) by line segments leads to their grouping (B). Even connecting equal dots at unequal inter-dot distances (C) leads to their grouping (D), demonstrating that the principle of element connectedness overrules grouping by proximity. The principle of uniform connectedness The observation above triggered the question where the to-be-grouped elements come from in the first place. Palmer and Rock (1994) argued that they arise from the earlier organizational process of uniform connectedness (UC), which is the principle by which the visual system initially partitions an image into a set of mutually exclusive connected regions having uniform (or smoothly changing) properties, such as luminance, color, texture, motion, and depth. According to Palmer and Rock, the UC elements that are created in this way are the entry-level units of a hierarchical perceptual organization system, in the sense that they constitute the starting point of all subsequent processes that are grouping or parsing the different UC regions. However, this proposal has remained somewhat controversial. Peterson (1994), for instance, argued that UC is only one of many properties relevant to partitioning the visual field, and that UC units do not necessarily act as entrylevel units. 
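As an illustration of what partitioning an image into mutually exclusive, connected regions with uniform (or smoothly changing) properties could amount to computationally, here is a minimal sketch. It is not Palmer and Rock's model, which was not specified as an algorithm; it simply swaps in a standard connected-component labeling by flood fill. The function name `uniform_connected_regions`, the use of a single luminance value per pixel, 4-connectivity, and the `tolerance` parameter (the largest neighbor-to-neighbor difference still counted as "smooth") are all assumptions made for this example.

```python
from collections import deque


def uniform_connected_regions(image, tolerance=0):
    """Label 4-connected regions in a 2-D grid of luminance values.

    Two neighbouring pixels join the same region when their values differ
    by at most `tolerance`, so tolerance=0 gives strictly uniform regions
    and a larger tolerance also accepts smooth gradients.
    Returns (labels, n_regions).
    """
    rows, cols = len(image), len(image[0])
    labels = [[None] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] is not None:
                continue  # pixel already belongs to a region
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill from the seed pixel
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and labels[ny][nx] is None
                            and abs(image[ny][nx] - image[y][x]) <= tolerance):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels, next_label


image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [5, 5, 5, 9],
]
labels, n_regions = uniform_connected_regions(image)  # three uniform regions

# A luminance ramp is one region only if smooth changes are tolerated.
_, n_strict = uniform_connected_regions([[1, 2, 3, 4]], tolerance=0)
_, n_smooth = uniform_connected_regions([[1, 2, 3, 4]], tolerance=1)
```

In this toy image the three regions are mutually exclusive and jointly cover the image, which is what makes them candidate entry-level units. The ramp example shows why the "smoothly changing" clause matters: with zero tolerance every pixel of a gradient becomes its own region.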
Kimchi (2000) examined the role of UC in experiments designed to reveal the gradual emergence (or microgenesis) of organizational processes, and found that collinearity and closure were at least as important as UC in determining the entry-level units in a part-whole hierarchy. Other experiments showed that proximity may operate faster than UC under some circumstances (Han, Humphreys, & Chen, 1999). Nevertheless, the idea that some organizational process like UC creates a set of potential perceptual units on which further grouping and parsing can operate appears to be sound.

The principle of induced grouping

The principle of induced grouping (see Figure 10) is the tendency for ungrouped elements (Figure 10A) to become grouped when there are similar arrangements in the surround that are grouped by any of the more standard principles of grouping, such as proximity (Figure 10B) or element connectedness (Figure 10C). Although the effect is phenomenologically clear and convincing, the question is whether this induction effect occurs automatically or as a consequence of attention shifts or intentional verification strategies by the observer. To answer this question, Vickery (2008) relied on an objective behavioral method to indirectly measure the effect of grouping, independently of demand characteristics (i.e., the RDT described above). The results demonstrated clearly that grouping can be induced by proximity, similarity, and common fate. In subsequent work, the principle of induced grouping was extended to other forms of associative grouping, based on associations with previously learned groupings (Vickery & Jiang, 2009).

Figure 10. The principle of induced grouping (adapted from Vickery, 2008). Equal dots at equal inter-dot distances (A) become grouped in pairs when neighboring dots are grouped in pairs by proximity (B) or element connectedness (C).

New principles of figure-ground organization

The Gestalt psychologists demonstrated that a variety of image properties such as small area, convexity, symmetry, and surroundedness (or enclosure) were sufficient for the perception of a figure against a background, without the need of familiarity. These image properties became known as the classic configural principles of figure-ground organization (e.g., Harrower, 1936; Rubin, 1915).
All of these concern the organization of displays consisting of static, homogeneously colored regions. Quite a few additional principles of figure-ground organization have been discovered in the last decades, which I will briefly discuss below. While some of these also apply to static, homogeneously colored regions (e.g., lower region, and top-bottom polarity), additional figure-ground principles come into play in displays containing spatial heterogeneities such as texture (extremal edges and edge-region grouping) and in displays containing motion (advancing region, articulating concavities, and convex motion). As with the new grouping principles, researchers have often provided an ecological foundation for these new figure-ground principles too.

Lower region

Vecera, Vogel, and Woodman (2002) showed that when a rectangular display is divided in half by an articulated horizontal border, the region below the border is more likely to be perceived as the closer, figural region than the one above the border. They found this lower-region preference most strongly in images in which the border consisted of horizontal and vertical segments, but also in analogous images consisting of curved segments. Vecera (2004) performed additional experiments in which such displays were viewed by observers whose heads were tilted (or even inverted) to determine whether this figure-ground principle was driven by a viewer-centered, retinal reference frame or an environmental reference frame. The results showed that retinal directions were clearly dominant, which may be surprising because it appears at odds with the presumed ecological justification of gravity as the rationale for perceiving lower regions as closer (see Vecera & Palmer, 2006). However, it is consistent with the need to compute information about figure-ground status early in visual processing, before orientation constancy has occurred. Moreover, because head orientation is approximately vertical most of the time, the difference between retinal and environmental reference frames is negligible in practice anyway. The ecological validity of lower region was assessed statistically by analyzing a corpus of photographic images that were hand-segmented by human observers (Martin, Fowlkes, Tal, & Malik, 2001). The results showed that lower region was a valid cue to closer surfaces for local edges whose orientation was roughly horizontal.
Top-bottom polarity

Hulleman and Humphreys (2004) showed that regions that are wider at the bottom and narrower at the top are more likely to be perceived as figures than regions that are wider at the top and narrower at the bottom. The regions in their displays looked a bit like odd evergreen trees or chess pieces, but they argued that their effects were not due to the effects of familiar shape (e.g., Peterson & Gibson, 1993) because there are other familiar objects (e.g., tornados) that are similar in shape to the regions with narrow bases and wide tops. They also claimed that top-bottom polarity effects cannot be explained by the effects of lower region (Vecera et al., 2002). Nevertheless, the ecological factor that links all three of these figure-ground factors (canonical orientation of familiar shapes, lower region, and top-bottom polarity) is gravity. Indeed, top-bottom polarity can easily be interpreted as a perceptual consequence of gravitational stability.

Extremal edges and gradient cuts

An extremal edge (EE) in an image is a projection of a viewpoint-specific borderline of self-occlusion on a smooth convex surface, such as the straight side of a cylinder. Computational analyses of the visual properties of surfaces near EEs show characteristic gradients of shading and/or texture in which contours of equal luminance and/or density are approximately parallel to the edge of the surface (Huggins & Zucker, 2001). EEs are relevant to figure-ground determination because the side with an EE gradient is almost invariably perceived as being closer to the observer than the opposite side of the edge (Palmer & Ghose, 2008), even when EE is placed in conflict with other factors (Ghose & Palmer, 2010).

Edge-region grouping

Palmer and Brooks (2008) pointed out that classic grouping effects and figure-ground organization must be intimately connected: If figure-ground organization is determined by an edge belonging to (i.e., grouping with) the region on one side more strongly than that on the other, then any grouping factor that could relate an edge to a region should also operate as a figure-ground factor. They tested this hypothesis for six different grouping factors that were well-defined both for edges and regions (proximity, similarity of orientation, color and blur, common fate, and flicker synchrony) and found that all six factors showed figure-ground effects in the predicted direction, albeit to widely varying degrees.

Articulating motion

Barenholtz and Feldman (2006) demonstrated a dynamic principle of figure-ground organization: When a contour deforms dynamically, observers tend to assign figure and ground in such a way that the articulating (or hinging) vertices have negative (concave) curvature. This ensures that the figure side is perceived as containing rigid parts that are articulating about their part boundaries. In their experiments, this articulating-concavity bias was shown to override traditional static factors (such as convexity or symmetry) in cases where they made opposing predictions. In other experiments, Barenholtz (2010) showed that when a contour segment that is concave on one side and convex on the other deforms dynamically, observers tend to assign the figure on the convex rather than the concave side. More recently, Froyen, Feldman and Singh (2013) studied rotating columns as another interesting phenomenon of interactions between structure-from-motion, dynamic accretion and deletion at borders, and figure-ground organization.
Advancing region motion

Barenholtz and Tarr (2009) showed that moving a border within a delimited space such that the bounded area on one side grows larger and the bounded area on the other side shrinks in size causes the growing area to be perceived as a figure advancing onto the shrinking area. Thus, motion in an advancing region overpowers the classic Gestalt factor of small area.

Contour entropy as a determinant of ground or hole

All of the above research is aimed at finding factors that determine the perception of a figure against a background. In contrast, Gillam and Grove (2011) asked whether there are also factors that strengthen the perception of a region as ground. They reasoned that occlusion by a nearer surface will usually introduce a regularity among terminations of contours at the occluding edge, which will be perceived as a stronger cue to occlusion when the irregularity of the elements is higher. In other words, when the lines being terminated are more disordered, the strength of the occlusion cue (called "order within disorder" or "entropy contrast") is larger. In three experiments with a figure-ground paradigm, they showed that unrelated (high entropy) lines appeared as ground (or "holes") more often than more ordered (low entropy) lines.

Non-image-based influences on figure-ground perception

Another substantial part of the literature on figure-ground organization in the last two or three decades concerns influences on figure-ground organization that are not in the images themselves. These non-image-based influences are mainly past experience (or familiarity), perceptual set (or context effects), and attention. This literature has made substantial use of indirect measures (e.g., reaction times) to circumvent the limitations of direct reports in addressing early and automatic stages of processing (see above). As emphasized earlier, the Gestalt pioneers were already open to these factors, so this research has not discovered any new factors, but it has provided solid experimental evidence for their importance. In addition, this literature has converged on the moderate view that figure-ground organization can occur preattentively, while it can also be affected by attention. For an extensive recent review of this literature, see Peterson (2015).

New Directions for Research on Perceptual Organization

Refining the Concept of Perceptual Grouping

Perceptual grouping is often treated as if it is a single process but it is not. In different areas of perception research, authors imply a great variety of conceptually distinct processes when they refer to grouping. This type of confusion must be avoided if we want to make further progress. A good example is the recent claim that crowding (the impairment of peripheral target perception by nearby flankers) is a kind of grouping. Like grouping, crowding is a form of integration over space, as target features are spuriously combined with flanker features. In addition, the spacing between the target and the flankers is one of the most important factors that determine crowding, and crowding strongly depends on the similarity between the target and the flankers. Of course, dependence on spacing and dependence on similarity are also hallmarks of grouping (i.e., grouping by proximity and grouping by similarity, reviewed above), but is this enough to equate crowding with grouping?
Relations between crowding and grouping have been reported in a number of crowding studies (for a recent review, see Herzog, Sayim, Chicherov, & Manassi, 2015). For example, the more the target groups with the flankers, the stronger the crowding (e.g., Sayim, Westheimer, & Herzog, 2010). Moreover, the more a target is judged to stand out from the flankers (i.e., to ungroup from the flankers), the weaker the crowding (e.g., Saarela, Sayim, Westheimer, & Herzog, 2009). Importantly, these studies indicate that grouping and crowding are clearly interrelated, in the sense that the factors that determine both of them may be overlapping. But in order to know the mechanisms underlying both grouping and crowding (and the extent to which they overlap), we have to become more specific about the kind of grouping we think is involved. This requires a more refined vocabulary.

Ungrouping grouping

Of course, I am not the first to make a distinction between different kinds of grouping. Several useful distinctions have been made before, but they are usually limited to binary classifications. For example, Zucker (1985) proposed a differential-geometric approach to automatically extract orientation information from images, and developed biologically plausible algorithms to infer vector fields of tangents to contours. He conceived this as a matching problem and showed that two different matching processes are required: Type I processes for 1-D contours and Type II processes for 2-D flows, with the numbers referring to the dimensionality of support of the tangent fields. He also speculated that this difference is reflected in the response properties of simple and complex cells, respectively.

Second, Watt and Phillips (2000) distinguished between what they called static and dynamic grouping. Static grouping refers to simple forms of grouping in which cells (e.g., V1 simple cells) combine activity from a pre-specified set of inputs (e.g., the cRFs) in a pre-specified manner (e.g., the pattern of excitation and inhibition across the cRFs) to compute whatever feature they signal. When cascaded, this form of grouping produces feature hierarchies that have the ability to classify input into pre-specified output categories. However, this style of processing cannot produce novel outputs when required by novel inputs, nor can it create new outputs for new tasks, because the possible outputs are limited to the pre-specified output categories. When the sources of information are not pre-determined or the manner of combination is not pre-specified, dynamic grouping is required. This is a fundamentally different form of grouping because it can respond to novel inputs to produce novel outputs. The nature of the output of dynamic grouping is determined by the interaction of the organizational processes and the current input rather than being limited to a restricted set of categories. Static and dynamic grouping can be combined flexibly to create novel feature descriptions at intermediate and higher levels in the hierarchy. Building on the extensive literature on neural synchrony (reviewed below), Watt and Phillips also speculated that dynamic grouping is signaled by synchronizing the neural activity to be grouped. I believe this distinction maps onto the distinction that Roelfsema (2006) has made between base-grouping and incremental grouping. In his view, base-groupings are coded by single neurons tuned to multiple features, like the combination of a color and an orientation. They are computed rapidly because they reflect the selectivity of feedforward connections.
However, because not all conceivable feature combinations are coded by dedicated neurons, another, more flexible form of grouping is required, which Roelfsema called incremental grouping. Incremental grouping enhances the responses of neurons coding features that are bound in perception, but it takes more time than does base-grouping because it relies also on horizontal and feedback connections. While base-grouping can occur preattentively and operates in parallel across the whole visual field, incremental grouping requires attention and operates serially across spatially separated (often pairwise neighboring) locations in the visual field. A good example of the latter case is curve tracing, which has been investigated extensively in Roelfsema's lab (e.g., Roelfsema, Lamme, & Spekreijse, 1998; Wannig, Stanisor, & Roelfsema, 2011). Refining the above distinctions and building on my own work, I propose to distinguish five kinds of grouping, because they all seem to indicate different processes, with their own mode of operation, perhaps even with their own distinct underlying mechanisms. Specifically, instead of the general notion of perceptual grouping, I propose to use terms like clustering, segregating, linking, layering, and configuring. The distinctions I make here are rather tentative at this point, intended to inspire further research with a refined conceptual framework. More distinctions may be needed eventually. With clustering, I refer to the process of treating individual items as members of a larger ensemble, basically extracting their common feature and ignoring others. In Figure 11.1, for instance, all Gabor elements have the same orientation and are easily treated as one cluster of elements. This process typically occurs in texture grouping or ensemble encoding, when there is no segregation.
With segregating, I refer to the process of treating one subset of items as members of a larger set, while at the same time also distinguishing this set from another subset of items. This process happens, for instance, when one segregates Gabor elements with one (quasi-)uniform orientation from Gabor elements that have another, clearly distinct (quasi-)uniform orientation or that have a random orientation (see Figures 11.2 and 11.3, respectively).

Figure 11. Illustration of five different forms of grouping: clustering, segregating, linking, layering, and configuring (see text for explanation).

With linking, I refer to the process of connecting individual items in specific ways, often as a sequential spreading of pair-wise couplings. The prototypical example of this process is what happens in the snake detection paradigm (discussed above), but also the more basic cases discussed by Wertheimer (1923; see Figures 1, 2, and 3) concern this type of grouping. I believe linking is a very general process that can occur at different levels in the cortical hierarchy depending on the nature of the stimulus, from rather simple cases (e.g., pairwise grouping of dots, dots in rows or columns, oriented line segments with some degree of good continuation), to cases of linking between higher-order units (e.g., establishing the orientation of the borderline between two segregated regions as in Figures 11.2 and 11.3). We have studied the difference between segregating and linking, and the different kinds of linking in psychophysical and neuropsychological experiments (see Vancleef et al., 2013a; Vancleef & Wagemans, 2013; Vancleef, Wagemans, & Humphreys, 2013b). With layering, I refer to the process of segregating two sets of items, with an additional indication of which subset is figure and which is ground. So, this process requires a determination of border-ownership and depth-order (discussed further below). It is meant to be somewhat more general than figure-ground assignment because it also deals with multiple layers and layers with different levels of occlusion (fully opaque versus some degree of transparency). A good example is shown in Figure 11.4, in which the central elements all have the same orientation, which leads to the segregation of this region from the background, but because this region is completely surrounded by random elements (i.e., it is enclosed), it gets figural status as well. Note that the borderline that segregates this region from the background is only implicit in this case. In Figure 11.5 there is a series of Gabor elements that are linked by good continuation and closure, but because the elements inside are all in random orientations (as is the case for the background elements too), this linked group is not clearly seen as a surface (it could be just a closed string of Gabor elements). When the region (Figure 11.4) and the closed string (Figure 11.5) are combined (Figure 11.6), the central region is now clearly seen as a figure against a background with a homogeneous surface texture and a clear borderline owned by the central region (i.e., proper and articulate layering). We have studied the nature of the combination process in psychophysical experiments (Machilsen & Wagemans, 2011). A special case of layering pertains to transparency (for a recent review, see Gerbino, 2015). With configuring, I refer to the process of organizing individual items in larger, structured wholes or Gestalts with configural properties.
Linking Gabor elements into a closed shape can lead to configurations with overall shape properties like symmetry (Figure 11.7), high degrees of regularity, simplicity, or goodness (Figure 11.8), or even familiar objects with a structural description, containing a shape skeleton or a hierarchical part-based description (e.g., horse with its trunk, legs, head, and tail; see Figure 11.9). The configural properties can belong to all kinds of groups (patterns, strings, borderlines, surfaces, shapes, objects, scenes, events, etc.). We have investigated the added value of a configural property such as symmetry for shape detection based on Gabor elements with variable degrees of good continuation (Machilsen, Pauwels, & Wagemans, 2009). The above distinctions are essentially phenomenological but they seem to indicate different underlying processes, which may be implemented by different neural mechanisms. For instance, clustering relies on pooled activity in a low-level area (e.g., V1), no further distinctions are made within the group (all elements belonging to the cluster are treated the same, even when they vary a bit), and it occurs automatically, probably depending on fast feedforward processes only, with little or no feedback from higher-order areas. The linking that occurs in snake detection, instead, seems to be a slower process with some kind of serial or bi-serial spreading of activation (element-to-element). Feedback is possible when the process is rather slow (with noisy or interrupted input), but not always necessary (e.g., Schmidt & Vancleef, 2016). Configuring, on the other hand, is an essentially bi-directional form of grouping (element-to-group and back), which probably occurs in higher areas of the visual cortex and involves highly interactive mechanisms. Of course, in more realistic images, several of these grouping processes would occur simultaneously and they will undoubtedly interact. Figure 12 illustrates this point.
An image of a natural scene such as a field of grass, bushes, crops, trees, and sky engages several of the processes introduced above: clustering of the leaves of the bushes, segregating the different regions, linking the elements along the borderlines between the regions, layering of the trees against the grass and sky, and perhaps some hierarchical patterning of the ploughed field as well. The butterfly image illustrates that rich images also involve intermediate cases of grouping (e.g., segregating and linking of borderline elements and surface elements, clusters of flowlines, hierarchical patterns of flowlines, symmetric patterns of Gabor elements, symmetry of shape) and entail an intricate interplay between them (which could occur at different time-points after stimulus onset). This makes it difficult to disentangle the different forms of grouping in cases of perceptual grouping in everyday life, but that does not imply that the distinctions are not useful to refine our experimental research paradigms (stimulus parameters, control conditions, tasks) and our discussion of the phenomena, principles, and mechanisms. We have created a whole series of different Gaborized object outlines and textures to enable an investigation of these interactions in detection and identification tasks (Sassi, Machilsen, & Wagemans, 2010; Sassi et al., 2010), also in dynamical series of similar stimuli, in which structures and objects emerge gradually over time (Evers et al., 2014). Moreover, we have studied the neural dynamics between the different kinds of grouping in a combined EEG-fMRI study (Mijovic et al., 2014).

Figure 12. Illustration of how the different forms of grouping apply, intermingle, and interact in more realistic images (see text for explanation).

Not all Gestalts are equal

Gestalt psychologists were keen on making distinctions between different kinds of Gestalts but, because they also theorized about fundamental Gestalt principles underlying all of them (e.g., Prägnanz), the distinctions were sometimes lost in later uses of the term Gestalt.
I think it is important to emphasize that not all Gestalts are equal. They have different phenomenological characteristics (i.e., they are experienced as being different) and the underlying mechanisms might be different too. Early Gestalt schools made a distinction between Gestalts as emergent properties arising over and above the combined elements (e.g., the example of a melody, which is more than the sum of the individual tones; von Ehrenfels, 1890) and Gestalts as autonomous higher-order entities, resulting from self-organization processes in the brain as a dynamical system, not produced by the combination of lower-order entities. The latter kind of Gestalts were emphasized by the Berlin school of Gestalt psychology (Wertheimer, Koffka, Köhler), who opposed the Graz school (Meinong, von Ehrenfels, Benussi). A key point of distinction was between one-sided and two-sided dependency in the relationship between parts (elements, features, components) and wholes (patterns, objects, Gestalts): In the Graz school the wholes depend on the arrangement of the parts (i.e., emergent Gestalt qualities that are more than the sum of the parts) but not the other way around. This is one-sided dependency. In the Berlin school there are also wholes that do not depend on the parts (e.g., phi motion) and the parts can depend strongly on the wholes within which they are embedded (i.e., Gestalts as different from the sum of the parts). This is two-sided dependency. (For a more extensive discussion of this early Gestalt history, see Wagemans, 2015a.) Köhler (1920) also made an important distinction between so-called strong and weak Gestalts. He proposed to treat the neurophysiological processes underlying Gestalt phenomena in terms of the physics of field continua rather than that of particles or point-masses. In such continuous field systems, which he called strong Gestalts, the mutual dependence among the parts is so great that no displacement or change of state can occur without influencing all the other parts of the system. Köhler showed that stationary electric currents, heat currents, and all phenomena of flow are strong Gestalts in this sense. He distinguished these from what he called weak Gestalts, which do not show this mutual interdependence. Obviously, he also applied this distinction to perceptual Gestalts, that is, Gestalts as we perceive them, Gestalts in our subjective experience.
For Köhler, in strong Gestalts the mutual dependence among the parts is so great that nothing in the arrangement can be changed without influencing all the other parts of the system (for examples, see Figures 5 and 6). In fact, sometimes there are no parts at all, only interacting moments of structure (e.g., phi motion). I propose to refine this conceptual framework in terms of what we currently know about the visual cortical hierarchy. In general, I argue that there are different types of Gestalts, each with their own relationships between parts and wholes, both in visual experience and in their neural encoding. Specifically, I make a distinction between Gestalts at different levels of the visual system and with different implications for the status of the parts. Some Gestalts seem to be encoded in low-level areas based on feedback from higher-order regions, while other Gestalts seem to be encoded in higher-level areas, with the parts being encoded in lower-level areas. The first type of Gestalts I propose to call lower-level Gestalts. Two good examples are the following. First, Kourtzi et al. (2003) carried out two experiments using an fMRI adaptation paradigm, with stimuli consisting of line elements and monkeys as subjects in Experiment 1, and stimuli consisting of Gabor arrays and humans as participants in Experiment 2. In both experiments the change from random orientations to aligned orientations with good shape properties gave rise to more recovery from adaptation than the change from one set of random orientations to another random one, not only in higher-order areas (such as the Lateral Occipital Complex or LOC in humans) but also in low-level areas like peripheral V1 and central V2, where the RFs are large enough to integrate the local elements into (parts of) contours belonging to the global shapes but not so large that they contain too many noise elements from the background.
Second, in an fMRI study with human participants, Murray, Boyaci, and Kersten (2006) studied what happens to the retinotopic map in V1 when two objects that project the same visual angle on the retina appear larger or smaller depending on the perceived viewing distance. A distant object that appeared to occupy a larger portion of the visual field activated a larger area in V1 than an object of equal angular size that was perceived to be closer and smaller. These findings suggest that the retinal size of an object and the depth information in a scene are combined early in the human visual system. Because perceived size is a relational, higher-order property, this study seems to indicate that at least some Gestalts are coded in low-level visual areas. Both of these cases are likely due to feedback from higher-order areas, although the techniques used in these studies do not allow this to be confirmed. There are other cases like this but more typically one finds evidence for Gestalts in higher-level areas, with the parts being encoded in lower-level areas. I propose to distinguish two fundamentally different kinds of these higher-level Gestalts, depending on what happens to the parts in lower-level areas. In so-called preservative Gestalts functional wholes arise spontaneously and parts become less functional, but the encoding of these wholes at higher levels of the cortical hierarchy does not suppress the encoding of the parts. In so-called eliminative Gestalts wholes dominate and parts disappear from experience; wholes emerge in higher areas of the brain and encoding of parts is then suppressed. I will give some examples of each of these to illustrate what I mean. An excellent example of preservative Gestalts is what happens in the case of the configural superiority effect or CSE (Pomerantz, Sager, & Stoever, 1977), which has been used as an index to indicate when wholes are perceived before parts ("forest before trees").
The easiest test for CSEs starts with benchmarking performance in a baseline task of localizing a singleton (or "odd one out") in a search display, that is, finding a single target element in a display otherwise consisting of identical distractor elements (e.g., a line segment tilted to the left amongst line segments tilted to the right; see Figure 13A, left). Then an identical context stimulus is added to each element (e.g., an L-shape; see Figure 13A, middle). Normally, adding identical, non-informative context hurts performance because it makes the stimuli more similar (in addition to increasing overall processing load and possibly introducing masking or crowding). In the case of tilted line segments and Ls, however, arrows and triangles are formed (see Figure 13A, right), and perceivers are more than twice as fast to find the target. Yet when these same parts are shifted just slightly in position, the CSE is lost (Figure 13B). Similar effects arise with pairs of parentheses (Figure 13C and D).

Using fMRI with multi-voxel pattern analysis (Kubilius, Wagemans, & Op de Beeck, 2011), we investigated what happens in different regions of the brain when participants are doing this odd-one-out task with "parts" displays (consisting of individual line segments) and "wholes" displays (consisting of arrows and triangles). The regions of interest (localized in the standard way) were lower-level retinotopic regions like V1 and V2 versus higher-level regions specialized in integrated objects (i.e., LOC). We replicated the behavioral CSE and also found a neural CSE, that is, better decoding performance for the wholes than for the parts in the higher shape-selective regions. Crucially, however, we also obtained better decoding performance for the parts than for the wholes in the lower-level retinotopic regions.
This means that even in the case of clear behavioral indications of a strong Gestalt (CSE) coded in higher-order areas, the components that make up the Gestalt (line segments) and their attributes (orientation) are still preserved in the lower-order areas, which is why we call this a preservative Gestalt.
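As a rough illustration of the logic above, the behavioral CSE and the regional decoding pattern can be written as simple difference scores. The sketch below is only a schematic under stated assumptions: the function names, the region labels used as dictionary keys, and all numbers are hypothetical placeholders, not data or analysis code from Pomerantz et al. (1977) or Kubilius et al. (2011).

```python
def configural_superiority(rt_parts_ms, rt_wholes_ms):
    """Return the response-time benefit (ms) of adding context; positive = CSE."""
    return rt_parts_ms - rt_wholes_ms


def decoding_advantage(acc_wholes, acc_parts):
    """Return wholes-minus-parts decoding accuracy for one region."""
    return acc_wholes - acc_parts


def is_preservative(advantage_by_region, low=("V1", "V2"), high=("LOC",)):
    """Preservative pattern: wholes decode better in high-level regions
    while parts decode better in low-level regions."""
    wholes_win_high = all(advantage_by_region[r] > 0 for r in high)
    parts_win_low = all(advantage_by_region[r] < 0 for r in low)
    return wholes_win_high and parts_win_low


# Hypothetical numbers, chosen only to make the example run.
cse = configural_superiority(rt_parts_ms=1900, rt_wholes_ms=800)
advantages = {
    "V1": decoding_advantage(0.58, 0.66),
    "V2": decoding_advantage(0.60, 0.64),
    "LOC": decoding_advantage(0.70, 0.61),
}
print(cse > 0, is_preservative(advantages))
```

In this toy formulation a positive behavioral difference together with opposite-signed decoding advantages in low-level and high-level regions corresponds to the preservative case described in the text; any real analysis would of course rest on statistical tests rather than on the sign of a difference.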

Figure 13. Emergent features in visual search, demonstrating configural superiority (adapted from Pomerantz et al., 1977). Adding redundant elements to each of the stimuli improves detection of the odd element in the display, but only when certain configurations arise (such as closure in A or symmetry in C).

An excellent example of eliminative Gestalts is what happens in the case of the bi-stable diamond, which can be perceived either as diagonally oriented line segments moving up and down, or as an integrated shape (a diamond) moving left and right, depending on the properties of the terminators (Lorenceau & Shiffrar, 1992). Fang, Kersten, and Murray (2008) carried out an fMRI study to investigate what happens to the neural activity in lower- and higher-order areas, in relation to the local line ("parts") and global shape ("whole") percepts (relying on a method previously introduced by Murray et al., 2002). What they found was quite striking: Activity in V1 (indicated by BOLD) was high when perceivers reported seeing the line segments moving up and down, and low when they reported seeing the integrated diamond moving left and right. In contrast, activity in LOC was high when perceiving the diamond and low when perceiving the line segments. This inverse pattern of activity shows lower-level activity being suppressed when higher-order Gestalts are seen: a clear case of eliminative Gestalts. The authors discussed this finding in terms of "explaining away" the incoming sensory information in lower areas through cortical feedback from higher areas, as postulated by predictive coding models of vision, discussed further below. Whether this interpretation is appropriate is not clear yet. In a later replication study (de-Wit et al., 2012) we have shown that the reduction of activity in V1 is global, not retinotopically specific, which seems to argue against the original claim.

Another nice example of an eliminative Gestalt (at the behavioral level) is the phenomenon of motion silencing (Suchow & Alvarez, 2011), where changes in properties of the local elements (like color, size, and shape) belonging to a ring are much harder to detect when the ring is rotating back and forth. Suchow and Alvarez explained this effect as follows: Because a fast-moving object spends only little time at any one location, a local detector is afforded only a brief window in which to assess the changing object. In a follow-up study (Poljac, de-Wit, & Wagemans, 2012) with upright and inverted confetti point-light walkers (with the points replaced by colored dots), we have demonstrated that the degree of suppression really depends on the degree of objecthood, or goodness of the overall configuration, more or less independent of motion (see also Peirce, 2013), which argues against the more low-level explanation in terms of local mechanisms with small RFs. Whenever a good whole is formed, the details of the parts are fundamentally less accessible to conscious perception, which is also the essence of the phenomenon of embedded figures (see above).

In sum, the visual system appears to have developed flexible mechanisms with different characteristics. I think it is useful to distinguish at least three: (1) low-level Gestalts = wholes that are encoded in low-level areas (probably depending on feedback), (2) preservative high-level Gestalts = wholes that are encoded in high-level areas, while their parts are being preserved in low-level areas, and (3) eliminative high-level Gestalts = wholes that are encoded in high-level areas, while their parts are being suppressed in low-level areas. Hence, it is clear that not all Gestalts are equal.
Of course, further research is needed to establish the specific properties of these cases (computational reasons, boundary conditions, etc.), but I believe one should at least avoid talking about Gestalts in general without specifying what type of Gestalt one is referring to.

Reconsidering Perceptual Grouping in the Context of Perceptual Organization

Perceptual grouping and figure-ground organization

Perceptual grouping and figure-ground organization, although intimately connected, are not the same process. Perceptual grouping is concerned with the binding together of elements that are disjointed at the level of the proximal stimulus (retinal images). Often, but not always, grouping also entails its complement: leaving out some elements as noise or background elements, not selected for further processing. However, this does not mean that the group of selected and grouped elements automatically gets figural status and that the group of non-selected, non-grouped elements becomes a background which continues behind the first group. This seems to require special conditions: "phenomenal figures have boundary lines even when the corresponding objective figures have none. A good figure is always a closed figure, which the boundary line has the function of closing" (Koffka, 1922, p. 14). Hence, it is clear that groups do not necessarily obey the same Gestalt properties as figures and that grouping does not necessarily behave according to the same principles as figure-ground assignment. It would be interesting to focus more on the similarities and differences between the properties of groups and figures and to characterize them better, for instance, on a graded continuum from weak to strong Gestalts, depending on the mutual relationships between the parts and the wholes or on how linear or nonlinear their underlying processes are (see also Kubilius, Baeck, Wagemans, & Op de Beeck, 2015).
There is some overlap in the list of factors determining grouping and figure-ground assignment, but others apply to only one of the two forms of organization. There is a clear need for a systematic analysis of the common factors (and whether they are common because they affect the same component process, or because the same factor just happens to influence two independent processes in the same way), as well as of the organization-specific factors (and whether this specificity is due to a major functional difference or is merely a side effect of task demands). In general, a thorough examination of the specific task requirements induced by the stimulus and imposed by the instructions is needed to be able to determine the processes involved and the potential generalization beyond the test conditions.

For instance, in the research aimed at the quantification of grouping by proximity summarized above, dot lattices are used, and observers are asked to indicate in which orientation they see the linear arrangements of dots. Stimuli are highly regular, percepts are multi-stable (near equilibrium), and phenomenal reports are requested. Grouping involves all elements here, and the selection is at the level of percepts. When one orientation is seen, the others are still present in the stimulus. It is probably the noise in the visual system (i.e., internal noise) that causes switching from one percept to another. The situation is quite different in research aimed at the quantification of good continuation, in which random arrays of Gabor elements are mostly used, and observers are asked to detect or locate the target group ("snake") embedded in a background of noise elements. Here, noise is present in the stimulus (i.e., external noise), and target elements must be selected for proper grouping. A participant's response can be regarded as correct or incorrect, relative to the intended target group, although it is always possible that an observer truly sees a (spurious) group in the background elements, leading to a false alarm or mislocalization. How can we expect grouping principles to generalize between two such fundamentally different situations?
Further progress with respect to theoretical integration will depend on experiments that bridge the gaps between different experimental paradigms, starting from analyses such as the above. A similar recommendation applies to the connection between contour grouping, contour integration, contour completion, and the like. Stimulus and task differences complicate a theoretical synthesis. A major limitation of these studies is that they usually do not deal with contours in the sense of boundary lines with the function of closing an area or region (Kogo & Wagemans, 2013a, 2013b). A "snake" in a Gabor array is a curvilinear group; it is not a contour, nor a boundary of a figure or an object (although the name itself accidentally refers to some object). The literature on perceptual grouping in the context of interrupted or noisy contours in real images would be much more directly relevant to figure-ground organization if their snake stimuli were to be supplemented with curved groups that have more potential as boundaries of surfaces. We have been pursuing an extensive research program using this approach (e.g., Machilsen et al., 2009; Machilsen & Wagemans, 2011; Mijovic et al., 2014; Nygård, Van Looy, & Wagemans, 2009; Sassi et al., 2010, 2012; Vancleef & Wagemans, 2013; Vancleef et al., 2013a; Vancleef et al., 2013b), but more work is clearly needed.

Figure-ground organization in context

The same holds true in the other direction as well: Figure-ground organization could be related more strongly to perceptual grouping. Progress regarding figure-ground organization could also profit from a more fine-grained analysis of the different components involved. One is the process of segregating regions (based on the relative similarity within a region/group and dissimilarity between different regions/groups), but this can also include the specification of the contour with all of its relevant geometric properties (incl.
grouping the contour fragments or linking the multiple borderline signals at different locations in the visual field) and the specification of the relationships to relevant properties of the configuration within which the contour is embedded. An additional process is figure-ground assignment, which includes the integration of multiple border-ownership signals and the overall border-ownership assignment. Moreover, such an interplay must be embedded into a dynamic system with cooperative and competitive units, with its own proper balance between deterministic and stochastic characteristics, to allow for perceptual switching in cases of multistability (Kogo, Galli, & Wagemans, 2011).

Kogo and van Ee (2015) have integrated this literature in a recent review chapter, which I will summarize briefly here, to illustrate the progress that is possible by putting the pieces of the puzzle together in a broader context. In essence, figure-ground organization requires the computation of depth order at the borders between abutting regions in the projection of the scene on the retina. The border-ownership (or BOWN) is assigned to the surface that is closer to the viewer, consistent with the border being perceived as the edge of the surface. Note that border signals and BOWN signals have fundamentally different properties: The border signals indicate solely the existence of the border, while the BOWN signals specify the polarity associated with the owner side of the border.

The laboratory of von der Heydt was the first to demonstrate that neural activity associated with border-ownership is present in macaque visual cortex (Zhou, Friedman, & von der Heydt, 2000). With single-unit recordings, they first specified the RF size, as well as the orientation tuning of neurons in V1, V2, and V4. They then presented images such as those shown in Figure 14, in such a way that a region border covered the RF and matched the preferred orientation of the neuron. While they kept the geometrical properties within the RF exactly the same, they modified the global context (see Figure 14, top panels).
In panels 1 and 2, the edge is a transition from white to grey (from left to right), while in panels 3 and 4, the edge has an opposite luminance polarity (i.e., from grey to white). Within each contrast polarity, however, the perceived figure-ground relationship differs (and thus the BOWN as well). In panel 1, for example, we perceive a white square on top of a grey background, while in panel 2, we see a grey square on top of a white background. In other words, although the local properties within the RF are the same, the perceived ownership of the border is reversed. A similar reversal of BOWN occurs for panels 3 and 4, with a perceived grey and white square, respectively. Crucially, the responses of the neurons were consistently associated with the preference of the perceived owner side of the border (see Figure 14, bottom panels). Moreover, the proportion of orientation-sensitive neurons that was also BOWN-sensitive was higher in V2 (59%) and V4 (53%) than in V1 (18%), suggesting hierarchical processing.
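The logic of this four-panel design can be made concrete with a small sketch. It assumes, following the description of Figure 14 above, that the figure lies to the left of the border in panels 1 and 3 and to the right in panels 2 and 4, while panels 1 and 2 share one contrast polarity and panels 3 and 4 the other. The index definitions, the function name, and the firing rates are illustrative assumptions, not the analysis or data of Zhou et al. (2000).

```python
def bown_indices(r1, r2, r3, r4):
    """Modulation indices from mean responses to panels 1-4.

    Assumed layout: figure left of the border in panels 1 and 3, right in
    panels 2 and 4; panels 1-2 share one contrast polarity, 3-4 the other.
    """
    total = r1 + r2 + r3 + r4
    if total <= 0:
        raise ValueError("responses must sum to a positive value")
    side = ((r1 + r3) - (r2 + r4)) / total      # > 0: prefers figure on the left
    contrast = ((r1 + r2) - (r3 + r4)) / total  # > 0: prefers white-to-grey edge
    return side, contrast


# Hypothetical firing rates (spikes/s) for a cell that signals ownership
# regardless of contrast polarity.
side, contrast = bown_indices(r1=30.0, r2=10.0, r3=30.0, r4=10.0)
print(side, contrast)
```

Under these assumptions, a cell whose response follows the owner side of the border yields a large side index and a contrast index near zero, whereas a cell driven only by local edge polarity would show the reverse pattern.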

Figure 14. Border-ownership coding in V2 cells, independent of contrast polarity (adapted from Zhou et al., 2000).

If these neurons are truly the neuronal entities involved in BOWN computations, they should be strongly involved in depth perception. A follow-up study from the same lab (Qiu & von der Heydt, 2005) investigated this and found that 21% of neurons in V2 (and 3% in V1) exhibited responses that were consistently tuned to the depth-order based on contextual figure-ground cues and the depth-order based on stereo-disparity cues. The onset latency of the BOWN-sensitive component of the responses was also extremely short (75 ms from stimulus-onset) and did not differ much between a small and a large rectangle (Sugihara, Qiu, & von der Heydt, 2011). The context-sensitive nature of BOWN implies that the underlying neural mechanisms must involve global interactions, which in turn implies that the signals must travel a long distance within an extremely short period. These aspects provide important constraints for developing neural models because the fast signal processing in the BOWN computation cannot be explained by horizontal connections (Craft et al., 2007; Sugihara et al., 2011; Zhang & von der Heydt, 2010; Zhou et al., 2000). Instead, the global interactions in the BOWN computation are most likely achieved by feedforward-feedback loops. Such loops are physiologically realistic because it has been shown that the feedforward-feedback connections involve myelinated axons with conduction velocities that are about ten times faster than those of the horizontal connections. In addition, if the signals are conducted vertically between layers, the size of the figural surfaces would have less influence on the conduction distances.
Based on this analysis, von der Heydt and colleagues proposed that the collective BOWN signals that are consistent with the presence of convex, enclosed contours activate a hypothetical grouping cell at a higher processing level (V4 and above), and that

GESTALT PSYCHOLOGY AND

GESTALT PSYCHOLOGY AND GESTALT PSYCHOLOGY AND CONTEMPORARY VISION SCIENCE: PROBLEMS, CHALLENGES, PROSPECTS JOHAN WAGEMANS LABORATORY OF EXPERIMENTAL PSYCHOLOGY UNIVERSITY OF LEUVEN, BELGIUM GTA 2013, KARLSRUHE A century of Gestalt

More information

Principals of Object Perception

Principals of Object Perception Principals of Object Perception Elizabeth S. Spelke COGNITIVE SCIENCE 14, 29-56 (1990) Cornell University Summary Infants perceive object by analyzing tree-dimensional surface arrangements and motions.

More information

Using Perceptual Grouping for Object Group Selection

Using Perceptual Grouping for Object Group Selection Using Perceptual Grouping for Object Group Selection Hoda Dehmeshki Department of Computer Science and Engineering, York University, 4700 Keele Street Toronto, Ontario, M3J 1P3 Canada hoda@cs.yorku.ca

More information

Perceptual Organization (II)

Perceptual Organization (II) (II) Introduction to Computational and Biological Vision CS 202-1-5261 Computer Science Department, BGU Ohad Ben-Shahar Why do things look they way they do? [Koffka 1935] External (Environment) vs. Internal

More information

Announcements. Perceptual Grouping. Quiz: Fourier Transform. What you should know for quiz. What you should know for quiz

Announcements. Perceptual Grouping. Quiz: Fourier Transform. What you should know for quiz. What you should know for quiz Announcements Quiz on Tuesday, March 10. Material covered (Union not Intersection) All lectures before today (March 3). Forsyth and Ponce Readings: Chapters 1.1, 4, 5.1, 5.2, 5.3, 7,8, 9.1, 9.2, 9.3, 6.5.2,

More information

Visual Perception 6. Daniel Chandler. The innocent eye is blind and the virgin mind empty. - Nelson Goodman. Gestalt Principles of Visual Organization

Visual Perception 6. Daniel Chandler. The innocent eye is blind and the virgin mind empty. - Nelson Goodman. Gestalt Principles of Visual Organization Visual Perception 6 Daniel Chandler The innocent eye is blind and the virgin mind empty. - Nelson Goodman Gestalt Principles of Visual Organization In discussing the 'selectivity' of perception I have

More information

Chapter 5: Perceiving Objects and Scenes

Chapter 5: Perceiving Objects and Scenes Chapter 5: Perceiving Objects and Scenes The Puzzle of Object and Scene Perception The stimulus on the receptors is ambiguous. Inverse projection problem: An image on the retina can be caused by an infinite

More information

Chapter 5: Perceiving Objects and Scenes

Chapter 5: Perceiving Objects and Scenes PSY382-Hande Kaynak, PhD 2/13/17 Chapter 5: Perceiving Objects and Scenes 1 2 Figure 5-1 p96 3 Figure 5-2 p96 4 Figure 5-4 p97 1 Why Is It So Difficult to Design a Perceiving Machine? The stimulus on the

More information

9.65 Sept. 12, 2001 Object recognition HANDOUT with additions in Section IV.b for parts of lecture that were omitted.

9.65 Sept. 12, 2001 Object recognition HANDOUT with additions in Section IV.b for parts of lecture that were omitted. 9.65 Sept. 12, 2001 Object recognition HANDOUT with additions in Section IV.b for parts of lecture that were omitted. I. Why is visual perception difficult? II. Basics of visual perception A. Gestalt principles,

More information

Computational Architectures in Biological Vision, USC, Spring 2001

Computational Architectures in Biological Vision, USC, Spring 2001 Computational Architectures in Biological Vision, USC, Spring 2001 Lecture 11: Visual Illusions. Reading Assignments: None 1 What Can Illusions Teach Us? They exacerbate the failure modes of our visual

More information

What is mid level vision? Mid Level Vision. What is mid level vision? Lightness perception as revealed by lightness illusions

What is mid level vision? Mid Level Vision. What is mid level vision? Lightness perception as revealed by lightness illusions What is mid level vision? Mid Level Vision March 18, 2004 Josh McDermott Perception involves inferring the structure of the world from measurements of energy generated by the world (in vision, this is

More information

SYMPOSIUM EXPERIMENTAL METHODS IN PERCEPTUAL ORGANIZATION

SYMPOSIUM EXPERIMENTAL METHODS IN PERCEPTUAL ORGANIZATION SYMPOSIUM EXPERIMENTAL METHODS IN PERCEPTUAL ORGANIZATION TEAP 2012, MANNHEIM TUESDAY APRIL 3, ROOM 0 148, 9:00-11:00 Symposium overview Convenors: Tandra Ghose & Johan Wagemans 09:00 Johan Wagemans (Leuven,

More information

PSY 310: Sensory and Perceptual Processes 1

PSY 310: Sensory and Perceptual Processes 1 Wilhelm Wundt Gestalt Psychology PSY 310 Established the first true psychology laboratory in 1879 University of Leipzig (Germany) Greg Francis Tried to identify basic elements of perception Similar to

More information

Psychology of visual perception C O M M U N I C A T I O N D E S I G N, A N I M A T E D I M A G E 2014/2015

Psychology of visual perception C O M M U N I C A T I O N D E S I G N, A N I M A T E D I M A G E 2014/2015 Psychology of visual perception C O M M U N I C A T I O N D E S I G N, A N I M A T E D I M A G E 2014/2015 EXTENDED SUMMARY Lesson #4: Oct. 13 th 2014 Lecture plan: GESTALT PSYCHOLOGY Nature and fundamental

More information

Natural Scene Statistics and Perception. W.S. Geisler

Natural Scene Statistics and Perception. W.S. Geisler Natural Scene Statistics and Perception W.S. Geisler Some Important Visual Tasks Identification of objects and materials Navigation through the environment Estimation of motion trajectories and speeds

More information

(Visual) Attention. October 3, PSY Visual Attention 1

(Visual) Attention. October 3, PSY Visual Attention 1 (Visual) Attention Perception and awareness of a visual object seems to involve attending to the object. Do we have to attend to an object to perceive it? Some tasks seem to proceed with little or no attention

More information

Today: Visual perception, leading to higher-level vision: object recognition, word perception.

Today: Visual perception, leading to higher-level vision: object recognition, word perception. 9.65 - Cognitive Processes - Spring 2004 MIT Department of Brain and Cognitive Sciences Course Instructor: Professor Mary C. Potter 9.65 February 9, 2004 Object recognition HANDOUT I. Why is object recognition

More information

Goodness of Pattern and Pattern Uncertainty 1

Goodness of Pattern and Pattern Uncertainty 1 J'OURNAL OF VERBAL LEARNING AND VERBAL BEHAVIOR 2, 446-452 (1963) Goodness of Pattern and Pattern Uncertainty 1 A visual configuration, or pattern, has qualities over and above those which can be specified

More information

Gestalt theories of perception

Gestalt theories of perception Gestalt theories of perception THE MOST IMPORTANT LECTURE YOU WILL EVER ATTEND!!!!! Talk about the journey to this point GESTALT PRINCIPLES Gestalt psychology Gestalt psychology was founded in 1910 by

More information

Grouping by similarity is mediated by feature selection: evidence from the failure of cue combination

Grouping by similarity is mediated by feature selection: evidence from the failure of cue combination Psychon Bull Rev (2015) 22:1364 1369 DOI 10.3758/s13423-015-0801-z BRIEF REPORT Grouping by similarity is mediated by feature selection: evidence from the failure of cue combination Liqiang Huang Published

More information

THE ENCODING OF PARTS AND WHOLES

THE ENCODING OF PARTS AND WHOLES THE ENCODING OF PARTS AND WHOLES IN THE VISUAL CORTICAL HIERARCHY JOHAN WAGEMANS LABORATORY OF EXPERIMENTAL PSYCHOLOGY UNIVERSITY OF LEUVEN, BELGIUM DIPARTIMENTO DI PSICOLOGIA, UNIVERSITÀ DI MILANO-BICOCCA,

More information

Intelligent Object Group Selection

Intelligent Object Group Selection Intelligent Object Group Selection Hoda Dehmeshki Department of Computer Science and Engineering, York University, 47 Keele Street Toronto, Ontario, M3J 1P3 Canada hoda@cs.yorku.ca Wolfgang Stuerzlinger,

More information

Visual Design. Simplicity, Gestalt Principles, Organization/Structure

Visual Design. Simplicity, Gestalt Principles, Organization/Structure Visual Design Simplicity, Gestalt Principles, Organization/Structure Many examples are from Universal Principles of Design, Lidwell, Holden, and Butler 1 Why discuss visual design? You need to present

More information

The Structuralist Approach

The Structuralist Approach The Structuralist Approach Approach established by Wundt (1830-1920) States that perceptions are created by combining elements called sensations Popular in mid to late 19 th century Wundt studied conscious

More information

Are there Hemispheric Differences in Visual Processes that Utilize Gestalt Principles?

Are there Hemispheric Differences in Visual Processes that Utilize Gestalt Principles? Carnegie Mellon University Research Showcase @ CMU Dietrich College Honors Theses Dietrich College of Humanities and Social Sciences 2006 Are there Hemispheric Differences in Visual Processes that Utilize

More information

Lecture 2.1 What is Perception?

Lecture 2.1 What is Perception? Lecture 2.1 What is Perception? A Central Ideas in Perception: Perception is more than the sum of sensory inputs. It involves active bottom-up and topdown processing. Perception is not a veridical representation

More information

Sensation & Perception PSYC420 Thomas E. Van Cantfort, Ph.D.

Sensation & Perception PSYC420 Thomas E. Van Cantfort, Ph.D. Sensation & Perception PSYC420 Thomas E. Van Cantfort, Ph.D. Objects & Forms When we look out into the world we are able to see things as trees, cars, people, books, etc. A wide variety of objects and

More information

B.A. II Psychology - Paper A. Form Perception. Dr. Neelam Rathee. Department of Psychology G.C.G.-11, Chandigarh

B.A. II Psychology - Paper A. Form Perception. Dr. Neelam Rathee. Department of Psychology G.C.G.-11, Chandigarh B.A. II Psychology - Paper A Form Perception Dr. Neelam Rathee Department of Psychology G.C.G.-11, Chandigarh Form Perception What it is? How do we recognize an object? (form perception) 2 Perception of

More information

Perceptual Organization and Pattern Recognition. Lecture 15

Perceptual Organization and Pattern Recognition. Lecture 15 Perceptual Organization and Pattern Recognition Lecture 15 1 Gibson s Ecological View Direct Perception All information needed for perception is supplied by the stimulus Perceptual systems evolved to extract

More information

Perception. Chapter 8, Section 3

Perception. Chapter 8, Section 3 Perception Chapter 8, Section 3 Principles of Perceptual Organization The perception process helps us to comprehend the confusion of the stimuli bombarding our senses Our brain takes the bits and pieces

More information

IAT 355 Perception 1. Or What You See is Maybe Not What You Were Supposed to Get

IAT 355 Perception 1. Or What You See is Maybe Not What You Were Supposed to Get IAT 355 Perception 1 Or What You See is Maybe Not What You Were Supposed to Get Why we need to understand perception The ability of viewers to interpret visual (graphical) encodings of information and

More information

Theoretical Neuroscience: The Binding Problem Jan Scholz, , University of Osnabrück

Theoretical Neuroscience: The Binding Problem Jan Scholz, , University of Osnabrück The Binding Problem This lecture is based on following articles: Adina L. Roskies: The Binding Problem; Neuron 1999 24: 7 Charles M. Gray: The Temporal Correlation Hypothesis of Visual Feature Integration:

More information

Competing Frameworks in Perception

Competing Frameworks in Perception Competing Frameworks in Perception Lesson II: Perception module 08 Perception.08. 1 Views on perception Perception as a cascade of information processing stages From sensation to percept Template vs. feature

More information

Competing Frameworks in Perception

Competing Frameworks in Perception Competing Frameworks in Perception Lesson II: Perception module 08 Perception.08. 1 Views on perception Perception as a cascade of information processing stages From sensation to percept Template vs. feature

More information

Traditional and new principles of perceptual grouping Joseph L. Brooks, School of Psychology, University of Kent, UK

Traditional and new principles of perceptual grouping Joseph L. Brooks, School of Psychology, University of Kent, UK Traditional and new principles of perceptual grouping Joseph L. Brooks, School of Psychology, University of Kent, UK To appear in: Oxford Handbook of Perceptual Organization Oxford University Press Edited

More information

Note:- Receptors are the person who receives any images from outer environment.

Note:- Receptors are the person who receives any images from outer environment. Concept According to Oxford Advanced Learner s Dictionary, Perception means the way you notice things especially with the senses. Perception is the process of organizing and attempting to understand the

More information

Gestalt Principles of Grouping

Gestalt Principles of Grouping Gestalt Principles of Grouping Ch 4C depth and gestalt 1 There appears to be some inherent cognitive process to organize information in a simple manner (nativist perspective). Without some sort of mental

More information

Sensation vs. Perception
