Journal of Experimental Psychology: General
1994, Vol. 123, No. 3, 316-320
Copyright 1994 by the American Psychological Association, Inc. 0096-3445/94/$3.00

COMMENT

Grouped Locations and Object-Based Attention: Comment on Egly, Driver, and Rafal (1994)

Shaun P. Vecera

Recently, R. Egly, J. Driver, and R. D. Rafal (1994) provided evidence for an object-based component of visual orienting in a simple cued reaction time task. However, the effects of objects on visual attention can be due to selection from either of two very different types of representations: (a) a truly object-based representation that codes for object structure or (b) a grouped array representation that codes for groups of spatial locations. Are Egly et al.'s results due to selection from an object-based representation or from a grouped array representation? This question was addressed by using a variant of Egly et al.'s task. The findings replicated those of Egly et al. and demonstrated that the selection in this task is mediated through a grouped array representation. The implications of these results for studies of attentional selection are discussed.

In the past, the study of attentional selection has primarily focused on how visual attention selects stimuli on the basis of spatial location in the visual field. However, in recent years there has been an increase in the study of how visual attention selects stimuli on the basis of shape or structure. The former approaches have come to be known as spatial or location-based models of attention, and the latter have come to be known as object-based models of attention. (The literature concerning these two types of selection is quite large and continually growing, so it will not be reviewed here.
The reader is referred to the following for reviews and characteristic positions: Duncan, 1984; Egly, Driver, & Rafal, 1994; Eriksen & Eriksen, 1974; Kramer & Jacobson, 1991; Posner, 1980; Posner, Snyder, & Davidson, 1980; Vecera & Farah, 1994.)

Author note: This work was supported in part by a Sigma Xi Grants-in-Aid of Research. I thank Martha Farah for her comments and encouragement on this work, and I thank Marlene Behrmann, MaryLou Cheal, Roberta Klatzky, David Plaut, and Bob Rafal for further discussion of these ideas. Thanks also to Kendra Gilds for her assistance in running subjects and analyzing data. Correspondence concerning this article should be addressed to Shaun P. Vecera, Department of Psychology, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213-3890. Electronic mail may be sent to vecera+@cmu.edu.

Egly, Driver, and Rafal (1994) used a clever procedure that was reported to show both location-based and object-based components of visual selection in normal subjects as well as differential impairments in these attentional components in parietal-damaged patients. The results from the normal subjects are of primary interest here. Subjects were shown two rectangles that appeared either above and below fixation or to the left and right of fixation. One of the corners was cued by brightening, and a target followed. Subjects made a simple reaction time (RT) response to the onset of the target, a procedure similar to Posner's classic spatial cuing paradigm (see Posner & Cohen, 1984). The cues were either valid (cue and target at same corner) or invalid (cue and target at different corners). Not surprisingly, Egly et al. found a validity effect: Subjects were faster to respond to validly cued trials than to invalidly cued trials. The finding of interest, however, comes from an analysis of the invalid trials. These trials involved movements of attention either within a rectangle or across rectangles.
Although the spatial separation of these two conditions was identical, Egly et al. found that subjects were faster to respond to an invalidly cued target when that target was in the rectangle that had previously been cued (a within-rectangle shift) as compared with a target that appeared in the other, uncued rectangle (an across-rectangles shift). Egly et al. referred to this nonspatial component of orienting as object-based attentional selection. They then showed that these two aspects of attentional selection were dissociable in brain-damaged subjects. Patients with damage to the left parietal lobe showed a deficit in moving attention between objects in the contralesional visual field, whereas patients with damage to the right parietal lobe showed a deficit in moving attention between spatial locations in the contralesional field relative to the ipsilesional field.

Although Egly et al.'s data convincingly showed a nonspatial aspect of attentional selection, is the term object-based attention warranted? Recently it has become clear that the effects of objects in the visual field on attentional selection can be explained within a modified location-based framework (see Kramer & Jacobson, 1991; Vecera & Farah, 1994). It might be that these effects are due to selection from a location-based representation in which the locations have been parsed or grouped according to whether the locations belong to one object or another. This grouped array selection would not be object-based selection; rather, it would be a modified location-based selection in which groups of locations are selected. The underlying representational medium would remain spatial (or location-based).

Kramer and Jacobson (1991) provided empirical evidence
for selection from a grouped array representation. Subjects performed a response competition task in which the target and distractors were embedded in the same object or in different objects. Subjects showed more competition from the distractors when targets were embedded in the same object as the distractors as compared with when the target and distractors were grouped in different objects. However, this effect was reduced if the target and distractors were moved farther away from one another, although grouping within the same object produced more interference than grouping within different objects. That is, the amount of response competition was a function not only of object grouping (same object vs. different object) but also of spatial position (distractors near the target vs. distractors far from the target).

Although this grouped array account might explain all of the so-called object-based effects, Vecera and Farah (1994) recently demonstrated that certain types of selection may not be mediated through a grouped array. Using a variant of Duncan's (1984) object-selection task, they found a decrement in selecting from two objects as compared with selecting from a single object (as Duncan originally found), but this effect was independent of whether the objects were spatially adjacent (i.e., superimposed) or spatially separated. This finding suggests that selection in Duncan's task is truly object-based in that object structure, independent of spatial location, is selected.

Given the findings of both grouped array selection and object-based selection, an important question is whether Egly et al.'s results were due to selection from a grouped array representation or from an object-based representation. This question is addressed in the experiment below. Egly et al.'s paradigm was used, with one minor modification, namely, a spatial manipulation.
That is, the corners of the rectangles could be equidistant, as in Egly et al.'s work (the far condition), or the corners of the two different rectangles could be closer than the two corners of a single rectangle (the near condition).

The grouped array and object-based accounts make differing predictions as to subjects' performance in these two conditions. If selection in this task is truly object based, then the rectangle itself is being selected. When the cue is invalid, subjects should always be faster if the target is on the same (cued) rectangle (a within-rectangle shift of attention) than when the target is on the other (uncued) rectangle (an across-rectangles shift of attention). This finding would appear as a main effect for type of attention shift (within rectangle vs. across rectangles). The grouped array account, however, suggests that the underlying representation is location based. This account would predict that moving the rectangles closer together should reduce the amount of cost in switching attention from one rectangle to the other. This would appear as an interaction between rectangle separation (near vs. far) and type of attention shift (within rectangle vs. across rectangles).

Method

Subjects

Subjects were 15 Carnegie Mellon University undergraduates. All reported having normal or corrected-to-normal vision.

Stimuli

Figure 1. Examples of stimuli used in the experiment. Panel labels: A. Large Separation ("Far" Condition); B. Small Separation ("Near" Condition); Horizontal Orientation; Vertical Orientation.

The displays used in this experiment were similar to those used by Egly et al. and are shown in Figure 1. Two rectangles appeared
either above and below or to the left and right of fixation. There were two amounts of spatial separation between the rectangles; they could be far from one another or they could be near one another. The far condition was a replication of Egly et al.'s experiment, and the near condition was used to test whether selection was modified in part by location. Note that the name of the "far" condition is a bit of a misnomer, because the corners of the rectangles were actually equidistant. Finally, the cues could be either valid (cue and target appearing at same corner) or invalid (cue and target appearing at different corners). Also note that there were two types of invalid cues, those in which the cue and target were within the same rectangle (within-rectangle shifts of attention) and those in which the cue and target were across rectangles (across-rectangle shifts of attention).

All figures (e.g., the fixation point and rectangles) were white drawn on a black background (i.e., the inverse of Figure 1). The fixation point measured 0.5 cm × 0.5 cm (0.48° × 0.48° of visual angle, respectively). The horizontal and vertical rectangles were identical except for orientation. Each of the rectangles measured 10 cm × 3.5 cm (9.46° × 3.34° of visual angle). The lines that composed the rectangles were 3 pixels wide. The cue consisted of the brightening of one of the ends of a rectangle. This brightening was achieved by having the width of the line increase from its original 3 pixels to 8 pixels. Note that this cuing procedure is slightly different from the one used by Egly et al., in which the corner switched from a gray line to a white line. This procedural difference does not have any theoretical implications for the present article, assuming that Egly et al.'s results are replicated in the far condition. Finally, the target was a gray box that appeared at one of the corners. The target measured 1.9 cm × 2.3 cm (1.81° × 2.2° of visual angle).
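
The visual angles reported for the stimuli can be checked against their physical sizes. The following is a minimal sketch, assuming the approximate 60 cm viewing distance given in the Procedure and assuming the conversion atan(size / distance). The article does not state which formula was used; this one is chosen only because it appears to reproduce the reported values. The function and variable names are illustrative.

```python
import math

# Approximate viewing distance reported in the Procedure (assumed exact here).
VIEWING_DISTANCE_CM = 60.0


def visual_angle_deg(size_cm, distance_cm=VIEWING_DISTANCE_CM):
    """Angle subtended by size_cm at distance_cm, using atan(size / distance).

    This conversion is an assumption; the article does not name its formula.
    """
    return math.degrees(math.atan(size_cm / distance_cm))


# (size in cm, visual angle in degrees as reported in the text)
reported = {
    "fixation point side": (0.5, 0.48),
    "rectangle long side": (10.0, 9.46),
    "rectangle short side": (3.5, 3.34),
    "target, 1.9 cm side": (1.9, 1.81),
    "target, 2.3 cm side": (2.3, 2.2),
}

for name, (size_cm, reported_deg) in reported.items():
    computed = visual_angle_deg(size_cm)
    print(f"{name}: {size_cm} cm -> {computed:.2f} deg (reported {reported_deg})")
```

Under these assumptions each computed value falls within 0.01 degrees of the reported one, which suggests, but does not confirm, that a conversion of this form was used.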
When the rectangles were far from one another, the center of the rectangle was 2.95 cm (2.81°) from the center of fixation. In this condition the four corners of the two rectangles were equidistant from one another; this distance measured 6.1 cm (5.81° of visual angle) from the center of the target in each corner. When the rectangles were near one another, the center of the rectangle was 2.05 cm (1.96°) from the center of fixation. In this condition the ends of the two rectangles were closer to one another (3.2 cm or 3.05° from the center of the target in each corner) than were the ends of a single rectangle (6.1 cm; see Figure 1). Thus, in the near condition, subjects would move attention a shorter spatial distance when switching from one rectangle to the other, whereas in the far condition the spatial distance was the same when moving either within or across rectangles.

Procedure

All stimuli were presented on a Macintosh Plus computer. Subjects sat approximately 60 cm from the monitor. Subjects first participated in 80 practice trials in which their eye movements were monitored by the experimenter through a mirror. Subjects were told that they should not make any eye movements. By the end of the practice, subjects made no eye movements. After this practice, subjects received 640 experimental presentations that were divided into 8 blocks of 80 trials each. Subjects were allowed to rest between blocks.

An individual trial began with a 1,000 ms fixation display that contained the fixation point and the two rectangles. After this display the cue was presented for 100 ms. The fixation display was then presented for another 200 ms (i.e., the interstimulus interval between the cue and target was 200 ms). Finally, the target appeared and remained visible until the subject responded. Subjects responded by pressing the space bar on a standard keyboard. Half of the subjects responded with their left hand and half responded with their right hand.
After each response the screen was blank for 500 ms before the next trial began.

The trials were distributed as follows. Half of the time the rectangles were horizontal and half of the time they were vertical. RTs to horizontal and vertical rectangles were averaged because Egly et al. found no effect for horizontal versus vertical presentation. Half of the time the rectangles were far from one another and half of the time they were near one another. Finally, 20% of the 640 trials were catch trials in which no target appeared. On these trials the fixation display followed the cue for 2,000 ms. Subjects were told to withhold responses on the catch trials. They were also asked to say "error" if they made a false alarm; all subjects complied with this request. For the remaining trials in which a target was presented, 75% of the time the cue was valid, and 25% of the time it was invalid. Half of the invalid trials involved shifting attention within the same rectangle and half of the time they involved shifting attention across the two rectangles. Presentation of the trials was random.

Results

First, any RTs that were over 1,000 ms or less than 100 ms were excluded from the analyses. This trimming eliminated less than 1.5% of all of the data. For each subject the median RT for each condition was calculated; these median RTs were then analyzed with a within-subject analysis of variance (ANOVA). All analyses were conducted using the SuperANOVA software on the Macintosh.

The RTs for the far condition were analyzed first. Any subject who did not show faster responses to invalid cues within a rectangle as compared with those across rectangles was excluded from further analyses. This exclusion was done because these subjects showed a spatial selection strategy; that is, they did not show the nonspatial component described by Egly et al. Excluding these subjects should make it more difficult to find spatial effects in the near condition.
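
The trimming, per-condition medians, and exclusion criterion described above can be sketched as follows. This is only an illustration under stated assumptions: the data layout, the condition labels, and the function names are hypothetical, the example RTs are made up rather than taken from the experiment, and the original analyses were run in SuperANOVA, not in code like this.

```python
from statistics import median

# Trimming bounds stated in the Results: drop RTs under 100 ms or over 1,000 ms.
RT_MIN_MS, RT_MAX_MS = 100, 1000


def trim(rts):
    """Keep only RTs between 100 ms and 1,000 ms inclusive."""
    return [rt for rt in rts if RT_MIN_MS <= rt <= RT_MAX_MS]


def condition_medians(trials):
    """Median trimmed RT per condition for one subject.

    trials: iterable of (condition, rt_ms) pairs (hypothetical layout).
    Raises statistics.StatisticsError if a condition has no RTs left.
    """
    by_condition = {}
    for condition, rt in trials:
        by_condition.setdefault(condition, []).append(rt)
    return {cond: median(trim(rts)) for cond, rts in by_condition.items()}


def shows_within_advantage(medians):
    """Inclusion criterion: in the far condition, invalid within-rectangle
    trials must be faster than invalid across-rectangle trials."""
    return medians[("far", "within")] < medians[("far", "across")]


# Made-up RTs for one hypothetical subject (not data from the experiment).
example_trials = [
    (("far", "within"), 350), (("far", "within"), 362), (("far", "within"), 1200),
    (("far", "across"), 371), (("far", "across"), 380), (("far", "across"), 90),
]

medians = condition_medians(example_trials)
print(medians)
print("retain subject:", shows_within_advantage(medians))
```

In this sketch a subject is retained only if the far-condition median for within-rectangle invalid trials is strictly lower than that for across-rectangle invalid trials. How ties were handled is not stated in the text, so the strict comparison is an assumption.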
Three subjects were excluded on the basis of this criterion. These subjects may have explicitly used a spatial strategy to predict the onset of the target; however, it is also possible that there are individual differences in this task.

The mean RTs for cue type and spatial separation for 12 subjects appear in Figure 2. Subjects made false alarms on less than 1.5% of the total number of catch trials. As is evident from Figure 2, subjects were faster to respond to validly cued trials as compared with all invalidly cued trials, F(1, 11) = 19.74, p < .001. (Note that the invalid trials have not been separated by within-rectangle vs. across-rectangle shifts. The invalid RTs are due to all invalidly cued targets.) The main effect for spatial separation was not significant, F(1, 11) = 2.17, p > .15, suggesting that the slight spatial distance difference between the near and far conditions did not reliably influence attentional selection (see also Cheal & Lyon, 1989; Reuter-Lorenz & Fendrich, 1992, for consistent findings). Finally, the interaction between these two variables was not significant, F(1, 11) < 1, suggesting that the validity effects in the near and far conditions did not differ significantly from one another.

Next, the invalid cues were divided into within-rectangle shifts and across-object shifts. The mean RTs for invalid cue
type and spatial separation appear in Figure 3. The main effect for the shift of attention was significant, F(1, 11) = 45.33, p < .0001, with within-rectangle shifts resulting in faster RTs than across-rectangle shifts. The main effect for separation was not significant, as in the first analysis, F(1, 11) = 1.37, p > .25. Finally, and most important, the interaction between the two variables was significant, F(1, 11) = 9.50, p < .02, suggesting that the cost in shifting from one rectangle to another was influenced by the separation of the two rectangles. Planned comparisons were conducted on the invalid trials to test within- versus across-rectangle shifts for the near and far conditions separately. For the far condition, the difference was significant, t(11) = 115.54, p < .0001, replicating Egly et al.'s findings. Similarly, the difference for the near condition was also significant, t(11) = 45.83, p < .0001. This finding suggests that the structure that the rectangles imposed on the visual field still influenced the allocation of visual attention. This finding is discussed further below.

Figure 2. Mean reaction times to valid and invalid cues for the near and far conditions (panel title: Valid Versus Invalid Cues). Note that the error bars contain between-subject variability.

Figure 3. Mean reaction times for the invalid cues, broken down by whether the shift of attention was within a rectangle or between rectangles (panel title: Shifts Within Versus Between Rectangles; legend: Within Rectangle, Across Rectangles). Note that the error bars contain between-subject variability.

Discussion

There are several findings of theoretical relevance from this study. First, the results of the far condition replicate Egly et al.'s results. Subjects were faster to respond to targets that followed a valid cue than to those that followed an invalid cue. This finding confirms a spatial component of attentional selection. However, on the invalid trials alone, subjects were faster to respond when the cue and target were located within the same object than when they were located across different objects. This finding confirms a nonspatial component of attentional selection. Recall that Egly et al. referred to this component as object-based.

The second important finding calls into question the use of the term "object-based" in reference to this task. When the rectangles were moved closer to one another (the near condition), subjects were again faster to respond to validly cued targets than to invalidly cued targets, and this validity effect did not differ from the validity effect in the far condition. However, within the invalid trials, the movement of attention (within a rectangle vs. across rectangles) interacted with the distance of the two rectangles. That is, the cost in shifting between the two rectangles was significantly reduced when the two rectangles were moved closer to one another. This finding demonstrates that the nonspatial finding reported by Egly et al. can indeed be influenced by a spatial manipulation.

What implication does this interaction have for Egly et al.'s findings? The major implication of this result is that it suggests that the object-based selection argued for by Egly et al. is not object-based selection, assuming that object-based selection is unaffected by location manipulations (Vecera & Farah, 1994). If selection in this task were truly object-based, then manipulating the distance between the ends of the rectangles should have little if any effect on the movement of attention within and across rectangles. In particular, the advantage for within-rectangle shifts relative to between-rectangles shifts should have been the same in the far and near conditions, resulting in a main effect of attention shift but no interaction.

Given this, what type of selection is occurring in this task? The present results are consistent with attentional selection from a grouped location-based (or grouped array format) representation. This type of representation contains features or locations (or both) that have been grouped according to whether they belong to the same shape or not (see Kramer & Jacobson,
1991; Vecera & Farah, 1994). However, this type of representation is still location based, because if the shape changes locations, the group of locations also changes. Selection from this type of representation would still be location based (and thus would show spatial effects, as in the present experiment), although it would not be simple location-based selection, as with a spotlight, zoom lens, or other mechanism that does not respect perceptual groupings. It should also be noted that location-invariant object-based selection is not merely a theoretical construct. Recent results have supported object-based selection irrespective of object location (Vecera & Farah, 1994). Thus, a simple spatial manipulation can be diagnostic in determining whether selection is occurring from a grouped array or from an object-based representation.

Next, it is important to note that even in the near condition there was a significant difference between moving attention within an object as compared with moving attention between objects: Movements of attention within a rectangle were faster than movements across rectangles. (Note that this finding does not compromise the above finding and conclusion, because an object-based theory would predict no interaction between distance and movement of attention within and between rectangles.) This finding does, however, suggest that perceptual groupings are extremely strong in the influence they have on location-based attentional selection. In the near condition, even though a movement across objects was spatially smaller than a movement within an object, there was still an advantage for moving within an object. This suggests that the perceptual groups established by early and intermediate vision can dramatically influence the processing of subsequent visual mechanisms, such as spatial attention.

Finally, what implications do the present results have for the work that Egly et al. conducted with neurological patients?
With respect to this work, the present findings suggest that left-parietal-damaged subjects may not have a deficit in object-based attention per se, but rather they may have a deficit in attending to grouped array representations. Right-parietal-damaged subjects may have deficits in allocating location-based attention to spatial locations, independent of the structure in the visual field, as Egly et al. suggested. The present experiment does not address these issues, but it offers testable predictions based on the grouped array versus object-based distinction.

References

Cheal, M. L., & Lyon, D. (1989). Attention effects on form discrimination at different eccentricities. Quarterly Journal of Experimental Psychology, 41A, 719-746.
Duncan, J. (1984). Selective attention and the organization of visual information. Journal of Experimental Psychology: General, 113, 501-517.
Egly, R., Driver, J., & Rafal, R. D. (1994). Shifting visual attention between objects and locations: Evidence from normal and parietal lesion subjects. Journal of Experimental Psychology: General, 123, 161-177.
Eriksen, B. A., & Eriksen, C. W. (1974). Effects of noise letters upon the identification of a target letter in a nonsearch task. Perception & Psychophysics, 16, 143-149.
Kramer, A. F., & Jacobson, A. (1991). Perceptual organization and focused attention: The role of objects and proximity in visual processing. Perception & Psychophysics, 50, 267-284.
Posner, M. I. (1980). Orienting of attention. Quarterly Journal of Experimental Psychology, 32A, 3-25.
Posner, M. I., & Cohen, Y. (1984). Components of visual orienting. In H. Bouma & D. G. Bouwhuis (Eds.), Attention and performance X (pp. 531-556). London: Erlbaum.
Posner, M. I., Snyder, C. R. R., & Davidson, B. J. (1980). Attention and the detection of signals. Journal of Experimental Psychology: General, 109, 160-174.
Reuter-Lorenz, P. A., & Fendrich, R. (1992).
Oculomotor readiness and covert orienting: Differences between central and peripheral precues. Perception & Psychophysics, 52, 336-344.
Vecera, S. P., & Farah, M. J. (1994). Does visual attention select objects or locations? Journal of Experimental Psychology: General, 123, 146-160.

Received December 1, 1993
Revision received January 30, 1994
Accepted February 2, 1994