THE PERCEPTION AND MEMORY OF STEREOSCOPIC DEPTH INFORMATION IN NATURALISTIC OBJECTS

Technical Report
June 20, 1996

Thomas A. Busey, Ph.D.
Indiana University

Please send correspondence to:
Thomas A. Busey
Department of Psychology
Indiana University
Bloomington, IN 47405
email: busey@indiana.edu

STEREO PROJECT REPORT PAGE 1

Abstract

In two experiments with 162 participants, we examined the role of stereoscopic depth cues in the memory for objects and scenes. Stereoscopic depth cues are formed by disparities in the locations of objects in the images formed on the two eyes, and are naturally used by most observers to recover the distances to objects in the world. Using naturalistic pictures, we determined that stereo depth cues do persist in memory, as long as the observer is given several seconds to view the image and extract the depth relations from the scene. However, we did not find direct evidence that stereoscopic depth cues increase the rate of information extraction from scenes. One implication of the present work is that objects that will be recognized in the real world should be initially presented using stereoscopic depth cues.

The world around us has three dimensions, although we often represent scenes using flat, 2D representations such as photographs or movies. However, by virtue of the geometry of space and the separation of our two eyes by several inches, we perceive two slightly different views of the real world. These different views introduce subtle changes in the locations of objects. Most observers can use these differences, or disparities, to recover the distances of objects in depth. The use of disparities as depth cues is known as stereoscopic depth, and these cues enable us to perceive objects in three dimensions. However, stereoscopic depth cues are by no means the only cues to depth. Texture, shading, relative size, position, accommodation and occlusion are all monocular depth cues, and in many cases, such as viewing a photograph, these monocular cues are sufficient to reconstruct the relative depth positions of objects in space. Thus while stereoscopic depth cues are available to observers, they are redundant in many respects, leaving open the question of whether observers even use stereoscopic depth cues during normal perception.

The purpose of the present work is to examine how observers use stereoscopic depth cues, and how these cues influence memory for objects presented in 3D. The perception of stereo depth cues has been thoroughly addressed by a number of researchers, most notably Bela Julesz (Julesz, 1971), who demonstrated that observers can use stereoscopic cues for perception, both in the presence and absence of monocular depth cues. Despite this groundbreaking work, more recent investigations have yet to derive a satisfactory theory of stereopsis (Shipley, 1987). Moreover, the utility of stereo depth cues has recently been challenged by Grimson (1993), who argued that stereoscopic depth cues are not used primarily as cues to depth relations, but are instead used for figure-ground segmentation.
In this view, a first step towards object recognition is separating surfaces that belong to different objects. This figure-ground segmentation problem is solved quickly and accurately by stereo depth cues, where different surfaces are assigned different depths in space and therefore are likely to belong to different objects. Grimson argues that the use of stereoscopic disparities to recover accurate depth requires precise calibration of the viewing apparatus, and that such precision may be beyond the capacities of the human visual system. Figure-ground segmentation by stereo depth cues is indeed a very fast operation, although we can also perform figure-ground segmentation on scenes without disparities, such as photographs.

To summarize, the role of stereo depth cues in perception is still an open question. Observers can use stereo depth cues to perceive objects, but whether they actually rely on these cues during normal perception is less certain. While the use of stereoscopic depth cues for perceiving objects has been widely studied, the representation of these depth cues in memory has received little attention, and

the question of whether stereoscopic depth cues are preserved in memory remains open. Memory for pictures is by no means a distortion-free process, and considerable recoding of the visual information may take place. Images that are preserved in memory are subject to characteristic biases such as contraction to include objects and area outside the picture frame (Intraub, Bender & Mangels, 1992). In addition to these biases, a recoding of the visual information into more abstract verbal information may take place, in which depth relations are lost. For example, an observer might remember that a scene contained a boy, a dog and a tree, but not the relative locations of each object in depth.

Research on the use of stereoscopic depth cues in object recognition gives some clues to the representation of objects in memory, although very few experiments have addressed the long-term maintenance of these cues in memory. The debate centers on whether objects are represented in memory as a linked series of viewpoint-specific two-dimensional images (Tarr and Pinker, 1989) or as viewpoint-invariant three-dimensional representations (e.g. the geon-based GSD theory of Biederman and colleagues: Biederman and Gerhardstein, 1993; Hummel and Biederman, 1992). The former view is consistent with the storage of a relatively large number of raw images, from which little abstraction takes place. The latter view holds that observers create three-dimensional mental representations that extract depth relations to create mental models that are more isomorphic to the physical object than to the perceptual image that generated the initial percept of the object. This new representation allows mental operations such as rotation and even structural changes that are analogous to physical manipulations of real objects. Various models exist along the continuum defined by these two positions.
However, for the purposes of the current article the important issues are how objects are maintained in memory, and whether these representations include stereoscopic depth cues. Most experiments do not use stereoscopic depth cues, and thus any three-dimensional memory representation must be derived from monocular depth cues or inferred from multiple views of the same novel object. However, Edelman and Bulthoff (1992) examined the addition of stereoscopic depth cues to novel objects. Observers were trained on a series of novel objects under conditions of full stereoscopic depth cues. They were then tested with identical and similar objects in different orientations, and indicated whether each object was one they had previously seen. The error rate decreased by a factor of two when stereo cues were added during the test session (Edelman and Bulthoff, 1992), suggesting that depth cues were useful during object recognition. However, their overall results are at odds with theories that propose three-dimensional, viewpoint-invariant memory representations. While it does appear that observers construct abstractions based on different views of the same object,

such abstractions appear to be collections of specific views of an object, assisted by limited depth information that provides a partial viewer-centered three-dimensional representation (Bulthoff, Edelman and Tarr, 1994).

In summary, stereoscopic depth cues appear to assist the segregation of objects from each other, and provide relative depth information that is used to create abstractions or collections of views based on individual perceptual images. Under this view, object recognition is seen as a process of interpolation from the nearest image-based representation in memory to match the currently viewed object. This leaves the role of stereoscopic depth cues in memory in a somewhat uncertain state: while they are clearly used for perception and can assist object recognition, whether they persist in memory is still an open question, one addressed by the current article.

Two Models

Based on the object recognition and binocular vision literature, we can construct two (non-exclusive) models of how stereoscopic depth cues might enhance perception and memory. Before describing these models, a brief description of the experimental paradigm is in order. We will then describe the models and derive predictions for each.

The current experiments use three-dimensional photographs taken with a dual-camera stereo rig. These photographs are scanned and displayed on 21-inch color monitors with polarizing-filter stereoscopic display capabilities. The experiments typically have two parts: a study session, in which 60 pictures are shown sequentially, and then a test session in which the 60 old pictures are intermixed with 60 new pictures. The participants in the experiment go through each test picture and indicate whether the picture was in the original list (old) or is a new picture (new). We add to this the stereo manipulation: a study picture could either be presented with stereoscopic depth cues (STEREO) or without these cues (FLAT).
The flat pictures are generated by presenting either the left or right picture to both eyes. This stereo/flat manipulation is duplicated for the test pictures as well, giving four possible conditions as indicated by Table 1. This design systematically varies the cues that are used to probe memory, and provides data that test the following two models of the effects of stereoscopic depth cues.

                         Study Depth
Test Depth    Flat                        Stereo
Flat          Study Flat, Test Flat       Study Stereo, Test Flat
Stereo        Study Flat, Test Stereo     Study Stereo, Test Stereo

Table 1. Four conditions for Experiments 1 and 2.

The Speed-Up Model

Following the views of Julesz, Grimson and others who have argued that stereoscopic depth cues can be and are used in perception, one might predict that these additional cues

serve to make stereo pictures more memorable or more durable in memory. Under the Speed-Up Model, stereoscopic depth cues assist the initial perception of 3D scenes, either through figure-ground segregation or by increasing the rate of information extraction. Thus even if stereoscopic depth cues do not persist in memory, they do make 3D pictures more memorable by providing more information in memory.

The predictions of the Speed-Up Model are straightforward. Items that are studied in stereo are recognized more often than items that are studied flat, regardless of how they are tested. Thus this model predicts an overall main effect of studying a picture in stereo versus studying the picture flat.

One issue that is relevant to the Speed-Up model is that of novelty: stereo pictures are novel for our participants, and therefore more interesting than the flat pictures. Thus participants may pay more attention to stereo pictures, which would give results consistent with the Speed-Up model. We tried to address this issue in three ways. First, we gave participants ample experience with stereo pictures at the beginning of the experiment. We had them view a very well-composed scenic picture that had a great deal of depth, and we explained how stereo worked while they viewed this picture. We then gave a practice minisession in which they had an opportunity to see several other stereo pictures. Second, we told our participants that half of the pictures would be in stereo and half would be flat, and we told them that they would have to recognize both kinds later on. While neither of these may have been sufficient to completely eliminate novelty effects, one could argue that the mechanism behind an increase in information processing is simply increased attention on the part of the participants.
This would require an acknowledgment (perhaps long overdue) that participants bring expectations, interests and emotions to experiments, and that these can affect their data.

The Depth Coding Model

Under the Depth Coding model, stereoscopic depth cues are stored in memory, either as an intrinsic part of a three-dimensional representation (such as Biederman's Geon theory) or extrinsically as part of a list of features (e.g. "the picture contained a boy, a dog and a tree, and it was in depth"). Here we assume that the stereoscopic depth cues which contribute to perception are relevant to the object and the task, and are thus stored in memory.

The predictions of the Depth Coding model follow the Encoding Specificity Hypothesis of Tulving & Thomson (1973), which states that recognition memory will be best when the study and test conditions match. Thus recognition performance will be best in the study flat/test flat and study stereo/test stereo conditions and poor in the study stereo/test

flat and study flat/test stereo conditions. The Depth Coding model would be supported by finding a crossover interaction between study and test conditions.

Note that the Speed-Up and Depth Coding models are not exclusive: it is possible, even reasonable, to assume that stereo pictures at study would provide more information about an object, but also remain in memory and be probed best by a test stimulus that also contains stereo depth cues. In this case we would expect an overall main effect of study depth combined with a crossover interaction between study and test depth.

Experiments 1 and 2 are designed to confirm or disconfirm each of these two models independently. Experiment 1 uses pictures of cars, trees and hallways presented for 5 seconds at study, while Experiment 2 uses the same stimuli but presents each picture for 600 milliseconds (ms) at study. To anticipate, in Experiment 1 we find evidence for the Depth Coding Model, but not for the Speed-Up Model. In Experiment 2 we find an overall main effect of test condition, suggesting that the development of a robust three-dimensional memory representation may take several seconds.

Experiment 1

Experiment 1 addresses two questions: do stereoscopic depth cues assist in the perception of naturalistic scenes, and do they remain in memory to make pictures viewed in stereo more memorable? Note that neither answer may be inferred a priori from the mere fact that the stimuli contain stereoscopic 3D cues: participants without the ability to perceive stereoscopic depth cues still perceive the pictures and can still do quite well in our experiments. Thus the real question is whether participants can and will use stereoscopic cues as aids to memory, or whether they rely on more abstract representations devoid of the original depth cues.

Methods

Experiment 1 adopts a standard picture-memory paradigm using stereo and flat pictures of cars, trees and hallways.
Participants viewed the scenes on two 21-inch color monitors that present 3D images using LCD circularly-polarizing panels.

Stimuli and Apparatus

The stereo and flat pictures were presented using two 21-inch color monitors equipped with Tektronix LCD stereoscopic panels and controlled by a Pentium computer. Participants wore passive polarizing glasses whose lenses each allowed only the left or only the right image through. The left and right images of a stereo pair were shown alternately on the monitor at 45 Hz per image (giving a stereo rate of 90 Hz), and the panels were synchronized with these images such that each eye saw only one image. When fused, these two pictures give a vivid, flicker-free impression of depth.

The pictures were taken with two 35 mm cameras with 50 mm lenses. The separation between the lenses was 12.7 cm, which is slightly larger than the typical separation between most observers' eyes. This resulted from physical limitations of the cameras. To compensate, we followed standard formulas from the stereo photography literature and made certain that the closest object was at least 15 feet from the camera. This prevents unresolvable disparities outside Panum's fusional limit (Julesz, 1971) and minimizes the slight depth exaggeration caused by the increased stereo base. These pictures were then scanned via a slide scanner and cropped to a screen resolution of 800x600 pixels, shown in thousands-of-colors mode (16 bits per pixel).

Up to 6 people could participate at once, and were arranged in two rows of desks facing the two 21-inch color monitors. Participants responded on numeric keypads connected to a Macintosh computer that could determine which keypad corresponded to which participant. This computer also controlled the stimulus display computer via a serial cable.

Design and Procedure

The experiment consisted of three study-test sessions, in which participants viewed 20 pictures of one class of scenes (cars, trees or hallways) and then viewed 40 pictures, half of which were new and half of which were from the original list. Test pictures from the original list are Targets, while new pictures in the test session are Distracters. Participants indicated for each picture in the test session whether they thought the picture was in the original list (old) or was a new picture (new). At study the pictures were shown either in stereo or flat for 5 seconds each, with about 4 seconds between pictures. Participants were told to expect a test session after each series of pictures, and were given a short practice session at the beginning of the experiment with 3 practice study pictures and 6 practice test pictures.
Study and test pictures could each be shown either flat or in stereo, giving the four conditions shown in Table 1. During the study session, each picture was preceded by a warning tone. During the test session, the picture remained on the screen until all participants had responded.

Participants

Participants were 80 Indiana University undergraduate students. Prior to the experiment each participant was given a stereo acuity test (Stereo Reindeer, Randot Corporation) and coded according to their stereo acuity. Six participants completed the experiment but were excluded from the data analysis due to a lack of stereo depth perception.

Results and Discussion

Results for Experiment 1 are shown in Figure 1. Performance is computed as the hit rate corrected for the false alarm rate, (Hits - FAs)/(1 - FAs). Individual false alarm rates were computed for each participant, and used to correct that individual's hit rates. Because there were two depth conditions in the test session, the distracters were also shown either in stereo or flat. This gives two separate false alarm rates, and these were used to correct the hit rates that corresponded to how that condition was tested (e.g. both the study flat/test stereo and study stereo/test stereo conditions were corrected with the stereo false alarm rate). The flat false alarm rate is 0.257 (SEM = 0.011) and the stereo false alarm rate is 0.270 (SEM = 0.016); these rates were not significantly different. Overall performance was quite high in this experiment, due in part to the relatively long 5 second exposure durations during the study session.

[Figure 1: "Stereo Experiment 1". Recognition accuracy, (Hits-FAs)/(1-FAs), plotted against test condition (Test Flat, Test Stereo), with separate curves for Study Flat and Study Stereo.]

Figure 1. Data from Experiment 1. The data show a crossover interaction that is consistent with the Depth Coding model, demonstrating that stereo depth cues do persist in memory and are used when recognizing a previously-seen image. There is no main effect of study condition, which fails to support the Speed-Up hypothesis. Apparently stereo depth cues do not provide an encoding benefit, at least for 5 second exposure durations. Error bars represent one standard error of the mean.

A two-way ANOVA reveals no significant main effect of study depth (F(1,60) = 0.28, p > 0.05), but a significant interaction

(F(1,60) = 5.4, p < 0.05). Inspection of the Figure 1 data reveals that this interaction is a crossover interaction: memory for pictures is best when a picture is studied and tested at the same depth. Memory is poorest for conditions in which the study and test depth differ.

The crossover interaction evident in the Experiment 1 data confirms the Depth Coding model, and demonstrates that stereoscopic depth cues are preserved in memory and assist in recognition memory. However, recognition for pictures that were studied in stereo was overall no better than recognition for pictures that were studied flat. This argues against the Speed-Up model, suggesting that stereoscopic depth cues do not increase the rate of information acquisition for picture-related information during study.

Experiment 2

The Experiment 1 data support the Depth Coding model, although this effect is modest (around 10% benefit for pictures tested in the same depth as they were studied). In addition, the data do not support the Speed-Up model, which is surprising given Grimson's arguments for stereoscopic cues as aids to figure-ground segregation. However, it is likely that such speed-up effects would assist perception only during the initial stages of encoding a scene. Perhaps the long 5 second exposure duration minimized the encoding advantages of the stereo depth cues. In addition, the high recognition rates seen in the Experiment 1 data may have reduced the size of the crossover interaction through ceiling effects.

In Experiment 2 we attempted to address both limitations of Experiment 1 by greatly reducing the exposure duration at study, from 5 seconds to 600 ms. If stereo depth cues give an advantage during encoding, we might see this advantage with such short exposure durations.
We were concerned that reducing the exposure duration would eliminate the stereo percept entirely, and so we had participants give confidence ratings during the study session that indicated whether they thought each picture was in stereo or not.

Methods

Experiment 2 is similar to Experiment 1, with two exceptions: confidence ratings were taken at study to determine whether participants could perceive stereoscopic depth cues, and the exposure duration at study was much shorter (600 ms).

Procedure

During the study sessions, each trial was preceded by a warning tone, and then the target picture appeared in either the flat or the stereo depth condition for 600 ms. Following stimulus offset, the participants were asked to give a confidence rating on a 5-point scale indicating how confident they were that the picture was in depth (1 = "very little depth", 3 = "some depth", 5 = "a lot of depth"). Once all participants had responded the computer initiated the next trial.

The test session was identical to Experiment 1.

Participants

Participants were 82 Indiana University undergraduate students. Prior to the experiment they were administered the same stereo acuity test as in Experiment 1. Seven participants failed this acuity test and their data were excluded from the analysis as a result.

Results and Discussion

Participants had no trouble distinguishing between flat and stereo study pictures during the study session. The mean confidence rating for flat pictures was 2.23, and the mean confidence rating for stereo pictures was 3.90. This difference is statistically significant (t(30) = 19.0, p < 0.001). Of primary interest is whether this ability to perceive stereo translates into either an overall main effect of studying pictures in stereo (vs. flat) or the encoding specificity effects seen in Experiment 1. The answer is no on both counts.

Figure 2 shows the Experiment 2 data. Surprisingly, the data show a main effect of test depth: recognition accuracy was highest when pictures were tested flat, regardless of how they were studied. A two-way ANOVA reveals no main effect of study depth, and no significant interaction. The main effect of test depth was significant (F(1,60) = 9.23, p < 0.01). This main effect, however, is attributable to the stereo false alarm rate: contrary to Experiment 1, the flat and stereo false alarm rates differed significantly (t(30) = 2.59, p < 0.01). In Experiment 2, the flat false alarm rate was 0.364 (SEM = 0.015) and the stereo false alarm rate was 0.433 (SEM = 0.022). This difference accounts for the main effect of test condition in Figure 2. When the raw hit rates for the four conditions are compared, they do not differ significantly, indicating that the decrease in performance in both stereo test conditions comes from a decrease in overall sensitivity when stereo pictures are used as cues, together with a bias shift that preserves the same overall hit rates.
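The arithmetic behind this argument can be sketched in a few lines. The sketch below applies the report's correction, (Hits - FAs)/(1 - FAs), to the two false alarm rates reported for Experiment 2, holding the raw hit rate fixed. The hit rate of 0.60 is a hypothetical placeholder, since the report states only that raw hit rates did not differ. The report also does not say how sensitivity and bias were quantified; the `d_prime` and `criterion` functions use the standard equal-variance signal detection formulas purely as one common way to express those two quantities, and all function names are illustrative.

```python
from statistics import NormalDist


def corrected_accuracy(hit_rate: float, fa_rate: float) -> float:
    """Hit rate corrected for false alarms: (Hits - FAs) / (1 - FAs)."""
    if not 0.0 <= fa_rate < 1.0:
        raise ValueError("false alarm rate must be in [0, 1)")
    return (hit_rate - fa_rate) / (1.0 - fa_rate)


def d_prime(hit_rate: float, fa_rate: float) -> float:
    """Equal-variance signal detection sensitivity: z(H) - z(FA).

    ASSUMPTION: this measure is not taken from the report.
    """
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


def criterion(hit_rate: float, fa_rate: float) -> float:
    """Response bias c = -(z(H) + z(FA)) / 2; lower means more 'old' responses.

    ASSUMPTION: this measure is not taken from the report.
    """
    z = NormalDist().inv_cdf
    return -0.5 * (z(hit_rate) + z(fa_rate))


# False alarm rates reported for Experiment 2.
FA_FLAT = 0.364
FA_STEREO = 0.433

# HYPOTHETICAL raw hit rate, identical for both test depths.
HIT = 0.60

for label, fa in (("test flat", FA_FLAT), ("test stereo", FA_STEREO)):
    print(
        label,
        round(corrected_accuracy(HIT, fa), 3),
        round(d_prime(HIT, fa), 3),
        round(criterion(HIT, fa), 3),
    )
```

With equal raw hit rates, the higher stereo false alarm rate alone lowers the corrected score for pictures tested in stereo, and under these assumptions it corresponds to both a smaller d' and a more liberal criterion, which is the pattern the text describes.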

Clearly the test pictures containing stereo depth cues are reducing memory performance. However, there may be evidence for encoding specificity that would explain the significant main effect of test depth. In Experiment 2, the study pictures were on for only 600 ms, while at test the pictures remained visible until all participants had responded. Thus even in conditions where the study and test depths matched, the durations of the stimuli differed, and this difference may have resulted in an encoding specificity difference that produced the Experiment 2 data. While it is possible that 600 ms provides enough information to determine whether a picture is in depth or not, it may not provide enough information to extract the rich depth relations provided by these depth cues. At test, participants were encouraged to make accurate responses, and they had ample time (up to 10 seconds) to decide whether they had seen a picture before or not. Thus not only could they perceive the stereoscopic depth cues, they also had ample opportunity to use these depth cues to extract the rich depth relations of the objects in the scene. The 600 ms presentations at study did not allow for such extraction, and thus the representations engendered by the stereo test pictures were poor cues to memory.

[Figure 2: plot titled "Stereo Experiment 2" showing recognition accuracy, (Hits-FAs)/(1-FAs), by test condition (Test Flat, Test Stereo), with separate series for Study Flat and Study Stereo.]

Figure 2. Data from Experiment 2. The data show a main effect of test condition, which is attributable to a loss in sensitivity and a shift in bias to preserve the same overall hit rate for slides tested stereo. This results in a higher false alarm rate for stereo distracters as well, which, when used to correct targets tested in stereo, results in reduced performance compared with testing with flat slides.

The lack of an interaction between study and test depth fails to support the Depth Encoding model that was confirmed by Experiment 1, but this failure may result from the duration differences described above. The lack of a main effect of study depth, which is evident both for pictures tested flat and for pictures tested stereo, fails to support the Speed-Up model.

General Discussion

To summarize the results from the two experiments presented here, we find evidence for the Depth Coding model for long (5 second) exposure durations but not for short (600 ms) presentations. This demonstrates that stereo depth cues persist in memory and can be used in later picture recognition tasks. We do not find evidence for the Speed-Up model at any exposure duration, despite the fact that the stereo percept develops, at least partially, for pictures presented at very short exposure durations (600 ms).

To reconcile these findings, we suggest that when pictures with stereoscopic depth cues are viewed, two sets of processes occur. The extraction of stereo depth cues is a very fast process (Grimson, 1993) that enables observers to identify the existence of stereoscopic disparities in pictures presented for as briefly as 600 ms (Experiment 2). However, such cues appear not to increase the overall information extraction rate, but instead may simply change what information is acquired. These depth cues, while clearly visible to participants, may only assist in the memory for pictures that receive a great deal of encoding (Experiment 1).
This lengthy encoding period (5 seconds) may give participants an opportunity to develop an object-oriented representation that preserves depth relations in memory. Rather than remembering simply what objects were in the scene, the participant now stores the locations of these objects in depth and develops a rich mental model that is analogous to the physical object.

Implications for Applied Technologies

The findings reported here demonstrate that stereo depth cues are perceived, remain in memory, and influence future recognition of the same object. A useful metaphor is that of a lock and key: the memory trace produced by perceiving an object is only accessible if the appropriate key is used to probe the trace. In this case, when objects with stereo depth cues are placed in memory, retrieval will be enhanced when the objects are tested with the same depth cues present.

A common applied use of object recognition data such as these is in television advertising and virtual reality. In these situations, viewing an object is analogous to our study session, and recognizing it later is analogous to the test session. These data suggest that for optimal recognition, objects should be studied with the same depth cues that they are tested with. This implies that since objects are recognized in the real world using stereo 3D cues, they should be presented on television or in virtual reality displays using stereoscopic depth cues. Using the analogy from above, this sets up a lock (a trace in memory) that will be best opened using the key that contains stereo 3D cues (a real-world object).

The present work looked only at static images, but depth cues are often enhanced when displays include motion. Stereoscopic depth cues help observers build a 3D mental model of the structure of an object, including its shape and the locations of important features. However, such information is available to some degree through motion: a rotating object also appears to have depth. When motion and stereo depth cues are combined in the same display, they work synergistically to provide a vivid illusion of the actual object appearing in front of the display. The more cues to depth are added to a display, the more likely the person is to develop a mental representation that is analogous to the real-world object, and thus the more likely they are to correctly recognize the physical object. Thus these findings are not only applicable to moving displays, but they are likely to have a larger effect on the encoding and storage of the objects.

References

Biederman, I. & Gerhardstein, P. C. (1993). Recognizing depth-rotated objects: Evidence and conditions for three-dimensional viewpoint invariance. Journal of Experimental Psychology: Human Perception and Performance, 19, 1162-1182.
Bülthoff, H., Edelman, S. & Tarr, M. (1994). How are three-dimensional objects represented in the brain? (A.I. Memo No. 1479). MIT.
Edelman, S. & Bülthoff, H. H. (1992). Orientation dependence in the recognition of familiar and novel views of 3D objects. Vision Research, 32, 2385-2400.
Grimson, W. E. (1993). Why stereo vision is not always about 3D reconstruction (A.I. Memo No. 1435). MIT.
Hummel, J. & Biederman, I. (1992). Dynamic binding in a neural network for shape recognition. Psychological Review, 99, 480-517.
Intraub, H., Bender, R. & Mangels, J. (1992). Looking at pictures but remembering scenes. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 180-191.
Julesz, B. (1971). Foundations of cyclopean perception. Chicago: The University of Chicago Press.
Shipley, T. (1987). Field processes in stereovision. Documenta Ophthalmologica, 66, 95-170.
Tarr, M. J. & Pinker, S. (1989). Mental rotation and orientation-dependence in shape recognition. Cognitive Psychology, 21, 233-282.
Tulving, E. & Thomson, D. M. (1973). Encoding specificity and retrieval processes in episodic memory. Psychological Review, 80, 352-373.