Department or Program

Computer Science

Primary Wellesley Thesis Advisor

Ellen Hildreth

Additional Advisor(s)

Jeremy Wilmer

Additional Advisor

Eniana Mustafaraj


To navigate through the environment, recognize objects, and interact physically with object surfaces, we need to recover the 3-D layout of a visual scene from the 2-D images projected onto the eyes. A primary cue that the human visual system uses to perceive the depths of surfaces in the scene is stereo disparity (Marr & Poggio, 1979; Howard & Rogers, 2002; Brown, Burschka, & Hager, 2003; Harris & Wilcox, 2009). Stereo disparity arises from the difference in perspective between the two eyes: because each eye views the scene from a slightly different vantage point, an object can appear at slightly different positions in the left and right images. The human visual system detects this disparity in position and uses it to infer depth (Figure 1). For tasks such as recognizing and manipulating objects in the scene, it is important to segment the image into regions that belong to distinct objects. A strong cue to the presence of an object boundary is a large change in depth between two adjacent image regions. Stereo processing enables both the detection of these boundaries and the computation of the relative depth between the surfaces that meet at them.
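As a rough illustration of the two ideas in this paragraph, the sketch below converts disparity to depth and then flags large depth changes as candidate object boundaries. It is not the model developed in this thesis. It assumes an idealized pair of rectified, parallel pinhole cameras, for which depth is Z = f * B / d (focal length f in pixels, baseline B in meters, disparity d in pixels). The function names, the threshold, and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np


def depth_from_disparity(disparity, focal_px, baseline_m):
    """Depth Z = f * B / d for rectified, parallel cameras (an assumption).

    Pixels with zero or negative disparity are assigned infinite depth.
    """
    disparity = np.asarray(disparity, dtype=float)
    depth = np.full(disparity.shape, np.inf)
    valid = disparity > 0
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth


def depth_boundaries(depth_row, min_jump_m):
    """Return indices i where |depth[i + 1] - depth[i]| exceeds min_jump_m."""
    jumps = np.abs(np.diff(depth_row))
    return np.flatnonzero(jumps > min_jump_m)


# Hypothetical image row: a near surface (large disparity) next to a
# far surface (small disparity).
disparity_row = np.array([20.0, 20.0, 20.0, 5.0, 5.0, 5.0])

# Hypothetical camera parameters: 500 px focal length, 6.5 cm baseline.
depth_row = depth_from_disparity(disparity_row, focal_px=500.0, baseline_m=0.065)

# Flag adjacent pixels whose depths differ by more than 1 m (arbitrary threshold).
boundaries = depth_boundaries(depth_row, min_jump_m=1.0)

print(depth_row)    # near surface at 1.625 m, far surface at 6.5 m
print(boundaries)   # the depth jump lies between pixels 2 and 3
```

Because depth is inversely proportional to disparity, the same disparity difference corresponds to a much larger depth difference for distant surfaces than for near ones, so a fixed threshold on depth behaves differently from a fixed threshold on disparity.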