Researchers from MIT and Rice University have developed a computer vision technique that uses reflections to image the world. Their method turns glossy objects into “cameras,” enabling a user to see the world as if they were looking through the “lenses” of everyday objects. The technology could have an important impact on the safety and navigation of autonomous vehicles.
Because reflections are distorted and offer only a limited view of the world, the team developed a three-step technique called ORCa (Objects as Radiance-Field Cameras). First, the system captures images of a glossy object from many different angles. It then converts the object’s surface into a virtual sensor that records the reflections in those images. Finally, an AI system maps the reflections back onto the environment, allowing it to estimate depth in the scene and render novel views that would only be visible from the object’s perspective.
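In essence, turning a glossy surface into a virtual sensor rests on a standard reflection calculation: the ray from the real camera to a point on the surface is bounced about the surface normal, and the bounced ray acts as a “virtual pixel” looking out into the environment. The sketch below is a minimal illustration of that geometry in Python/NumPy for a single point on a toy sphere; the function names and the simplified sphere setup are assumptions for illustration, not the authors’ implementation.

```python
import numpy as np

def reflect(direction, normal):
    """Reflect a viewing ray about a surface normal: r = d - 2 (d . n) n."""
    direction = direction / np.linalg.norm(direction)
    normal = normal / np.linalg.norm(normal)
    return direction - 2.0 * np.dot(direction, normal) * normal

def virtual_pixel_ray(camera_pos, surface_point, surface_normal):
    """Turn one observed point on a glossy surface into a 'virtual pixel':
    a ray that starts at the surface point and heads out into the environment."""
    incoming = surface_point - camera_pos          # ray from the real camera to the surface
    outgoing = reflect(incoming, surface_normal)   # bounced ray seen in the reflection
    return surface_point, outgoing

# Toy example: a camera at the origin looking at a point on a unit sphere
# centered at (0, 0, 3). For a sphere, the normal is simply (point - center).
camera = np.array([0.0, 0.0, 0.0])
center = np.array([0.0, 0.0, 3.0])
offset = np.array([0.3, 0.2, -0.93])
offset = offset / np.linalg.norm(offset)           # put the point exactly on the unit sphere
point = center + offset
normal = offset

origin, direction = virtual_pixel_ray(camera, point, normal)
print("virtual ray origin:   ", origin)
print("virtual ray direction:", direction)
```

Repeating this calculation for every observed point on the object yields a dense bundle of virtual rays, each looking out at a slightly different part of the surrounding scene.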
“We have demonstrated that any surface can be converted into a sensor using this formulation that transforms objects into virtual pixels and sensors. This technique holds enormous potential across multiple domains,” said Kushagra Tiwary, co-lead author of the research.
ORCa not only lets an observer see around corners or past objects that block their view; it can also estimate the depth between the shiny object and other elements in the scene, and predict the object’s shape. ORCa models the scene as a 5D radiance field, which captures information about the intensity and direction of the light rays that emanate from and strike each point in the scene.
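In this context, a 5D radiance field is a function that takes a 3D position together with a 2D viewing direction and returns the color and density of light at that point along that direction, in the spirit of NeRF-style models. The snippet below is a minimal sketch of such a function as a small neural network in PyTorch; the architecture, layer sizes, and names are illustrative assumptions rather than the network the researchers trained.

```python
import torch
import torch.nn as nn

class RadianceField5D(nn.Module):
    """Toy 5D radiance field: maps a 3D position and a 2D viewing direction
    to an RGB color and a volume density. Layer sizes are arbitrary placeholders."""

    def __init__(self, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(5, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),                 # 3 color channels + 1 density
        )

    def forward(self, position, direction):
        # direction is expressed as (azimuth, elevation), so the full input is 5D
        x = torch.cat([position, direction], dim=-1)
        out = self.net(x)
        color = torch.sigmoid(out[..., :3])       # RGB in [0, 1]
        density = torch.relu(out[..., 3:])        # non-negative volume density
        return color, density

# Query the field at one point along one virtual-ray direction.
field = RadianceField5D()
pos = torch.tensor([[0.3, 0.2, 2.1]])            # 3D point in the scene
view = torch.tensor([[0.1, -0.4]])               # 2D viewing direction (angles)
rgb, sigma = field(pos, view)
print(rgb.shape, sigma.shape)                    # torch.Size([1, 3]) torch.Size([1, 1])
```

Rendering a novel view then amounts to querying such a function at sample points along each virtual ray produced by the reflection step and compositing the returned colors and densities.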
“It was especially challenging to go from a 2D image to a 5D environment. You have to make sure that mapping works and is physically accurate, so it is based on how light travels in space and how light interacts with the environment. We spent a lot of time thinking about how we can model a surface,” Tiwary said.
Future work will explore applications in drone imaging, in which ORCa could use the faint reflections from objects a drone flies over to reconstruct the scene below. The researchers also plan to enhance the system by incorporating additional visual cues, such as shadows, to recover hidden information, and by combining reflections from two objects to image new parts of a scene.