Vision Research Roadmaps
Links to pages explaining the direction of vision research projects at WG.
See also: Planning research roadmaps
Visual SLAM and 3D Reconstruction
Kurt
The goal of this project is to keep track of robot 3D motion using pose estimation from stereo vision. On a local basis, visual odometry gives fine-grained pose estimation for registration of sensor data from vision and ladar sensors. On a global basis, visual SLAM constructs a map of the environment, and keeps the pose estimates consistent within this map. One of the key features here is "place recognition", where images are quickly matched and registered against a global database of images. The robot should be able to 'wake up' anywhere and immediately figure out where it is, if it has been there before.
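As a rough illustration of the "wake up anywhere" behavior, place recognition can be sketched as a nearest-neighbor lookup that either returns a known place or reports that the place is new. Everything here is an assumption made for illustration: the descriptor vectors (for example, visual-word histograms), the cosine-similarity score, the 0.9 threshold, and the names `recognize_place` and `database` do not come from the project itself.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def recognize_place(query, database, min_similarity=0.9):
    """Return the id of the best-matching stored place, or None if no
    stored descriptor is similar enough (the robot has not been here)."""
    best_id, best_score = None, min_similarity
    for place_id, descriptor in database.items():
        score = cosine_similarity(query, descriptor)
        if score >= best_score:
            best_id, best_score = place_id, score
    return best_id

# Hypothetical database: one descriptor per previously visited place.
database = {"kitchen": [5, 0, 1, 0], "hallway": [0, 4, 0, 2]}

print(recognize_place([4, 0, 1, 0], database))  # close to the kitchen descriptor
print(recognize_place([1, 1, 1, 1], database))  # matches nothing well enough
```

A linear scan like this is only meant to show the accept-or-reject decision; matching quickly against a large global database would need an indexed search structure.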
Given consistent pose estimates, point clouds from stereo and ladar are correctly registered in a global frame. Abstraction methods in a 3D reconstruction pipeline create reduced data formats that are easier to use for applications such as obstacle avoidance. The 3D pipeline produces a consistent 3D geometry that is remembered as the robot moves away, and actively updated when re-viewed.
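To make the registration step concrete, here is a minimal sketch of mapping sensor-frame points into a global frame with a known pose, using p_global = R * p_sensor + t. The yaw-only rotation, the function names, and the example poses are assumptions chosen for illustration, not the project's actual pipeline.

```python
import math

def yaw_rotation(theta):
    """3x3 rotation matrix for a rotation of theta radians about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def to_global(points, rotation, translation):
    """Map sensor-frame points into the global frame: R * p + t."""
    registered = []
    for p in points:
        registered.append(tuple(
            sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
            for i in range(3)
        ))
    return registered

# Hypothetical example: one physical point observed from two robot poses.
# Pose A: at the origin, facing along +x. The point is 1 m ahead.
scan_a = to_global([(1.0, 0.0, 0.5)], yaw_rotation(0.0), (0.0, 0.0, 0.0))
# Pose B: at (1, -2, 0), facing along +y. The point is 2 m ahead.
scan_b = to_global([(2.0, 0.0, 0.5)], yaw_rotation(math.pi / 2), (1.0, -2.0, 0.0))

# With consistent poses, both observations land on (almost) the same
# global coordinates, up to floating-point rounding.
print(scan_a[0])
print(scan_b[0])
```

If the pose estimate for B were off, the two registered points would disagree, which is why consistent pose estimates matter for a merged point cloud.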
Object Recognition
Gary
The goal of this project is to detect and recognize objects and their affordances (what you can do with them). This is too broad for a PR2 release roadmap, so we define a subset aimed at the PR2 launch time frame:
For PR2 Release:
- We want to recognize manipulatable, mostly rigid, fairly Lambertian objects on a tabletop within 5 feet in moderate clutter;
- Segment objects and estimate their 3D pose accurately enough to grasp them;
- Speed up detection 100x vs. exhaustive search by using potential object features such as boundary edges.
Basically, for release, we want the robot to reliably identify several tens of fairly light, easily graspable objects so that planning work for tasks may be tested in the real world. However, the method must scale to thousands of objects, support incremental learning of new objects, and allow adding features (such as lighting, shading, and texture) so that it forms the basis for tackling articulated, deformable, and/or non-Lambertian objects.
People Tracking
Caroline and Jeremy
Robotic interaction with people differs from that with other objects. People are unpredictable and highly deformable, have a different notion of personal space, can willfully interact with the robot, and can teach a robot about its environment and tasks. Most importantly, utility to people is often a robot's main measure of success. The first steps in human-robot interaction are detecting people, tracking them, and estimating their pose and actions relative to the environment and other objects. This project aims to solve these problems in real time, using multiple sensors, in indoor environments with multiple occlusions.
In the near term, we intend to identify the number of people in a room and their approximate locations. This would allow the PR2 to better navigate its environment by increasing its distance to people, decreasing its distance to other objects, and better predicting where people (obstacles) may be in the future.
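One simple way to predict where a tracked person may be in the future is constant-velocity extrapolation from the two most recent observations. This is only a sketch under that assumption; the track format (time, x, y) and the function name are hypothetical, and a real tracker would likely filter noisy measurements first.

```python
def predict_position(track, horizon):
    """Extrapolate a person's (x, y) position `horizon` seconds past the
    last observation, assuming constant velocity between the two most
    recent observations. `track` is a list of (time, x, y) tuples."""
    if len(track) < 2:
        raise ValueError("need at least two observations")
    (t0, x0, y0), (t1, x1, y1) = track[-2], track[-1]
    dt = t1 - t0
    if dt <= 0:
        raise ValueError("observations must be in increasing time order")
    vx, vy = (x1 - x0) / dt, (y1 - y0) / dt
    return (x1 + vx * horizon, y1 + vy * horizon)

# Hypothetical track: a person moving along +x at 1 m/s.
track = [(0.0, 1.0, 2.0), (0.5, 1.5, 2.0)]
print(predict_position(track, 1.0))  # (2.5, 2.0)
```

A navigation planner could treat the predicted position as a moving obstacle and keep extra clearance around it.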
In the long term, we hope to identify people's poses and actions more precisely to allow for even better navigation and finer interactions (for example, passing an object to a person), and to provide context for object recognition (for example, a person sits on a chair).