Manipulator and Object Tracking for In-Hand Model Acquisition

UW Robotics and State Estimation Lab

In this project we address the problem of active object investigation using robotic manipulators and depth sensors. To do so, we jointly tackle the issues of sensor-to-robot calibration, manipulator tracking, and 3D object model construction.

Project Contributors

Michael Krainin, Dieter Fox, Peter Henry

Overview

Many tasks in mobile robotics and related fields could benefit from the ability of robots to autonomously collect object models. Allowing robots to investigate objects alleviates the need for humans to provide such models and additionally provides data that is helpful for object classification and other tasks. The goal of this project is to give robots equipped with depth sensors a means of actively exploring objects through the use of their manipulators. We base our technique on the Articulated Iterative Closest Point algorithm (A-ICP) for matching pairs of point-clouds. One point-cloud is obtained by ray-tracing a CAD model of the robot, while the other is captured by the depth sensor. We provide an elegant, Kalman filter-based framework for simultaneously adjusting sensor-to-robot calibration, tracking the joint angles of the robot, and integrating the different views of the object into a single object model.
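
As a rough illustration of the point-cloud matching step, the sketch below runs plain rigid ICP in 2D using only the Python standard library. It is not the articulated variant (A-ICP) used in this project: it aligns a single rigid body rather than a kinematic chain, and the helper names (nearest, icp_step, apply_transform, icp, mean_error) as well as the toy data are assumptions made for this example.

```python
import math

def nearest(point, cloud):
    """Return the point in `cloud` closest to `point` (brute force)."""
    return min(cloud, key=lambda q: math.dist(point, q))

def icp_step(model, scan):
    """One ICP iteration: pair each model point with its nearest scan point,
    then solve in closed form for the 2D rotation and translation that
    minimize the summed squared distance between the pairs."""
    pairs = [(p, nearest(p, scan)) for p in model]
    n = len(pairs)
    pcx = sum(p[0] for p, _ in pairs) / n
    pcy = sum(p[1] for p, _ in pairs) / n
    qcx = sum(q[0] for _, q in pairs) / n
    qcy = sum(q[1] for _, q in pairs) / n
    # Cross and dot sums of the centered pairs give the optimal angle.
    s_cross = sum((p[0] - pcx) * (q[1] - qcy) - (p[1] - pcy) * (q[0] - qcx)
                  for p, q in pairs)
    s_dot = sum((p[0] - pcx) * (q[0] - qcx) + (p[1] - pcy) * (q[1] - qcy)
                for p, q in pairs)
    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation maps the rotated model centroid onto the scan centroid.
    tx = qcx - (c * pcx - s * pcy)
    ty = qcy - (s * pcx + c * pcy)
    return theta, tx, ty

def apply_transform(points, theta, tx, ty):
    """Rotate each point by `theta` about the origin, then translate."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def icp(model, scan, iterations=20):
    """Repeatedly re-match and re-align the model to the scan."""
    current = list(model)
    for _ in range(iterations):
        current = apply_transform(current, *icp_step(current, scan))
    return current

def mean_error(cloud_a, cloud_b):
    """Mean distance from each point of `cloud_a` to its nearest neighbor in `cloud_b`."""
    return sum(math.dist(p, nearest(p, cloud_b)) for p in cloud_a) / len(cloud_a)

# Toy data (assumption): an L-shaped "model" and a slightly displaced "scan".
model = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
scan = apply_transform(model, 0.1, 0.05, -0.03)
aligned = icp(model, scan)
print(mean_error(model, scan), mean_error(aligned, scan))
```

In the setting described above, the model cloud would come from ray-tracing the robot's CAD model and the scan from the depth sensor, and the unknowns would include joint angles and calibration rather than one rigid transform.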

Model Construction

Object modeling and manipulator tracking are not typically solved together; however, solving them jointly is beneficial because each problem informs the other. Tracking the manipulator provides information on how the object itself moves, which is needed for constructing the model. Modeling the object, on the other hand, enables tracking the object, which in turn improves manipulator tracking. To capture these dependencies, we include points from the model in the ray-traced point cloud, attached to the palm via a soft link. We additionally introduce state priors into A-ICP to allow the algorithm to effectively balance between alternatives when ambiguities exist among sensor calibration, joint angles, or object pose. We combine the encoder outputs of the robot with A-ICP estimates using a Kalman filter. Finally, the state estimates from the Kalman filter allow for effective hand vs. object segmentation in the depth scans as well as incorporation of new object points into the model.
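
To make the fusion of encoder outputs and A-ICP estimates more concrete, here is a minimal one-dimensional Kalman filter sketch for a single joint angle. It assumes, purely for illustration, that the encoder change drives the prediction step and that the A-ICP angle estimate acts as the measurement; the actual state in this project also covers calibration and object pose. The function names and all numeric values are hypothetical.

```python
def predict(mean, var, encoder_delta, encoder_var):
    """Prediction step: shift the angle estimate by the encoder-reported
    change and inflate the variance by the encoder noise."""
    return mean + encoder_delta, var + encoder_var

def update(mean, var, measured_angle, measurement_var):
    """Measurement step: blend the prediction with an external angle
    estimate (here standing in for the A-ICP result)."""
    gain = var / (var + measurement_var)          # Kalman gain in [0, 1]
    new_mean = mean + gain * (measured_angle - mean)
    new_var = (1.0 - gain) * var
    return new_mean, new_var

# Hypothetical numbers: angles in radians, variances in radians squared.
angle, var = 0.0, 0.01
angle, var = predict(angle, var, encoder_delta=0.10, encoder_var=0.01)
angle, var = update(angle, var, measured_angle=0.14, measurement_var=0.02)
print(angle, var)
```

The gain weights whichever source is currently more certain: a noisy encoder yields a large predicted variance, so the estimate moves further toward the measurement.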

Initial Results

Below we show the result of hand tracking on noisy encoder data. In grey are the points from our depth sensor, in white is the ray-traced point-cloud for the noisy angles, and in red is the cloud for the recovered angles. Notice that while the white cloud has drifted from the proper manipulator alignment, the red has remained in good agreement with the sensor data.

Using the output of our Kalman filter, we are able to classify points from the depth sensor as being part of the hand or the object. To do so, we use the relationship of the sensor points to the expected poses of the hand and object. In the image below, we show classification results for a hand holding a soda can, a mug, and a paper cup. Hand points are shown in red, whereas object points are shown in blue. Also pictured are the object models constructed from multiple stereo frames.
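
One simple way to picture this classification is a nearest-neighbor rule, sketched below. This is an assumption made for illustration, not necessarily the criterion used in this project: each sensor point is labeled by whichever expected cloud (hand or object) is closer, and points far from both are treated as background. The function names, the 2 cm threshold, and the sample points are hypothetical.

```python
import math

def min_distance(point, cloud):
    """Distance from `point` to the closest point in `cloud` (inf if empty)."""
    return min((math.dist(point, q) for q in cloud), default=math.inf)

def classify(sensor_points, hand_points, object_points, max_dist=0.02):
    """Label each sensor point as 'hand', 'object', or 'background'.

    `hand_points` and `object_points` are the expected 3D points of the hand
    and of the current object model, in the sensor frame (meters).
    """
    labels = []
    for p in sensor_points:
        d_hand = min_distance(p, hand_points)
        d_object = min_distance(p, object_points)
        if min(d_hand, d_object) > max_dist:
            labels.append("background")   # explained by neither model
        elif d_hand <= d_object:
            labels.append("hand")
        else:
            labels.append("object")
    return labels

# Hypothetical sample data, in meters.
hand = [(0.0, 0.0, 0.0), (0.01, 0.0, 0.0)]
obj = [(0.10, 0.0, 0.0)]
sensor = [(0.005, 0.0, 0.0), (0.098, 0.0, 0.0), (0.5, 0.5, 0.5)]
labels = classify(sensor, hand, obj)
print(labels)
```

An empty object cloud is handled so that the same rule can run before any object points have been added to the model.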