Visual Object Detection and Recognition

We have studied the problem of detecting unknown objects in indoor environments in an active and natural manner. We propose a visual saliency scheme that uses both color and depth cues to direct the system's attention, so that unknown objects can be detected at salient positions in a 3D scene. The 3D points at these salient positions are selected as seed points for generating object hypotheses based on 3D shape. We then perform multi-class labeling on a Markov random field (MRF) defined over the voxels of the 3D scene, combining cues from the object hypotheses and the 3D shape. The MRF results are further refined by merging labeled objects that are spatially connected and whose color histograms are highly correlated.
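The final refinement step can be illustrated with a minimal sketch. It assumes each labeled object is given as an array of RGB values, that spatial connectivity has already been computed as a list of adjacent label pairs, and that Pearson correlation is used to compare histograms. The function names, the bin count, and the threshold of 0.9 are illustrative assumptions, not values taken from the publications.

```python
import numpy as np

def color_histogram(colors, bins=8):
    """Normalized joint RGB histogram of a non-empty (N, 3) array in [0, 255]."""
    hist, _ = np.histogramdd(colors, bins=(bins,) * 3, range=((0, 256),) * 3)
    hist = hist.ravel()
    return hist / hist.sum()

def histogram_correlation(h1, h2):
    """Pearson correlation between two histograms (0.0 if either is constant)."""
    a = h1 - h1.mean()
    b = h2 - h2.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def merge_segments(segment_colors, adjacent_pairs, threshold=0.9):
    """Merge adjacent segments whose color histograms correlate strongly.

    segment_colors: dict mapping label -> (N, 3) RGB array (assumed input).
    adjacent_pairs: iterable of (label_a, label_b) that touch in 3D (assumed input).
    Returns a dict mapping each original label to its merged label.
    """
    parent = {label: label for label in segment_colors}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    hists = {label: color_histogram(c) for label, c in segment_colors.items()}
    for a, b in adjacent_pairs:
        if histogram_correlation(hists[a], hists[b]) >= threshold:
            parent[find(a)] = find(b)
    return {label: find(label) for label in segment_colors}
```

Using union-find keeps merges transitive: if segments A and B merge and B and C merge, all three end up with the same label.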

To improve the performance of the vision algorithms, we propose a method that visually segments unknown objects by inferring object logic states. The space of object logic states is defined at the semantic level, and the logic states are deduced from the feedback of the robot's grasping actions on the objects. When the logic states of objects change, a set of predefined rules is activated and point sets are computed to re-segment the point clouds of the changed objects. The set of changed objects can also be used to update the space of object logic states.
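The state space and rules are not spelled out in this summary, so the following is only an illustrative sketch of the general idea. It assumes one hypothetical rule: after a grasp, a segment is split into the points still present in the scene and the points that were removed. The state names, the function names, and the distance tolerance are all assumptions made for this example.

```python
import numpy as np

def split_by_grasp_feedback(segment_points, scene_after_grasp, tol=0.01):
    """Split a segment into (remaining, removed) point sets after a grasp.

    A point counts as remaining if the post-grasp cloud still contains a
    point within `tol` of it (same units as the coordinates).
    """
    segment_points = np.asarray(segment_points, dtype=float)
    scene_after_grasp = np.asarray(scene_after_grasp, dtype=float)
    if len(scene_after_grasp) == 0:
        # Nothing is left in the scene, so every point was removed.
        return segment_points[:0], segment_points
    # Brute-force nearest-neighbor distances; fine for small clouds.
    diffs = segment_points[:, None, :] - scene_after_grasp[None, :, :]
    nearest = np.linalg.norm(diffs, axis=2).min(axis=1)
    remaining_mask = nearest <= tol
    return segment_points[remaining_mask], segment_points[~remaining_mask]

def infer_state(remaining, removed):
    """Map the two point sets to a hypothetical logic state label."""
    if len(removed) == 0:
        return "not_moved"
    if len(remaining) == 0:
        return "single_object_removed"
    # Part of the segment stayed behind: it likely covered several objects.
    return "under_segmented"
```

In this sketch, an "under_segmented" result would trigger re-segmentation, with the two returned point sets serving as the new object point clouds.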

For object recognition, robots often have limited knowledge of their environment and need to acquire new knowledge continuously in order to collaborate with humans. To address this issue, we have studied a method that allows a human to teach a robot new object types and attributes through natural language (NL) instructions. The segmented objects and their relations are regarded as the robot's basic knowledge. The NL instructions are converted into domain-specific representations that the robot uses to identify the target objects. The target objects, together with the object type or attribute labels referred to in the NL instructions, are collected as training samples for the robot to learn from.
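As a toy illustration of this pipeline, the sketch below parses one fixed sentence pattern into a small representation, matches it against the objects the robot already knows, and emits a training sample. The sentence pattern, the class names, and the matching rule are hypothetical simplifications. The actual system's grammar and representations are not described here, and object relations are omitted.

```python
import re
from dataclasses import dataclass

@dataclass
class SceneObject:
    object_id: int
    attributes: set  # labels the robot already knows, e.g. {"red", "box"}

@dataclass
class Instruction:
    known_attributes: set  # words that identify the target object
    new_label: str         # the type or attribute being taught

# Toy pattern: "the <known words> object is [a|an] <new label>"
_PATTERN = re.compile(r"^the (.+?) object is (?:an? )?(\w+)\.?$")

def parse_instruction(text):
    """Convert a sentence matching the toy pattern into an Instruction."""
    match = _PATTERN.match(text.strip().lower())
    if match is None:
        raise ValueError(f"unsupported instruction: {text!r}")
    return Instruction(set(match.group(1).split()), match.group(2))

def collect_training_sample(instruction, objects):
    """Return (object_id, new_label) if exactly one object matches, else None."""
    targets = [o for o in objects if instruction.known_attributes <= o.attributes]
    if len(targets) != 1:
        return None  # no match or ambiguous reference
    return targets[0].object_id, instruction.new_label
```

For example, given a red box and a blue box, "The red object is a stapler." yields a sample pairing the red box with the label "stapler", while an instruction that matches both boxes yields no sample.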

Publications

1. J. Bao, A. Song, Z. Hong, T. Shen and H. Tang. Visual Segmentation of Unknown Objects based on Inference of Object Logic States. Jiqiren (Robot), 2017, 39(4): 431-438.

2. J. Bao, Z. Hong, H. Tang, Y. Cheng, Y. Jia and N. Xi. Teach Robots Understanding New Object Types and Attributes through Natural Language Instructions. 10th International Conference on Sensing Technology (ICST), 2016.

3. J. Bao, Y. Jia, Y. Cheng and N. Xi. Saliency-Guided Detection of Unknown Objects in RGB-D Indoor Scenes. Sensors, 2015, 15(9): 21054-21074.

4. J. Bao, Y. Jia, Y. Cheng, H. Tang and N. Xi. Feedback of Robot States for Object Detection in Natural Language Controlled Robotic Systems. IEEE International Conference on Robotics and Biomimetics (ROBIO), 2015.