SMARTANNOTATOR: An Interactive Tool for Annotating Indoor RGBD Images

Yu-Shiang Wong
National Tsing Hua University
Hung-Kuo Chu
National Tsing Hua University
Niloy J. Mitra
University College London
Computer Graphics Forum (Proc. of Eurographics 2015)

Abstract

RGBD images with high-quality annotations, both in the form of geometric (i.e., segmentation) and structural (i.e., how the segments mutually relate in 3D) information, provide valuable priors for a diverse range of applications in scene understanding and image manipulation. While it is now simple to acquire RGBD images, annotating them, automatically or manually, remains challenging. We present SMARTANNOTATOR, an interactive system that facilitates annotating raw RGBD images. The system performs the tedious tasks of grouping pixels, creating potential abstracted cuboids, inferring object interactions in 3D, and generating an ordered list of hypotheses. The user simply flips through the suggested segment labels and finalizes a selection, and the system updates the remaining hypotheses. As annotations are finalized, the process becomes simpler, with fewer ambiguities left to resolve. Moreover, as more scenes are annotated, the system makes better suggestions based on the structural and geometric priors learned from previous annotation sessions. We test the system on a large number of indoor scenes across different users and experimental settings, validate the results on existing benchmark datasets, and report significant improvements over low-level annotation alternatives.
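
As a rough illustration of the interaction the abstract describes, the sketch below implements a toy version of the suggest-finalize-update loop in Python. Everything here (CO_OCCURRENCE, rerank, annotate, and the scoring rule) is a hypothetical placeholder for the learned probability models, not the paper's actual implementation.

# Toy sketch of the suggest-finalize-update loop (hypothetical names).
# In the real system these priors are learned; here they are hand-set.
CO_OCCURRENCE = {
    ("nightstand", "lamp"): 3.0,    # a lamp often sits on a nightstand
    ("nightstand", "pillow"): 0.3,  # a pillow rarely does
}

def rerank(hypotheses, finalized):
    """Re-order each unlabeled segment's candidates given finalized labels."""
    for seg, candidates in hypotheses.items():
        if seg in finalized:
            continue
        def score(item):
            label, base = item
            boost = 1.0
            for done in finalized.values():
                boost *= CO_OCCURRENCE.get((done, label), 1.0)
            return base * boost
        hypotheses[seg] = sorted(candidates, key=score, reverse=True)

def annotate(hypotheses, user_pick):
    """The user finalizes one segment at a time; the rest are re-ranked."""
    finalized = {}
    while len(finalized) < len(hypotheses):
        seg = next(s for s in hypotheses if s not in finalized)
        finalized[seg] = user_pick(seg, hypotheses[seg])
        rerank(hypotheses, finalized)
    return finalized

# Example: once the user overrides 'pillow' to 'nightstand' for seg1,
# the object on top (seg2) is re-ranked so 'lamp' outranks 'pillow'.
hyps = {
    "seg1": [("pillow", 0.6), ("nightstand", 0.4)],
    "seg2": [("pillow", 0.55), ("lamp", 0.45)],
}
picks = {"seg1": "nightstand"}
print(annotate(hyps, lambda s, c: picks.get(s, c[0][0])))
# -> {'seg1': 'nightstand', 'seg2': 'lamp'}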



Algorithm


System overview: Input to the learning phase is a small set of RGBD images with properly annotated labels and 3D structures (highlighted cuboids), from which the algorithm learns the probability models. In the annotating phase, the system (a) builds the initial 3D structure of an input RGBD image and predicts object labels using the learned models. (b-d) The user supervises the system by selecting among suggestions (e.g., re-ordering from ‘pillow’ to ‘nightstand’), while the system automatically refines the 3D structure to resolve ambiguity due to occlusion (e.g., the nightstand is refined to stand against the floor and wall) and re-predicts object labels (e.g., the object on top of the ‘nightstand’ is more likely to be a ‘lamp’ than a ‘pillow’). The process iterates until the user approves all the annotations. The final annotated image (rightmost) is used to augment the training data.
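
To make the learning phase concrete, the sketch below shows one way such probability models could be estimated from previously annotated scenes: count how often one object class rests on another and convert the counts into smoothed conditional priors. The scene format, the rests_on test, and learn_support_prior are illustrative assumptions, not the paper's actual features or models.

from collections import Counter
from itertools import product

def rests_on(upper, lower, eps=0.05):
    """True if `upper`'s bottom face is near `lower`'s top face
    and their ground-plane footprints overlap."""
    (u_bot, u_top, u_fp), (l_bot, l_top, l_fp) = upper, lower
    touches = abs(u_bot - l_top) < eps
    overlaps = (u_fp[0] < l_fp[2] and l_fp[0] < u_fp[2] and
                u_fp[1] < l_fp[3] and l_fp[1] < u_fp[3])
    return touches and overlaps

def learn_support_prior(annotated_scenes):
    """annotated_scenes: list of {label: (bottom_z, top_z, footprint)},
    with footprint = (xmin, ymin, xmax, ymax) on the ground plane."""
    pair_counts, label_counts = Counter(), Counter()
    for scene in annotated_scenes:
        objects = list(scene.items())
        for label, _ in objects:
            label_counts[label] += 1
        for (la, ca), (lb, cb) in product(objects, repeat=2):
            if la != lb and rests_on(cb, ca):
                pair_counts[(la, lb)] += 1   # lb rests on la
    # Smoothed estimate of P(lb rests on la | la present)
    return {pair: (n + 1) / (label_counts[pair[0]] + 1)
            for pair, n in pair_counts.items()}

# One annotated bedroom: a lamp on a nightstand, both above the floor.
scenes = [{
    "floor":      (0.0, 0.0, (0, 0, 5, 5)),
    "nightstand": (0.0, 0.6, (1, 1, 2, 2)),
    "lamp":       (0.6, 1.0, (1.2, 1.2, 1.8, 1.8)),
}]
print(learn_support_prior(scenes))
# -> {('floor', 'nightstand'): 1.0, ('nightstand', 'lamp'): 1.0}

Priors of this kind are what would let the annotating phase, once the user confirms a ‘nightstand’, promote ‘lamp’ over ‘pillow’ for the object resting on it.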


Acknowledgement

We are grateful to the anonymous reviewers for their comments and suggestions; all the participants of the user study for their time; and Gerardo Figueroa for the video narration. The project was supported in part by the Ministry of Science and Technology of Taiwan (102-2221-E-007-055-MY3 and 103-2221-E-007-065-MY3), the Marie Curie Career Integration Grant 303541, the ERC Starting Grant SmartGeometry (StG-2013-335373), and gifts from Adobe Research.

Bibtex

@article{wong:2015:SA,
  author  = "Yu-Shiang Wong and Hung-Kuo Chu and Niloy J. Mitra",
  title   = "SMARTANNOTATOR: An Interactive Tool for Annotating Indoor RGBD Images",
  journal = "Computer Graphics Forum (Proc. Eurographics)",
  volume  = "34",
  number  = "2",
  year    = "2015",
}

