Manhattan Room Layout Reconstruction from a Single 360° Image: A Comparative Study of State-of-the-Art Methods
Chuhang Zou, University of Illinois at Urbana-Champaign
Jheng-Wei Su, National Tsing Hua University
Chi-Han Peng, National Chiao Tung University
Alex Colburn, University of Washington
Qi Shan, Apple Inc.
Peter Wonka, King Abdullah University of Science and Technology
Hung-Kuo Chu, National Tsing Hua University
Derek Hoiem, University of Illinois at Urbana-Champaign
Abstract
Recent approaches for predicting room layouts from 360° panoramas produce excellent results. These approaches build on a common framework consisting of three steps: pre-processing by edge-based alignment, prediction of layout elements, and post-processing that fits a 3D layout to the predicted elements. Until now, it has been difficult to compare the methods because they differ in several design decisions, such as the encoding network (e.g., SegNet or ResNet), the type of elements predicted (e.g., corners, wall/floor boundaries, or semantic segmentation), and the method of fitting the 3D layout. To address this challenge, we summarize and describe the common framework, its variants, and the impact of these design decisions. For a complete evaluation, we also propose extended annotations for the Matterport3D dataset and introduce two depth-based evaluation metrics.
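To make the shared pipeline concrete, below is a minimal Python sketch of the three steps. Every function name (align_panorama, predict_layout_elements, fit_manhattan_layout) is hypothetical, and random values stand in for real network outputs; it illustrates the structure of the framework, not the implementation of any evaluated method.

import numpy as np

def align_panorama(pano: np.ndarray) -> np.ndarray:
    """Pre-processing: rotate the panorama so that wall-floor and
    wall-ceiling edges align with the image axes (edge-based alignment).
    A real implementation estimates vanishing points from detected line
    segments and warps the equirectangular image; here a no-op stands in."""
    return pano

def predict_layout_elements(pano: np.ndarray) -> dict:
    """Prediction: an encoder network (e.g., SegNet- or ResNet-based)
    outputs layout evidence such as boundary and corner probabilities.
    Random values stand in for the network outputs."""
    w = pano.shape[1]
    return {
        "boundary": np.random.rand(2, w),  # ceiling/floor boundary maps
        "corner": np.random.rand(w),       # corner probability per column
    }

def fit_manhattan_layout(elements: dict) -> np.ndarray:
    """Post-processing: fit a Manhattan-world 3D layout to the predicted
    elements. Thresholding corner probabilities to pick wall positions is
    a placeholder for the full constrained fitting step."""
    return np.flatnonzero(elements["corner"] > 0.9)

if __name__ == "__main__":
    pano = np.zeros((512, 1024, 3), dtype=np.float32)  # equirectangular RGB
    aligned = align_panorama(pano)
    elements = predict_layout_elements(aligned)
    wall_columns = fit_manhattan_layout(elements)
    print("estimated wall columns:", wall_columns)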
BibTeX
@article{zou20193d,
  title={Manhattan Room Layout Reconstruction from a Single 360$^{\circ}$ image: A Comparative Study of State-of-the-art Methods},
  author={Zou, Chuhang and Su, Jheng-Wei and Peng, Chi-Han and Colburn, Alex and Shan, Qi and Wonka, Peter and Chu, Hung-Kuo and Hoiem, Derek},
  journal={arXiv preprint arXiv:1910.04099},
  year={2019}
}