CTGAN: Semantic-guided Conditional Texture Generator for 3D Shapes

Yi-Ting Pan
National Taiwan University
Chai-Rong Lee
National Tsing Hua University
Shu-Ho Fan
National Tsing Hua University
Jheng-Wei Su
National Tsing Hua University
Jia-Bin Huang
University of Maryland College Park
Yung-Yu Chuang
National Taiwan University
Hung-Kuo Chu
National Tsing Hua University

Abstract

The entertainment industry relies on 3D visual content to create immersive experiences, but traditional methods for creating textured 3D models can be time-consuming and subjective. Generative networks such as StyleGAN have advanced image synthesis, yet generating 3D objects with high-fidelity textures remains underexplored, and existing methods offer limited control over the style and structure of the output. We propose the Semantic-guided Conditional Texture Generator (CTGAN), which produces high-quality, view-consistent textures for 3D shapes that respect shape semantics. CTGAN exploits the disentangled nature of StyleGAN's latent space to finely manipulate the input latent codes, enabling explicit control over both the style and structure of the generated textures. A coarse-to-fine encoder architecture enhances control over the structure of the resulting textures via the input segmentation. Experimental results show that CTGAN outperforms existing methods on multiple quality metrics and achieves state-of-the-art texture generation performance in both conditional and unconditional settings.
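
To make the latent-code manipulation concrete, the sketch below illustrates coarse-to-fine style mixing in a StyleGAN-like W+ space: early (coarse) layers take the structure code and late (fine) layers take the style code. The layer count, code dimension, and split point are illustrative assumptions, not the paper's exact settings.

```python
import torch

# Assumed W+ layout: a StyleGAN2 generator at 256x256 resolution has
# 14 style layers, each consuming a 512-d latent vector. These numbers
# are illustrative, not CTGAN's actual configuration.
NUM_LAYERS, W_DIM = 14, 512

def mix_codes(w_structure, w_style, split=7):
    """Take coarse (early) layers from the structure code and fine
    (late) layers from the style code."""
    assert w_structure.shape == w_style.shape == (NUM_LAYERS, W_DIM)
    return torch.cat([w_structure[:split], w_style[split:]], dim=0)

# Toy usage with random codes standing in for encoder outputs.
w_plus = mix_codes(torch.randn(NUM_LAYERS, W_DIM),
                   torch.randn(NUM_LAYERS, W_DIM))
print(w_plus.shape)  # torch.Size([14, 512])
```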


Algorithm


Given a 3D model as input, we first perform texture parameterization to generate the corresponding UV map and segmentation map. The texture generator then takes a style code as input and generates the texture map conditioned on the segmentation map. To ensure view-consistent results, we split the style code into two parts: a structure encoder encodes the segmentation map into a structure representation, and a style encoder encodes the style image into a style representation. Finally, we apply the generated texture map to the 3D model to produce the textured 3D model.
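
As a rough illustration of this pipeline, the sketch below wires toy encoders and a placeholder generator together in PyTorch. Every module here is a stand-in for the real components (CTGAN uses a StyleGAN-based generator and learned encoders); the part count, resolutions, and layer sizes are assumptions for illustration only.

```python
import torch
import torch.nn as nn

NUM_LAYERS, W_DIM, SPLIT = 14, 512, 7  # assumed W+ layout, as above

def make_encoder(in_ch):
    """Tiny conv trunk standing in for a real backbone; maps an
    image-like input to one latent code per generator layer."""
    return nn.Sequential(
        nn.Conv2d(in_ch, 64, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(128, NUM_LAYERS * W_DIM),
    )

class CTGANPipelineSketch(nn.Module):
    def __init__(self, n_parts=6):  # n_parts: hypothetical number of semantic parts
        super().__init__()
        self.structure_enc = make_encoder(n_parts)  # segmentation map -> structure codes
        self.style_enc = make_encoder(3)            # style image -> style codes
        # Placeholder generator; the real system drives a pretrained
        # StyleGAN synthesis network with the mixed W+ code.
        self.generator = nn.Sequential(
            nn.Linear(NUM_LAYERS * W_DIM, 3 * 64 * 64), nn.Tanh())

    def forward(self, seg_map, style_img):
        w_struct = self.structure_enc(seg_map).view(-1, NUM_LAYERS, W_DIM)
        w_style = self.style_enc(style_img).view(-1, NUM_LAYERS, W_DIM)
        # Coarse layers carry structure, fine layers carry style
        # (the same split idea as the mixing sketch above).
        w_plus = torch.cat([w_struct[:, :SPLIT], w_style[:, SPLIT:]], dim=1)
        texture = self.generator(w_plus.flatten(1)).view(-1, 3, 64, 64)
        return texture  # UV-space texture map, to be applied to the mesh

model = CTGANPipelineSketch()
seg = torch.randn(1, 6, 128, 128)    # toy one-hot part segmentation
style = torch.randn(1, 3, 128, 128)  # toy style image
print(model(seg, style).shape)       # torch.Size([1, 3, 64, 64])
```

Note that in this factorization the structure code depends only on the segmentation map and the style code only on the style image; this separation of inputs is what the method credits for producing view-consistent textures.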

Results


Qualitative comparison on texture generation. First row: the input style images (only for the conditional setting) and the input 3D models. Bottom three rows: the 3D textured models generated by each method from the inputs in the first row. Our method produces texture maps that are more faithful to the style images and more view-consistent.

Links