June 1, 2022

CLIP-Forge: Towards Zero-Shot Text-to-Shape Generation

Abstract

Generating shapes using natural language can enable new ways of imagining and creating the things around us. While significant recent progress has been made in text-to-image generation, text-to-shape generation remains a challenging problem due to the unavailability of paired text and shape data at a large scale. We present a simple yet effective method for zero-shot text-to-shape generation that circumvents such data scarcity. Our proposed method, named CLIP-Forge, is based on a two-stage training process, which only depends on an unlabelled shape dataset and a pre-trained image-text network such as CLIP. Our method has the benefits of avoiding expensive inference time optimization, as well as the ability to generate multiple shapes for a given text. We not only demonstrate promising zero-shot generalization of the CLIP-Forge model qualitatively and quantitatively, but also provide extensive comparative evaluations to better understand its behavior.

Cite This Study

Sanghi et al. (2022) studied this question.

synapsesocial.com/papers/6a09e9eeb0d552aa8b45f47d https://doi.org/10.1109/cvpr52688.2022.01805
