March 3, 2026 · Open Access

Vi-SketchGPT: A Novel Multi-Scale and Context-Aware Representation for Sketch Generation and Classification

Key Points

The model classifies sketches with high accuracy and generates outputs that preserve structural coherence and part relationships.
Reported classification accuracy reached 95% on the QuickDraw and TU-Berlin datasets, highlighting the model's efficient use of the scarce information in a sketch.
The method uses Signed Distance Functions (SDF) and quadtree decomposition to encode spatial relationships, and a hierarchical Transformer to capture multi-scale dependencies.
These findings may enable broader applications in computer vision and automated sketch generation systems.

Abstract

Human sketches exhibit substantial variability across individuals in terms of line style, abstraction level and drawing conventions. Unlike realistic images, they provide limited contextual information and rely on highly simplified concept representations. Recognizing and generating sketches therefore requires efficient use of the available information, identification of the most informative local features, interpretation of their meaning within a minimal context, and understanding of the spatial relationships that define the overall structure. In this study, we introduce ViSketch-GPT, a representation and model that can extract these local features, contextualize them within the sketch and encode spatial relationships, thereby enabling a deeper understanding of the sketch structure. Guided by the intuition of the void as information, we leverage Signed Distance Functions (SDF) to reveal this potentially hidden information, organizing it via quadtree decomposition and processing it with a hierarchical Transformer to capture multi-scale dependencies. This structured representation allows the model to support both high-fidelity generation and accurate classification. Experiments on the QuickDraw and TU-Berlin datasets demonstrated that the model classifies sketches with high accuracy while generating outputs that preserve structural coherence, respect part relationships, and capture essential conceptual patterns despite the scarcity of information in the original sketches.
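
To make the two representation steps named in the abstract more concrete, the following is a minimal sketch of a signed distance field and a quadtree decomposition over a tiny binary raster. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the sign convention, the distance metric, or the splitting criterion, so the choices here (negative inside strokes, Euclidean distance, split until a region is uniform) are assumptions, and the names `signed_distance_field`, `quadtree_leaves`, and `sketch` are hypothetical.

```python
import math


def signed_distance_field(grid):
    """Brute-force SDF of a binary raster (1 = stroke, 0 = empty).

    Assumed convention: empty cells get +distance to the nearest stroke cell,
    stroke cells get -distance to the nearest empty cell. The cost is
    quadratic in the number of cells, so this is only for small demo grids.
    """
    h, w = len(grid), len(grid[0])
    cells = [(r, c) for r in range(h) for c in range(w)]
    stroke = [p for p in cells if grid[p[0]][p[1]] == 1]
    empty = [p for p in cells if grid[p[0]][p[1]] == 0]
    sdf = [[0.0] * w for _ in range(h)]
    for r, c in cells:
        inside = grid[r][c] == 1
        targets = empty if inside else stroke
        if not targets:
            # Degenerate raster: all stroke or all empty.
            sdf[r][c] = -math.inf if inside else math.inf
            continue
        d = min(math.hypot(r - tr, c - tc) for tr, tc in targets)
        sdf[r][c] = -d if inside else d
    return sdf


def quadtree_leaves(grid, row=0, col=0, size=None, min_size=1):
    """Split a square region (side a power of two) until each leaf is uniform.

    Returns a list of (row, col, size, has_stroke) tuples. The uniformity
    criterion is an assumption made for this illustration.
    """
    if size is None:
        size = len(grid)
    values = {grid[r][c]
              for r in range(row, row + size)
              for c in range(col, col + size)}
    if len(values) == 1 or size <= min_size:
        return [(row, col, size, 1 in values)]
    half = size // 2
    leaves = []
    for dr in (0, half):
        for dc in (0, half):
            leaves += quadtree_leaves(grid, row + dr, col + dc, half, min_size)
    return leaves


# Hypothetical 4x4 sketch: a small corner-shaped stroke in the lower right.
sketch = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]

sdf = signed_distance_field(sketch)
leaves = quadtree_leaves(sketch)
```

In this toy example the empty cells carry nonzero distance values, which is one way to read the abstract's "void as information", and the quadtree keeps the three empty quadrants as single coarse leaves while refining only the quadrant that contains the stroke. For realistic image sizes, a library routine such as SciPy's `scipy.ndimage.distance_transform_edt` would be a more practical way to compute the distances.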

Authors

Giulio Federico

Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo"

Giuseppe Amato

University of Salerno

Fabio Carrara

Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo"

Journals

SHILAP Revista de lepidopterología

IEEE Access

Institutions

Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo"
