Automatic categorization of fine art paintings across multiple semantic facets, such as artist, style, and genre, is fundamental to large-scale digital archiving, semantic indexing, and knowledge organization of cultural heritage collections. In this paper, we propose CTHArt, a convolutional neural network (CNN)-Transformer hybrid attention network for multitask art painting categorization. The model employs a dual-branch hybrid backbone that combines a CNN stream for fine-grained local texture modeling with a Transformer stream for learning global compositional and stylistic context. To further exploit inter-facet semantic dependencies, we introduce a Cross-Task Attention Head, which enables task-specific classifiers to exchange information through learnable cross-attention interactions. This design supports coordinated facet prediction consistent with knowledge organization principles. Experiments on three benchmark datasets show that CTHArt consistently achieves state-of-the-art performance. The proposed approach provides an effective and scalable solution for artificial intelligence (AI)-assisted knowledge organization of art collections.
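To make the cross-task interaction concrete, the following is a minimal, framework-free sketch of how per-facet representations could exchange information through attention. It is not the authors' implementation: the abstract does not specify the fusion rule, the attention form, or any dimensions. The sketch assumes that the two branch features are fused by concatenation, that each facet gets its own projection of the fused feature, and that the facets interact through single-head scaled dot-product attention with a residual connection. All names (`fuse_branches`, `cross_task_attention`, `TASKS`) are hypothetical, and random matrices stand in for learned parameters.

```python
import math
import random

TASKS = ("artist", "style", "genre")  # the facets named in the abstract


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, x) for row in W]


def random_matrix(rows, cols, rng):
    """Stand-in for a learned weight matrix (assumption: random init)."""
    scale = 1.0 / math.sqrt(cols)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def fuse_branches(cnn_feat, transformer_feat):
    """Assumed fusion rule: concatenate local (CNN) and global (Transformer) features."""
    return list(cnn_feat) + list(transformer_feat)


def cross_task_attention(task_tokens, Wq, Wk, Wv):
    """Let every task token attend to all task tokens (itself included).

    Returns the updated tokens and the attention weights per task.
    """
    names = list(task_tokens)
    q = {t: matvec(Wq, task_tokens[t]) for t in names}
    k = {t: matvec(Wk, task_tokens[t]) for t in names}
    v = {t: matvec(Wv, task_tokens[t]) for t in names}
    d = len(Wq)  # query/key dimension
    updated, weights = {}, {}
    for t in names:
        scores = [dot(q[t], k[s]) / math.sqrt(d) for s in names]
        w = softmax(scores)
        mixed = [sum(w[j] * v[s][i] for j, s in enumerate(names)) for i in range(d)]
        # Residual connection: keep the task's own evidence, add shared context.
        updated[t] = [a + b for a, b in zip(task_tokens[t], mixed)]
        weights[t] = dict(zip(names, w))
    return updated, weights


rng = random.Random(0)
cnn_feat = [rng.gauss(0, 1) for _ in range(4)]          # toy local-texture feature
transformer_feat = [rng.gauss(0, 1) for _ in range(4)]  # toy global-context feature
fused = fuse_branches(cnn_feat, transformer_feat)

dim = 6  # toy token size
task_proj = {t: random_matrix(dim, len(fused), rng) for t in TASKS}
tokens = {t: matvec(task_proj[t], fused) for t in TASKS}

Wq, Wk, Wv = (random_matrix(dim, dim, rng) for _ in range(3))
updated, weights = cross_task_attention(tokens, Wq, Wk, Wv)

for t in TASKS:
    print(t, {s: round(w, 3) for s, w in weights[t].items()})
```

In a full model, each updated token would feed a facet-specific classifier, so that, for example, the style prediction can draw on evidence gathered for the artist facet. The attention weights printed at the end show how much each facet draws on the others in this toy setting.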
Liangyu Wei
Qianheng Li
Dandan Wang
KNOWLEDGE ORGANIZATION
Shandong Normal University
Wei et al. (Thu,) studied this question.
www.synapsesocial.com/papers/69fa8e3804f884e66b530916 — DOI: https://doi.org/10.31083/ko48360