This study applies deep learning to classify blue-and-white porcelain images by dynastic period, interprets the resulting models, and examines the role of multiple object views. A multi-view dataset of Ming and Qing porcelain, photographed from different angles, was curated, comprising 963 images of 284 objects. Among the models evaluated, a pretrained and fine-tuned ResNet-50 (ImageNet1K-V2) achieved the best performance. Incorporating multiple views of the same object during training improved classification performance compared with using a single view. Visual explanation techniques (Grad-CAM, Ablation-CAM, Score-CAM) highlighted shape and motifs as characteristic features. t-SNE visualisation showed that the learned features cluster by dynasty at the global scale and by object at the local scale.
Yapp et al. (Tue,) studied this question.
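
Because each object contributes several views, one practical concern with a dataset of this kind is keeping all views of an object on the same side of a train/test split. The abstract does not state how the data were partitioned, so the following is only a minimal sketch under that assumption, not the authors' procedure. The function name `split_by_object`, the `image_to_object` mapping, and the 20% test fraction are hypothetical and chosen for illustration.

```python
import random
from collections import defaultdict


def split_by_object(image_to_object, test_fraction=0.2, seed=0):
    """Split image ids into train/test lists so that all views of an
    object land in the same partition (hypothetical helper, not from
    the paper)."""
    # Group image ids by the object they depict.
    views = defaultdict(list)
    for image_id, object_id in image_to_object.items():
        views[object_id].append(image_id)

    # Shuffle objects (not images) reproducibly.
    object_ids = sorted(views)
    random.Random(seed).shuffle(object_ids)

    # Hold out a fraction of the objects, at least one.
    n_test = max(1, round(len(object_ids) * test_fraction))
    test_objects = object_ids[:n_test]
    train_objects = object_ids[n_test:]

    train = [img for obj in train_objects for img in views[obj]]
    test = [img for obj in test_objects for img in views[obj]]
    return train, test
```

Splitting at the object level rather than the image level means a test image is never a second view of an object already seen in training, which would otherwise be expected to inflate reported accuracy.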