What type of study is this?

This is a quantitative study.

September 30, 2025

Open Access

Comparing CNN and ViT for Open-Set Face Recognition

Key Points

ViT achieved the highest precision in identifying unknown individuals, outperforming the CNN baselines.
The study compared several CNN models with a ViT model, all pre-trained on the same large dataset and evaluated in an open-set recognition scenario.
Open-set recognition is challenging because the system must detect unknown faces that were not part of its training data.
Choosing the right model for face identification is critical, as demonstrated by the performance differences between ViT and CNN.

Abstract

Interest in automated biometric identification applications is growing. These applications need a system that accurately identifies a specific group of people while also detecting individuals who do not belong to that group. In face identification models that use Deep Learning (DL) techniques, this setting is referred to as Open-Set Recognition (OSR), which is the focus of this work. OSR is a substantial challenge because the system must effectively identify unknown individuals who were not part of its training data. Because accuracy is crucial for such systems, selecting the model to use in each scenario becomes key. Here, we present the results of a rigorous comparative analysis of the precision of some of the most widely used face identification models today, specifically some Convolutional Neural Network (CNN) models compared with a Vision Transformer (ViT) model. All models were pre-trained on the same large dataset and evaluated in an OSR scenario. The results show that ViT achieves the highest precision, outperforming the CNN baselines and demonstrating better generalization to unknown identities. These findings support recent evidence that ViT is a promising alternative to CNN for this type of application.
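
To make the open-set setting concrete, here is a minimal sketch of one common way to identify enrolled people while rejecting unknown ones: compare a probe face embedding against a gallery of enrolled embeddings and answer "unknown" when the best match is below a similarity threshold. This is an illustration only, not the method used in the paper. The function names, the toy three-dimensional embeddings, and the threshold of 0.5 are all assumptions; in practice the embeddings would come from a pre-trained CNN or ViT backbone.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def identify_open_set(probe, gallery, threshold=0.5):
    """Return the best-matching enrolled identity, or "unknown".

    `gallery` maps identity names to enrolled embeddings. The threshold
    is a hypothetical value; a real system would tune it on validation data.
    """
    best_name, best_score = "unknown", -1.0
    for name, embedding in gallery.items():
        score = cosine_similarity(probe, embedding)
        if score > best_score:
            best_name, best_score = name, score
    # Reject the match when even the closest enrolled identity is too dissimilar.
    return best_name if best_score >= threshold else "unknown"


# Toy embeddings (assumed for illustration; real ones have hundreds of dimensions).
gallery = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
known_probe = [0.9, 0.1, 0.0]    # close to the enrolled "alice" embedding
unknown_probe = [0.0, 0.0, 1.0]  # orthogonal to every enrolled embedding

print(identify_open_set(known_probe, gallery))
print(identify_open_set(unknown_probe, gallery))
```

In this sketch the threshold controls the trade-off: raising it rejects more unknown faces but also risks rejecting enrolled people whose probe images differ from their gallery images.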
