Facial Emotion Recognition (FER) has emerged as an important research area in artificial intelligence and computer vision due to its wide applications in human–computer interaction, healthcare, surveillance, and affective computing. Recent advancements in deep learning, particularly Convolutional Neural Networks (CNNs), have significantly improved the ability of FER systems to automatically learn discriminative facial features from image data. This review paper presents a comparative study of CNN-based architectures, with a primary focus on ResNet-18 and MobileNetV2 for FER applications. The study examines their architectural design, computational efficiency, and recognition performance across widely used benchmark datasets such as FER2013, CK+, RAF-DB, AffectNet, and JAFFE. In addition, commonly used evaluation metrics, including accuracy, precision, recall, and F1-score, are discussed. The paper highlights key challenges affecting FER systems, including class imbalance, demographic bias, illumination variation, occlusion, and limited generalization across datasets. Furthermore, emerging research directions such as multimodal learning, domain adaptation, synthetic data generation, explainable AI, and personalized FER models are explored. The review concludes that while ResNet-18 provides strong recognition performance, MobileNetV2 offers a more efficient solution for real-time and resource-constrained applications.
Ghatore et al. studied this question.