Abstract Facial expressions are key indicators of a person's emotional state, yet recognizing emotion from real-world facial expressions remains challenging. According to the literature, current automatic emotion recognition (AER) systems depend on the facial expression images they were trained on, and the patterns they learn fail to generalize to novel, unseen facial expressions with occlusion and pose variations. This limited generalizability hinders the practical use of AER systems in real-world applications. This study proposes the Feature Semantic Similarity Comparison Network (FSSC-Net), a deep learning classifier based on similarity-metric learning and designed to address this generalization problem. Given a novel, unseen facial expression, FSSC-Net semantically compares its features with the true features of all facial expressions stored in the emotion database and assigns the input the class of the maximum semantic similarity match (MSSM). The MSSM is computed by FSSL-Net, which is trained on triplet facial expression data using the FSSM-Loss function so that intra-class facial expressions lie close together in feature space while inter-class facial expressions lie farther apart. Experimental results on the AffectNet and RAF-DB datasets demonstrate that our system generalizes effectively in many challenging real-world facial expression recognition applications.
Bhati et al. (Thu,) studied this question.
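The matching step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which similarity measure or loss formulation is used, so cosine similarity and a standard triplet margin loss are assumed here as stand-ins for the MSSM computation and the FSSM-Loss. The function names (`cosine_similarity`, `mssm_classify`, `triplet_margin_loss`), the margin value, and the toy two-dimensional features are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two feature vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def mssm_classify(query, database):
    """Return the label and score of the most similar stored feature vector.

    `database` is a list of (features, label) pairs standing in for the
    emotion database mentioned in the abstract.
    """
    best_label, best_sim = None, -math.inf
    for features, label in database:
        sim = cosine_similarity(query, features)
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label, best_sim


def triplet_margin_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet margin loss on squared Euclidean distances.

    Used here only as a stand-in for the FSSM-Loss: it is zero once the
    anchor is closer to the positive than to the negative by `margin`.
    """
    d_pos = sum((a - p) ** 2 for a, p in zip(anchor, positive))
    d_neg = sum((a - n) ** 2 for a, n in zip(anchor, negative))
    return max(d_pos - d_neg + margin, 0.0)


# Toy usage with hypothetical 2-D features.
emotion_db = [([1.0, 0.0], "happy"), ([0.0, 1.0], "sad")]
label, score = mssm_classify([0.9, 0.1], emotion_db)
```

In this sketch the query is assigned the class of whichever stored vector scores highest, which mirrors the maximum-match rule in the abstract; a learned embedding network would supply the feature vectors in practice.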