This study investigates deep learning-based facial expression recognition (FER) under mask occlusion, a scenario increasingly common in real-world settings. To counter the drop in recognition performance caused by facial masks, we generated a synthetic masked version of the RAF-DB dataset and trained multiple ResNet-18-based models on it. Several regularization techniques (label smoothing, dropout with p=0.5, early stopping, and class weighting) were applied to improve generalization and address class imbalance. The best-performing model, Refined-35E, achieved a macro F1-score of 0.6580, a weighted F1-score of 0.7593, and an accuracy of approximately 76% on the test set. Additional experiments that varied training duration and ablated individual regularization methods revealed trade-offs between overfitting and minority-class performance. The results demonstrate that lightweight CNNs, when coupled with targeted training strategies, can mitigate the impact of mask occlusion. Future work will focus on attention-based mechanisms and on real-world masked datasets to further improve performance.
Liu et al. studied this question.
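The gap between the macro and weighted F1-scores reported above is what one would expect under class imbalance: macro F1 averages per-class scores equally, while weighted F1 scales each class by its support, so weak minority classes pull the macro figure down more. The following is a minimal sketch of the two averages in plain Python. The function names and the toy labels are hypothetical and chosen only for illustration; they are not the study's evaluation code or data.

```python
from collections import Counter


def per_class_f1(y_true, y_pred, label):
    """F1 for one class, computed as 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_and_weighted_f1(y_true, y_pred):
    """Return (macro F1, weighted F1) over the classes present in y_true."""
    support = Counter(y_true)  # number of true samples per class
    labels = sorted(support)
    scores = {c: per_class_f1(y_true, y_pred, c) for c in labels}
    # Macro: every class counts equally, regardless of its size.
    macro = sum(scores.values()) / len(labels)
    # Weighted: each class contributes in proportion to its support.
    weighted = sum(scores[c] * support[c] for c in labels) / len(y_true)
    return macro, weighted


# Toy, imbalanced example (illustrative labels, not RAF-DB results):
# a majority class predicted perfectly and a minority class half missed.
y_true = ["happy"] * 8 + ["fear"] * 2
y_pred = ["happy"] * 8 + ["happy", "fear"]

macro, weighted = macro_and_weighted_f1(y_true, y_pred)
print(f"macro F1 = {macro:.4f}, weighted F1 = {weighted:.4f}")
```

In this toy case the minority class has the lower per-class F1, so the macro average falls below the weighted one, the same direction as the 0.6580 versus 0.7593 gap in the reported results.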