Facial expressions are a vital channel of non-verbal communication, conveying rich information about emotional and cognitive states in social interactions. Facial Expression Recognition (FER), which enables intelligent systems to recognize these expressions automatically, is therefore a key task in affective computing. Despite significant progress, existing FER methods still struggle to recognize facial expressions effectively and efficiently under real-world conditions, such as driving environments and dynamic, in-the-wild scenarios. This thesis addresses these challenges by proposing several novel deep learning-based models that improve the performance, robustness, and efficiency of FER systems. The work begins with a comprehensive review of FER, highlighting persistent challenges in the field, particularly within the driving context. Subsequently, DFER-GCViT, a Vision Transformer-based architecture tailored for driver FER, is introduced to improve recognition accuracy under occlusion, pose variation, and lighting changes. To enhance computational efficiency, ShuffViT-DFER is proposed, a lightweight hybrid model that combines pretrained convolutional and transformer-based networks. Furthermore, multimodal approaches are explored using Vision-Language Models (VLMs), particularly CLIP: CLIVP-FER is developed to integrate visual and textual features, enhancing semantic understanding in the driving context. The research then shifts toward general dynamic FER with the proposed PE-CLIP, which applies parameter-efficient fine-tuning and temporal modeling to CLIP and maintains strong performance at reduced computational cost. Extensive experiments on benchmark datasets demonstrate the effectiveness of the proposed models across diverse and challenging conditions.
The results underscore the importance of optimizing architectures, incorporating multimodal cues, and enabling lightweight FER systems suitable for practical deployment.
Ibtissam Saadi studied this question.