ABSTRACT Driver facial expression recognition (DFER) plays a critical role in intelligent vehicle systems and driving safety. However, reliable DFER under real‐world conditions remains challenging because of occlusion, head pose variation, and inconsistent lighting. This paper proposes MFII‐Net, a robust and efficient DFER framework that combines multi‐scale feature interaction with illumination balancing to improve expression recognition in unconstrained driving environments. The architecture consists of three key modules: a multi‐scale feature interaction module (MSFIM) that uses dual‐branch separable convolutions to extract fine‐grained and global facial features; a contrast‐boosted channel attention (CBCA) module that enhances feature discrimination by comparing global and local channel distributions; and an illumination balance module (IBM) that integrates wavelet decomposition, gamma correction, bilateral filtering, and multi‐scale CLAHE to reduce lighting‐induced variability. Experiments on the RAF‐DB, AffectNet, and KMU‐FED datasets, including the occlusion and pose‐specific subsets of RAF‐DB and AffectNet, show that MFII‐Net outperforms existing state‐of‐the‐art FER models in accuracy and robustness while remaining computationally efficient enough for real‐time driver monitoring.
Yanzhao et al. (Thu,) studied this question.
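
The abstract names gamma correction as one of the steps in the illumination balance module but does not say how the exponent is chosen. The following is a minimal sketch of one common heuristic, not the authors' implementation: it picks the exponent so that the mean intensity of a grayscale frame is pulled toward a mid-gray target. The function name `auto_gamma`, the `target_mean` parameter, and the mean-based rule itself are assumptions made for illustration; the wavelet, bilateral filtering, and CLAHE steps are not covered here.

```python
import numpy as np


def auto_gamma(gray, target_mean=0.5, eps=1e-6):
    """Gamma-correct a uint8 grayscale image (hypothetical helper).

    The exponent is chosen so that the current mean intensity would map
    to `target_mean`; this rule is an assumption, not taken from the paper.
    Returns the corrected uint8 image and the exponent that was used.
    """
    # Scale to [0, 1] so the power law is well defined.
    x = gray.astype(np.float64) / 255.0

    # Keep the mean away from 0 and 1 so the logarithm stays finite and non-zero.
    mean = float(np.clip(x.mean(), eps, 1.0 - eps))

    # Solve mean ** gamma == target_mean for gamma.
    gamma = np.log(target_mean) / np.log(mean)

    # gamma < 1 brightens dark frames, gamma > 1 darkens bright ones.
    y = np.power(x, gamma)
    return np.round(y * 255.0).astype(np.uint8), float(gamma)


if __name__ == "__main__":
    # A synthetic under-exposed frame (uniform intensity 51, i.e. 0.2 in [0, 1]).
    dark = np.full((4, 4), 51, dtype=np.uint8)
    corrected, gamma = auto_gamma(dark)
    print("gamma:", gamma)
    print("mean before:", dark.mean(), "mean after:", corrected.mean())
```

Because the mapping is a monotonic power law, it changes overall brightness without reordering pixel intensities, which is why a step of this kind is typically paired with a local contrast method such as CLAHE rather than used on its own.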