Deploying Facial Expression Recognition (FER) systems in resource-constrained edge computing environments is hindered by their high computational demands. This paper introduces the Lightweight Attentional Convolutional Capsule Network (LAC-Capsnet), designed to achieve competitive classification performance with a substantially reduced number of trainable parameters. Its novelty lies in combining attention mechanisms (a Spatial Transformer Network (STN) for geometric invariance and a Squeeze-and-Excitation (SE) block for channel recalibration) with efficient microarchitectures (Inception modules and depthwise separable convolutions). This convolutional backbone is integrated with a capsule network, whose dynamic routing provides a more refined model of spatial hierarchies. In experimental validation, LAC-Capsnet achieves a classification accuracy of 97.15% on CK+ and 66.70% on FER2013 while requiring only 0.97 million parameters. These results indicate that LAC-Capsnet is a lightweight and computationally efficient FER solution.
Khanbebin et al. studied this question.
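To illustrate why depthwise separable convolutions and SE blocks suit a small parameter budget, the following minimal sketch counts trainable parameters for a single layer. The layer sizes (64 input channels, 128 output channels, a 3x3 kernel, an SE reduction ratio of 16) and the function names are hypothetical choices for illustration; they are not taken from the LAC-Capsnet architecture, and the sketch is not the authors' implementation.

```python
# Parameter-count sketch. All layer sizes below are illustrative assumptions,
# not values reported for LAC-Capsnet.

def conv_params(c_in, c_out, k, bias=False):
    """Trainable parameters of a standard k x k convolution."""
    return k * k * c_in * c_out + (c_out if bias else 0)


def depthwise_separable_params(c_in, c_out, k, bias=False):
    """Depthwise k x k convolution (one filter per input channel)
    followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


def se_block_params(channels, reduction=16, bias=True):
    """Two fully connected layers of a Squeeze-and-Excitation block:
    channels -> channels // reduction -> channels."""
    hidden = channels // reduction
    squeeze = channels * hidden + (hidden if bias else 0)
    excite = hidden * channels + (channels if bias else 0)
    return squeeze + excite


if __name__ == "__main__":
    standard = conv_params(64, 128, 3)                   # 73728
    separable = depthwise_separable_params(64, 128, 3)   # 8768
    se = se_block_params(128)                            # 2184
    print(f"standard conv:       {standard}")
    print(f"depthwise separable: {separable}")
    print(f"SE block overhead:   {se}")
    print(f"reduction factor:    {standard / separable:.2f}x")
```

Under these assumed sizes, the separable layer uses roughly one eighth of the parameters of the standard convolution, and the SE block adds only a small overhead on top. The same arithmetic applies to any channel and kernel sizes.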