Most previous work focuses on learning discriminative appearance features over the whole face, without considering that each facial expression is physically composed of a set of related action units (AUs). However, the Facial Action Coding System (FACS) defines AUs only through ambiguous semantic descriptions, which makes accurate AU detection very difficult. In this paper, we adopt a compromise that avoids explicit AU detection and instead interprets facial expressions by learning compositional appearance features around AU areas. We first divide the face image into local patches according to the locations of the AUs, and then extract local appearance features from each patch. A minimum-error-based optimization strategy is adopted to build compositional features from these local appearance features, and this process is embedded into a Boosting learning framework. Experiments on the Cohn-Kanade database show that the proposed method achieves promising performance and that the learned compositional features are largely consistent with FACS.
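The pipeline described above (patches around AU locations, local features per patch, compositional features chosen by minimum error inside Boosting) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the appearance descriptor, how local features are combined, or the Boosting variant. The sketch assumes mean patch intensity as a stand-in feature, threshold stumps joined by logical AND as the "compositional feature", a greedy search that minimizes weighted error, and discrete AdaBoost for binary labels in {-1, +1}. All function names and parameters (`extract_patches`, `build_compositional`, `max_parts`, and so on) are hypothetical.

```python
import numpy as np

def extract_patches(image, centers, size=8):
    """Crop a square patch around each (row, col) center.
    The centers stand in for AU locations and are assumed to be given."""
    half = size // 2
    return [image[r - half:r + half, c - half:c + half] for r, c in centers]

def patch_features(patches):
    """Toy local appearance feature: mean intensity per patch (an assumption)."""
    return np.array([p.mean() for p in patches])

def stump_predict(X, stump):
    """Threshold test on one patch feature; returns +1 or -1 per sample."""
    j, thr, sign = stump
    return np.where(sign * (X[:, j] - thr) > 0, 1, -1)

def compose_predict(X, stumps):
    """Compositional feature: +1 only if every member stump fires (logical AND)."""
    votes = np.stack([stump_predict(X, s) for s in stumps])
    return np.where((votes == 1).all(axis=0), 1, -1)

def weighted_error(pred, y, w):
    return float(np.sum(w[pred != y]))

def build_compositional(X, y, w, max_parts=2):
    """Greedily add the stump that most reduces the weighted error;
    stop when no candidate improves on the current composition."""
    chosen, best_err = [], np.inf
    for _ in range(max_parts):
        cand, cand_err = None, best_err
        for j in range(X.shape[1]):
            for thr in np.unique(X[:, j]):
                for sign in (1, -1):
                    trial = chosen + [(j, thr, sign)]
                    err = weighted_error(compose_predict(X, trial), y, w)
                    if err < cand_err:
                        cand, cand_err = trial, err
        if cand is None:
            break
        chosen, best_err = cand, cand_err
    return chosen, best_err

def boost(X, y, rounds=5, max_parts=2):
    """Discrete AdaBoost whose weak learners are compositional features."""
    w = np.full(len(y), 1.0 / len(y))
    model = []
    for _ in range(rounds):
        stumps, err = build_compositional(X, y, w, max_parts)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep alpha finite
        alpha = 0.5 * np.log((1 - err) / err)
        w = w * np.exp(-alpha * y * compose_predict(X, stumps))
        w /= w.sum()
        model.append((alpha, stumps))
    return model

def predict(model, X):
    score = sum(a * compose_predict(X, s) for a, s in model)
    return np.where(score >= 0, 1, -1)

if __name__ == "__main__":
    # Toy data: the label is positive only when both patch features are high.
    X = np.array([[0.2, 0.2], [0.2, 0.8], [0.8, 0.2], [0.8, 0.8]])
    y = np.array([-1, -1, -1, 1])
    model = boost(X, y, rounds=3)
    print(model[0][1], predict(model, X))
```

In the toy data no single patch feature separates the classes, while a conjunction of two does, which is the kind of case where composing local features is useful. A multi-class expression recognizer would need a multi-class Boosting scheme or one-vs-rest classifiers, which this sketch leaves out.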
Yang et al. (Tue,) studied this question.