Facial landmarks provide essential representations of facial states and movements and serve as the foundation for numerous face-related tasks. However, traditional facial landmark detection (FLD) solutions based on optical devices perform poorly in low-light conditions, are sensitive to occlusion, and raise privacy concerns. In this paper, we propose an efficient two-stage facial landmark detection system, CF-FLD, which uses millimeter-wave (mmWave) radar signals to reconstruct human faces. Specifically, CF-FLD is composed of a coarse-grained affine transformation (CAT) stage and a fine-grained offset transformation (FOT) stage. To characterize large-scale, rigid facial movements caused by head poses and joint motions, CAT defines sparse and representative triangle constraints within and across different facial parts for affine transformation. Building on the CAT results, FOT progressively estimates the offsets caused by subtle, non-rigid facial deformations. Instead of resource-intensive area detection or search, FOT adopts a multi-level region partition strategy, in which a region-wise hybrid network and region-aware attention hierarchically refine the facial landmarks. Comprehensive evaluations on data collected from 20 participants in real-world environments demonstrate that CF-FLD can accurately localize facial landmarks during facial motion, achieving a mean absolute error (MAE) of 2.02 mm and a normalized mean error (NME) of 3.37% at a low cost.
Sheng et al. studied this question.
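To make the abstract's terms concrete, the following is a minimal sketch of two ideas it mentions: an affine transformation determined by a triangle of landmarks, and the MAE and NME metrics. It is not the authors' implementation. The function names, the use of 2D coordinates, the reading of MAE as mean Euclidean distance per landmark, and the choice of normalizing distance for NME are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

def affine_from_triangle(src_tri, dst_tri):
    """Return the 2x3 affine matrix mapping three source vertices onto
    three destination vertices (hypothetical helper, 2D assumed)."""
    src = np.asarray(src_tri, dtype=float)          # shape (3, 2)
    dst = np.asarray(dst_tri, dtype=float)          # shape (3, 2)
    src_h = np.hstack([src, np.ones((3, 1))])       # homogeneous coords, (3, 3)
    # Solve src_h @ A.T = dst; requires a non-degenerate triangle.
    return np.linalg.solve(src_h, dst).T            # shape (2, 3)

def apply_affine(A, points):
    """Apply a 2x3 affine matrix to an (N, 2) array of landmarks."""
    pts = np.asarray(points, dtype=float)
    return pts @ A[:, :2].T + A[:, 2]

def mean_absolute_error(pred, truth):
    """Assumed definition: mean Euclidean distance per landmark."""
    diff = np.asarray(pred, dtype=float) - np.asarray(truth, dtype=float)
    return float(np.mean(np.linalg.norm(diff, axis=-1)))

def normalized_mean_error(pred, truth, norm_dist):
    """Assumed definition: MAE divided by a reference distance
    (for example an inter-ocular distance chosen by the evaluator)."""
    return mean_absolute_error(pred, truth) / norm_dist

# Usage: a triangle that is scaled by 2 and shifted by (2, 3).
A = affine_from_triangle([(0, 0), (1, 0), (0, 1)],
                         [(2, 3), (4, 3), (2, 5)])
moved = apply_affine(A, [(1, 1)])
```

In this sketch a single triangle fixes all six affine parameters exactly. With several triangle constraints, as the abstract describes, a least-squares fit over all vertices would be one plausible alternative.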