Driving facial expressions is critical for digital animation, yet resource-constrained systems still rely heavily on linear blendshape models. Existing methods struggle with feature redundancy in high-dimensional landmark prediction and with numerical instability in extreme expression regions. To address these issues, we propose Attention-Gate ContextNet (AtG-ContextNet), a regression framework that integrates a temporal attention mechanism with a Hybrid Gating System. AtG-ContextNet uses a region-aware autoencoder to encode 468 landmarks into a 32-dimensional latent vector, followed by an eight-head attention mechanism that captures long-range dependencies across 12-frame sequences. The Hybrid Gating System dynamically fuses multiplicative scaling and additive shifting via a learnable coefficient, while first-order derivative compensation ensures temporal coherence. Additionally, a clamping operator and a saturation-resistant loss are employed to stabilize boundary regions. After fine-tuning on each target subset (C1–C3) of the 300-VW dataset, AtG-ContextNet achieved R² values of up to 0.935 and a PSNR exceeding 26 dB, outperforming baselines such as Transformer Utilizing Cross-Dimension Dependency for Multivariate Time Series Forecasting (Crossformer), Sparsely-Gated Mixture-of-Experts (MoE), Topologically Consistent Reweighting for XGBoost (TCR-XGBoost), and DLinear across all training subsets. On the training set, AtG-ContextNet achieved a Durbin-Watson (DW) statistic of 1.97, indicating negligible residual autocorrelation and confirming unbiased estimation. However, subset-specific evaluations revealed degraded DW values (1.0–1.2) across C1–C3, suggesting persistent positive autocorrelation in challenging scenarios. While ablation studies validate the architectural components, limitations in gate activation granularity and sparsity control remain to be addressed. Visual contrast validation provides additional intuitive evidence of the model's stability in expression parameter mapping. Future work will focus on optimizing sparse representations and incorporating biomechanical constraints to further enhance physical plausibility and performance.
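For readers unfamiliar with the DW statistic cited above, the following is a minimal sketch of how it is computed from a residual series, using the standard Durbin-Watson definition (sum of squared successive differences divided by the sum of squared residuals). The function names `durbin_watson` and `ar1_residuals`, the synthetic residuals, and the autocorrelation value 0.45 are illustrative assumptions and are not taken from the paper; the value 0.45 is chosen only because DW is approximately 2(1 - rho) for lag-1 autocorrelation rho, which places the result near the 1.0–1.2 range reported for C1–C3.

```python
import numpy as np


def durbin_watson(residuals):
    """Standard Durbin-Watson statistic for a 1-D residual series.

    Values near 2 suggest no lag-1 autocorrelation; values well below 2
    suggest positive autocorrelation; values above 2 suggest negative.
    """
    e = np.asarray(residuals, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))


def ar1_residuals(n, rho, seed=0):
    """Synthetic AR(1) residuals (hypothetical data, not from the paper)."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n)
    e = np.empty(n)
    e[0] = noise[0]
    for t in range(1, n):
        e[t] = rho * e[t - 1] + noise[t]
    return e


# Uncorrelated residuals: DW should land close to 2.
dw_white = durbin_watson(ar1_residuals(5000, rho=0.0))
# Positively autocorrelated residuals: DW should land close to 2 * (1 - 0.45).
dw_ar = durbin_watson(ar1_residuals(5000, rho=0.45))
print(f"white noise DW = {dw_white:.2f}, AR(1) rho=0.45 DW = {dw_ar:.2f}")
```

Under this reading, a DW of 1.0–1.2 would correspond to a lag-1 residual autocorrelation of roughly 0.4 to 0.5, which is consistent with the abstract's description of persistent positive autocorrelation on the subsets.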
Yi Ru-Ya Zhang
Junting Qian
Bangkokthonburi University
Qian Zhang
Minzu University of China
Journal of King Saud University - Computer and Information Sciences
Suzhou Polytechnic Institute of Agriculture
Zhang et al. studied this question.
synapsesocial.com/papers/69edab424a46254e215b3610 — DOI: https://doi.org/10.1007/s44443-026-00699-2