April 26, 2026 · Open Access

AtG-ContextNet: a temporal attention and hybrid gating architecture for facial blendshape coefficient regression

Key Points

This research develops a regression framework, built on a novel architecture, for accurately predicting facial blendshape coefficients that drive facial expressions.
The proposed AtG-ContextNet integrates a temporal attention mechanism with a Hybrid Gating System.
A region-aware autoencoder encodes 468 landmarks into a 32-dimensional latent vector, and an eight-head attention mechanism captures long-range dependencies across 12-frame sequences.
The model was fine-tuned and evaluated separately on each target subset (C1–C3) of the 300-VW dataset.
After fine-tuning, it achieved R² values of up to 0.935 and a PSNR exceeding 26 dB.
On the training set, a Durbin-Watson (DW) statistic of 1.97 indicated negligible residual autocorrelation, which the authors interpret as unbiased estimation.
Subset-specific evaluations on C1–C3 showed degraded DW values of 1.0–1.2, suggesting persistent positive autocorrelation in challenging scenarios.
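
For readers unfamiliar with the three metrics reported above (R², PSNR, and the DW statistic), the following is a minimal sketch of their textbook definitions in plain Python. It is not the authors' evaluation code. The function names are hypothetical, and the `peak` value of 1.0 assumes blendshape coefficients normalized to [0, 1]; the page does not state how the paper defines the PSNR signal range.

```python
import math
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes y_true is not constant (otherwise SS_tot is zero).
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def psnr(y_true: Sequence[float], y_pred: Sequence[float], peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB.

    `peak` is the assumed maximum signal value (1.0 here, an assumption).
    """
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(peak ** 2 / mse)


def durbin_watson(residuals: Sequence[float]) -> float:
    """Durbin-Watson statistic: sum of squared successive residual
    differences divided by the sum of squared residuals.

    Values near 2 suggest no first-order autocorrelation; values well
    below 2 suggest positive autocorrelation.
    """
    num = sum((residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den
```

Under this definition, a DW value of 1.97 sits close to the no-autocorrelation reference of 2, while values of 1.0–1.2 fall on the positive-autocorrelation side, which matches how the key points read the two results.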

Abstract

Driving facial expressions is critical for digital animation, yet resource-constrained systems still rely heavily on linear blendshape models. Existing methods struggle with feature redundancy in high-dimensional landmark prediction and with numerical instability in extreme expression regions. To address these issues, we propose Attention-Gate ContextNet (AtG-ContextNet), a regression framework that integrates a temporal attention mechanism with a Hybrid Gating System. AtG-ContextNet uses a region-aware autoencoder to encode 468 landmarks into a 32-dimensional latent vector, followed by an eight-head attention mechanism that captures long-range dependencies across 12-frame sequences. The Hybrid Gating System dynamically fuses multiplicative scaling and additive shifting via a learnable coefficient, while first-order derivative compensation ensures temporal coherence. Additionally, a clamping operator and a saturation-resistant loss are employed to stabilize boundary regions. After fine-tuning on each target subset (C1–C3) of the 300-VW dataset, AtG-ContextNet achieved R² values of up to 0.935 and a PSNR exceeding 26 dB, outperforming baselines such as Transformer Utilizing Cross-Dimension Dependency for Multivariate Time Series Forecasting (Crossformer), Sparsely-Gated Mixture-of-Experts (MoE), Topologically Consistent Reweighting for XGBoost (TCR-XGBoost), and DLinear across all training subsets. On the training set, AtG-ContextNet achieved a Durbin-Watson (DW) statistic of 1.97, indicating negligible residual autocorrelation and confirming unbiased estimation. However, subset-specific evaluations revealed degraded DW values (1.0–1.2) across C1–C3, suggesting persistent positive autocorrelation in challenging scenarios. While ablation studies validate the architectural components, limitations in gate activation granularity and sparsity control remain to be addressed. Visual contrast validation provides additional intuitive evidence for the model's stability in expression parameter mapping. Future work will focus on optimizing sparse representations and incorporating biomechanical constraints to further enhance physical plausibility and performance.
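
The abstract describes the Hybrid Gating System only at a high level, so the sketch below shows one possible reading of "fusing multiplicative scaling and additive shifting via a learnable coefficient", followed by a clamping step. It is an illustration under stated assumptions, not the published formulation: the convex blend through a sigmoid, the [0, 1] clamping range, and every name (`hybrid_gate`, `mix_logit`, `scale`, `shift`, `clamp`) are hypothetical.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Logistic function, used here to keep the mixing coefficient in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def hybrid_gate(
    features: Sequence[float],
    scale: Sequence[float],
    shift: Sequence[float],
    mix_logit: float,
) -> List[float]:
    """Blend a multiplicative branch (x * scale) with an additive branch
    (x + shift) using one coefficient alpha = sigmoid(mix_logit).

    In a trained model, `scale`, `shift`, and `mix_logit` would be learned;
    here they are plain inputs. The convex blend is an assumption.
    """
    alpha = sigmoid(mix_logit)
    return [
        alpha * (x * s) + (1.0 - alpha) * (x + b)
        for x, s, b in zip(features, scale, shift)
    ]


def clamp(values: Sequence[float], lo: float = 0.0, hi: float = 1.0) -> List[float]:
    """Restrict outputs to [lo, hi]; the [0, 1] default is an assumed
    valid range for blendshape coefficients."""
    return [min(max(v, lo), hi) for v in values]


# Usage: with mix_logit = 0 the two branches are weighted equally.
gated = hybrid_gate([0.5, 2.0], scale=[2.0, 1.0], shift=[0.1, -0.5], mix_logit=0.0)
bounded = clamp(gated)
```

A single scalar coefficient keeps the example short; a per-channel or input-dependent coefficient would be an equally plausible reading of the abstract.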

Authors

Yi Ru-Ya Zhang

Junting Qian

Bangkokthonburi University

Qian Zhang

Minzu University of China

Journals

Journal of King Saud University - Computer and Information Sciences

Institutions

Minzu University of China

Suzhou Polytechnic Institute of Agriculture

Bangkokthonburi University
