In response to the redundant spatio-temporal modeling and insufficient adaptation to dynamic decision-making in eye-tracking interaction of smart home interfaces, a smart home interface eye-tracking response optimization model based on spatio-temporal Transformer and gate control cross-attention is proposed. It adapts the physiological characteristics of eye-tracking jumps through dynamic sparse attention gating to compress computational redundancy and combines multi-objective reinforcement learning attention modulation to construct a closed-loop decision-making mechanism, optimizing interface parameters in real-time. Experiments showed that the model reduced eye-tracking trajectory prediction error by 23.7% compared to advanced benchmarks, increased the success rate of adapting to dynamic mutation scenarios to 89.2%, and controlled performance fluctuations within 2.3% under noise interference. In high-fidelity user testing, the accuracy of cross-task gaze transfer reached 93.4%, the failure rate of glare interference was optimized to 2.4%, and the user cognitive load index was reduced by 27.9%. Its resource consumption and energy consumption were reduced by 26.7% and 44.9%, respectively, while its posture deviation tolerance remained at 3.5°. The sparse spatio-temporal modeling of the spatio-temporal adaptive Transformer module and the enhanced gating mechanism of the hierarchical gated cross-attention module work together to break through the limitations of traditional methods in computational efficiency and dynamic feedback, providing high-precision and low-latency eye-tracking interaction solutions for smart home interface systems, and promoting the practical evolution of personalized human–machine collaborative control.
Lu et al. (Wed,) studied this question.
Synapse has enriched 5 closely related papers on similar clinical questions. Consider them for comparative context: