Robotic Spatial Augmented Reality (RSAR) systems present a unique control challenge because their end-effector is a projection whose final position depends on both the actuator’s pose and the geometry of the external environment. Accurately controlling this projection first requires predicting the 6-DOF pose of a projector-camera unit from joint angles; however, loose kinematic specifications in many RSAR setups make precise analytical models unavailable for this task. This study proposes a deep learning model that combines Long Short-Term Memory (LSTM) with an attention mechanism (LSTM–Attention) to estimate the forward kinematics of a 2-axis pan-tilt actuator. To ensure a fair evaluation of intrinsic model performance, a simulation framework built on Unity and the Unified Robot Description Format (URDF) was developed to generate a noise-free benchmark dataset. The proposed model uses a multi-task learning architecture with a geodesic distance loss function to optimize the 3-dimensional position and the 4-dimensional quaternion rotation separately. Quantitative results show that the proposed LSTM–Attention model achieved the lowest errors (position MAE: 18.00 mm; rotation MAE: 3.723 deg), outperforming baseline models such as Random Forest by 9.5% in position and 17.6% in rotation. Qualitative analysis further confirmed its superior stability and outlier suppression. The results indicate that the proposed LSTM–Attention architecture is an effective and accurate approach to modeling the complex non-linear kinematics of RSAR systems.
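The abstract does not give the exact form of the loss, so the following is only a minimal sketch of one common way to pair a position error with a geodesic distance on unit quaternions. The function names, the `(w, x, y, z)` component ordering, and the weights `w_pos` and `w_rot` are assumptions made for illustration and are not taken from the paper.

```python
import math


def normalize(q):
    """Scale a quaternion (w, x, y, z) to unit length."""
    n = math.sqrt(sum(c * c for c in q))
    if n == 0.0:
        raise ValueError("zero-norm quaternion")
    return tuple(c / n for c in q)


def geodesic_angle(q_pred, q_true):
    """Rotation angle in radians between two orientations.

    Uses 2 * arccos(|<q_pred, q_true>|). The absolute value is needed
    because q and -q describe the same rotation.
    """
    p = normalize(q_pred)
    t = normalize(q_true)
    dot = abs(sum(a * b for a, b in zip(p, t)))
    dot = min(1.0, dot)  # guard against rounding slightly above 1
    return 2.0 * math.acos(dot)


def position_mae(p_pred, p_true):
    """Mean absolute error over the x, y, z components."""
    return sum(abs(a - b) for a, b in zip(p_pred, p_true)) / len(p_true)


def multitask_loss(p_pred, q_pred, p_true, q_true, w_pos=1.0, w_rot=1.0):
    """Weighted sum of the two task losses (weights are hypothetical)."""
    return (w_pos * position_mae(p_pred, p_true)
            + w_rot * geodesic_angle(q_pred, q_true))
```

Keeping the two terms separate lets the position head be penalized in length units and the rotation head in angle units, and the absolute value in the quaternion dot product stops the loss from penalizing a prediction that differs from the target only in sign.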
Jang et al. (Tue,) studied this question.