March 3, 2026 · Open Access

Closer to Human: Hybrid Training of VR Agents Beyond Robotic Motion

Key Points

Hybrid-trained VR agents achieved stable convergence and non-repetitive behaviors, indicating enhanced adaptability.
Dynamic Time Warping and Fréchet distance were used to compare agent motion against human demonstrations with consistent results.
The framework integrates behavioral cloning for rapid policy initialization, Proximal Policy Optimization for reinforcement learning, and Generative Adversarial Imitation Learning for imitation-based rewards.
Findings suggest a foundation for applying the methodology to complex multi-joint and full-body motion learning.
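
To make the two trajectory-similarity metrics named above concrete, the following is a minimal sketch of Dynamic Time Warping and the discrete Fréchet distance between two sampled trajectories. It is not the study's implementation: the function names, the Euclidean point distance, and the unconstrained warping window are all assumptions chosen for illustration.

```python
import math


def dtw_distance(a, b):
    """Dynamic Time Warping cost between two point sequences (O(n*m)).

    Assumption: Euclidean distance between samples, no warping-window limit.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # advance in a only
                                 cost[i][j - 1],      # advance in b only
                                 cost[i - 1][j - 1])  # advance in both
    return cost[n][m]


def discrete_frechet(a, b):
    """Discrete Fréchet distance: the smallest achievable maximum gap
    over all monotone couplings of the two sequences."""
    n, m = len(a), len(b)
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(a[i], b[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                ca[i][j] = max(min(ca[i - 1][j], ca[i][j - 1], ca[i - 1][j - 1]), d)
    return ca[n - 1][m - 1]


# Hypothetical 3D hand paths: the "agent" path is the "human" path shifted by 1 unit.
human = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
agent = [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (2.0, 1.0, 0.0)]
dtw_value = dtw_distance(human, agent)
frechet_value = discrete_frechet(human, agent)
```

DTW sums the matched distances along the best alignment, so it reflects accumulated temporal and spatial mismatch, whereas the Fréchet distance reports only the worst-case gap. Using both gives complementary views of how closely an agent trajectory follows a demonstration.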

Abstract

The success of virtual reality (VR) agents is not solely defined by task completion; adaptability and the ability to move beyond repetitive, robotic behavior are equally critical for embodiment, social interaction, rehabilitation, and training. To address this challenge, we propose a hybrid training methodology that integrates imitation learning and reinforcement learning to develop agents that both master tasks and generalize beyond demonstrations. Our framework combines Behavioral Cloning (BC) for rapid policy initialization, Proximal Policy Optimization (PPO) for stable reinforcement-driven learning, and Generative Adversarial Imitation Learning (GAIL) for imitation-based rewards, resulting in policies that converge efficiently while avoiding repetitive imitation. We implemented this framework in a Unity-based VR environment, where participants performed a single-joint cube pick-and-place task using a headset and controller. Human demonstrations were collected and used to train and evaluate agents against multiple motion metrics, including Dynamic Time Warping (DTW), Fréchet distance, minimum jerk cost, Fitts’s law, and curvature–velocity coupling. Experimental results showed that hybrid-trained agents achieved stable convergence, consistent task mastery, and motion trajectories comparable to human demonstrations across temporal, spatial, and smoothness dimensions. Notably, the agents exhibited adaptive, non-repetitive behavior without explicitly optimizing for biomechanical fidelity, suggesting that human-like variability can emerge naturally from our customized hybrid IL+RL training. These findings provide a foundation for scaling the methodology toward multi-joint and full-body motion learning, with promising implications for future VR applications in rehabilitation, training, and embodied interaction.
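
Two of the smoothness and timing metrics listed in the abstract can be sketched in a few lines. The example below is illustrative only and makes several assumptions not stated in the source: positions are sampled at a uniform interval `dt`, jerk is approximated with a third-order finite difference, the cost is the unscaled integral of squared jerk (some formulations include a factor of 1/2), and Fitts's law uses the Shannon formulation of the index of difficulty. All function names are hypothetical.

```python
import math


def jerk_cost(positions, dt):
    """Approximate integral of squared jerk magnitude along a trajectory.

    positions: sequence of equal-length coordinate tuples sampled every dt seconds.
    Jerk is estimated by the third forward difference divided by dt**3.
    """
    cost = 0.0
    for k in range(len(positions) - 3):
        p0, p1, p2, p3 = positions[k:k + 4]
        jerk_sq = 0.0
        for c0, c1, c2, c3 in zip(p0, p1, p2, p3):
            j = (c3 - 3 * c2 + 3 * c1 - c0) / dt ** 3
            jerk_sq += j * j
        cost += jerk_sq * dt  # rectangle-rule integration
    return cost


def fitts_index_of_difficulty(distance, width):
    """Index of difficulty in bits, Shannon formulation: log2(D / W + 1)."""
    return math.log2(distance / width + 1.0)


def fitts_movement_time(distance, width, a, b):
    """Predicted movement time MT = a + b * ID, with a and b fitted per dataset."""
    return a + b * fitts_index_of_difficulty(distance, width)


# Constant-velocity motion has zero jerk; x(t) = t**3 has constant jerk 6.
straight = [(float(t), 0.0, 0.0) for t in range(6)]
cubic = [(float(t ** 3),) for t in range(5)]
straight_cost = jerk_cost(straight, dt=1.0)
cubic_cost = jerk_cost(cubic, dt=1.0)
reach_id = fitts_index_of_difficulty(distance=7.0, width=1.0)
```

A lower jerk cost indicates a smoother movement, and comparing agent and human costs on the same task gives a rough smoothness comparison. The Fitts coefficients `a` and `b` must be fitted to measured movement times, so no values are suggested here.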
