Abstract
Understanding how groups regulate their learning together requires more attention to the verbal and nonverbal cues that shape collaborative activity. This study investigates how gaze and proxemics shape and signal socially shared regulation of learning during a classroom-based collaborative inquiry task. In their natural classroom environment, 62 secondary school students worked in small groups during a physics task while video and audio were recorded. Gaze and proxemic behaviors were extracted from standard two-dimensional video through automated computer vision techniques, and challenge and regulation processes were identified using the trigger regulation framework. Transmodal ordered network analysis was then used to examine the temporal relationships among embodied cues and regulatory processes across different spatial configurations and artificial intelligence support conditions. The results show that gaze and proxemics act as functional components of regulation. When groups were physically distant, mutual gaze signaled emerging challenges and preceded monitoring. When groups were physically close, joint attention supported transitions from monitoring to selecting and enacting strategies. Adaptive artificial intelligence support strengthened cycles of shared monitoring, while static support produced more procedural patterns of strategy use. The findings advance understanding of embodied regulation in authentic classrooms and demonstrate a nonintrusive methodological approach for investigating multimodal in situ collaborative learning.
Whitehead et al. (Wed,) studied this question.