To construct a framework for annotating the process of understanding the meaning of gestures in spoken conversations in a straightforward and versatile way, we conducted sequential annotations of embodied actions on movement scenes from a multimodal corpus of conversations between science communicators and visitors at the National Museum of Emerging Science and Innovation (Miraikan SC corpus). This paper introduces the purpose and outline of the sequential annotations of embodied actions and presents the results of quantitative and qualitative analyses based on these annotations. For a sequence in which a science communicator (SC) prompts a visitor to move using some movement or utterance and the visitor follows and begins to move, the SC’s first action and the visitor’s second action were annotated. The SC’s first actions were annotated as walking, pointing, changing body orientation, speech, and/or gestures. The quantitative analysis results suggest that by reviewing the annotation data, it is possible to identify typical patterns of movement and utterance combinations that constitute the “gestures” that the SC used to prompt visitors to move. Meanwhile, the qualitative analysis results suggest that a detailed analysis of cases with a significant time lag between the onset of the first and second actions would enable the identification of the relevance and diversity of the causes of the time lag.
Sakaida et al. (Sat,) studied this question.