What does this research mean for the field?

A multimodal knowledge-graph framework integrating visual, textual, and performative data can effectively support structured cross-cultural narrative generation for shadow puppetry, achieving high semantic alignment and synchronization accuracy. Novelty: methodological. Consensus alignment: neutral.

What question did this study set out to answer?

This research aims to develop a computational framework to represent multimodal knowledge for shadow play narratives across cultures.

June 4, 2026 | Open Access

Cross-Cultural Interactive Generation with a Multimodal Knowledge Graph for Shadow Play Narratives

Key Points

Aimed to develop a computational framework that represents multimodal knowledge for shadow play narratives across cultures.
Developed a multimodal knowledge-graph framework using Faster R-CNN, BiLSTM-CRF, OpenPose-GCN, and a VGG16-BERT dual encoder.
Ran experiments to validate semantic alignment and synchronization accuracy.
Evaluated user retention for the dynamic narrative case in the tested sample.
Achieved 91.8% semantic alignment in narrative generation.
Obtained 87.6% synchronization accuracy across repeated trials.
Recorded an 81.5% user retention rate in the dynamic narrative case.

Abstract

The digital preservation of intangible cultural heritage requires computational models that can jointly represent visual, textual, and performative knowledge while remaining interpretable in cross-cultural settings. Using Chinese shadow puppetry as the application domain, this study develops a multimodal knowledge-graph framework that combines Faster R-CNN, BiLSTM-CRF, OpenPose-GCN, SimplE embedding, a VGG16-BERT dual encoder, and Sinkhorn-based temporal alignment to organize image, text, and action information in a unified structure. The empirically validated contribution of the framework lies in multimodal knowledge-graph construction, cross-modal alignment, and adaptive narrative selection; generative rendering and blockchain traceability are retained as extensible system modules rather than the sole basis of the quantitative claims. In the reported experiments, the framework achieved 91.8% semantic alignment and 87.6% synchronization accuracy across repeated trials, while the dynamic narrative case reached an 81.5% user retention rate in the tested sample. These findings suggest that multimodal knowledge graphs can support structured cross-cultural narrative generation for shadow puppetry, while the current evidence should still be interpreted within the limits of the available corpus, user sample size, and partial reproducibility.
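To make the "Sinkhorn-based temporal alignment" named in the abstract more concrete, the following is a minimal sketch of generic Sinkhorn iteration for entropy-regularized optimal transport. It is not the study's implementation: the abstract does not state the cost function, marginals, or regularization strength, so the function name `sinkhorn`, the parameters `epsilon` and `n_iters`, the uniform marginals, and the toy cost matrix (absolute index distance between three action segments and three text segments) are all assumptions made for illustration.

```python
import math


def sinkhorn(cost, epsilon=0.1, n_iters=200):
    """Return a soft alignment (transport plan) for a cost matrix.

    Assumption: both sequences carry uniform mass; the study may weight
    segments differently.
    """
    n, m = len(cost), len(cost[0])
    # Gibbs kernel: low cost -> weight near 1, high cost -> weight near 0.
    kernel = [[math.exp(-c / epsilon) for c in row] for row in cost]
    a = [1.0 / n] * n  # mass per segment of the first sequence
    b = [1.0 / m] * m  # mass per segment of the second sequence
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Hypothetical toy input: cost grows with the temporal distance |i - j|.
cost = [[abs(i - j) for j in range(3)] for i in range(3)]
plan = sinkhorn(cost)

# Hard alignment: for each action segment, the text segment with most mass.
matches = [max(range(3), key=lambda j: plan[i][j]) for i in range(3)]
```

In a real pipeline the cost matrix would more plausibly come from distances between modality embeddings (for example, outputs of the dual encoder) than from index distance, and a smaller `epsilon` yields a sharper, closer-to-one-to-one alignment at the price of numerical stability.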
