What question did this study set out to answer?

The study aims to create an efficient AI framework that analyzes mythological content in video games.

June 12, 2026 | Open Access

A Multimodal AI Framework for the Analysis of Mythological Narratives in Video Games Using Computer Vision, NLP and Retrieval-Augmented Generation

Key Points

The study aims to create an efficient AI framework that analyzes mythological content in video games.
Integrated computer vision, NLP, and retrieval-augmented generation methods.
Analyzed visual content, textual documentation, and gameplay video data.
Emphasized transparency and reproducibility with open-source models.
The framework successfully identified and extracted entities and themes from diverse data sources.
Demonstrated improved analytical depth and computational efficiency in multimedia environments.
Facilitated the creation of a multimodal database for mixed-methods research applications.

Abstract

This study proposes a practical, low-resource artificial intelligence methodological framework that integrates computer vision, vision-language models, natural language processing, and retrieval-augmented generation to identify, extract, and analyse mythological content embedded within contemporary video games. The methodology examines visual content (such as cover art and gameplay screenshots), textual documentation (including game manuals), and gameplay video data (comprising player-generated footage and emulated gameplay). This approach facilitates the extraction of entities, themes, and narrative structures whilst balancing analytical depth, scalability, and computational efficiency. Eight complementary techniques are tested across these three modalities. Particular emphasis is placed on transparency and reproducibility through the use of open-source or freely accessible multimodal models and automated text-processing pipelines. By combining visual understanding, semantic retrieval, and entity extraction techniques, the framework enables the integration of heterogeneous data sources into a unified analytical workflow. The findings further highlight the potential of emerging multimodal AI technologies for the automated analysis of cultural, historical, and narrative content across complex multimedia environments. Ultimately, this methodological pipeline facilitates the construction of a comprehensive multimodal database (encompassing visual, textual, and video assets) that serves as the foundation for reproducible mixed-methods research, as applied in the author’s doctoral thesis, ῾Ο ἐπιζών: The Survival of Homeric Herakles in Video Games.
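
The combination of semantic retrieval and entity extraction described in the abstract can be illustrated with a toy sketch. This is not the study's pipeline, which relies on open-source multimodal models. It is a standard-library stand-in that assumes a bag-of-words cosine similarity for retrieval and a small gazetteer for entity matching. The gazetteer `MYTH_ENTITIES`, the sample passages, and all function names are hypothetical and were invented for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical gazetteer of mythological names; not taken from the study.
MYTH_ENTITIES = {"herakles", "zeus", "hydra", "cerberus"}


def tokenize(text):
    """Lowercase a string and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, passages, k=1):
    """Return the k passages most similar to the query (a toy retrieval step)."""
    q = Counter(tokenize(query))
    ranked = sorted(
        passages,
        key=lambda p: cosine(q, Counter(tokenize(p))),
        reverse=True,
    )
    return ranked[:k]


def extract_entities(text):
    """Return gazetteer entries that occur in the text, sorted alphabetically."""
    return sorted(set(tokenize(text)) & MYTH_ENTITIES)


# Invented stand-ins for manual text and a screenshot caption.
passages = [
    "Press START to open the inventory and save your progress.",
    "Herakles must defeat the Hydra before Zeus grants him passage.",
    "Caption: a three-headed hound, Cerberus, guards the gate.",
]

top = retrieve("Which labours does Herakles face?", passages, k=1)
entities = extract_entities(top[0])
print(top[0])
print(entities)
```

In a fuller workflow the word-count vectors would presumably be replaced by embeddings from an open-source model and the gazetteer by a trained entity recogniser; the sketch only shows how text from different sources can pass through one retrieval and extraction path.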


Rita Tegon studied this question.

synapsesocial.com/papers/6a2ba58f8101cf8926f03738 https://doi.org/10.5281/zenodo.20628084
