What question did this study set out to answer?

How can the alignment between linguistic commands and game features be improved in procedural content generation via reinforcement learning (PCGRL)?

April 23, 2026 · Open Access

Multi-task procedural content generation with reinforcement learning

Key Points

Aimed to improve the alignment between linguistic commands and game features in procedural content generation via reinforcement learning (PCGRL).
Designed a multi-task, language-based framework built on a DeBERTa encoder.
Applied a multi-objective training scheme combining regression, contrastive alignment, and hybrid learning.
Created a structured dataset of over 14,000 command-level pairs in the Super Mario environment to evaluate single-task, collective, combinatorial, paraphrase, and extra-domain generalization.
Outperformed BERT-based methods in command following.
Improved the semantic stability of generated levels relative to BERT-based methods.
Achieved higher structural diversity in generated levels than BERT-based methods.

Abstract

This paper presents a multi-task language-based framework for procedural content generation through reinforcement learning, which aims to improve the semantic alignment between linguistic commands and quantitative game surface features. While most previous methods in PCGRL have relied on numerical conditioning, the proposed approach, using a DeBERTa encoder and a multi-objective training scheme including regression, contrastive alignment, and hybrid learning, attempts to extract meaningful, generalizable, and structured representations of natural commands. To evaluate this framework, a structured dataset consisting of over 14,000 command-level pairs in the Super Mario environment is designed, which allows for the examination of single-task, collective, combinatorial, paraphrase, and extra-domain generalization. Experimental results show that the proposed model outperforms BERT-based methods in command following, semantic stability, and structural diversity of generated levels. The findings show that separating the semantic components of language and multi-objective training can be an effective step towards producing controllable, interpretable content that is aligned with human intent in PCGRL systems.
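The abstract names regression, contrastive alignment, and hybrid learning as training objectives but does not give their exact form. The following is a minimal sketch of how such objectives are commonly combined, not the authors' implementation. It assumes mean squared error for the regression term, an InfoNCE-style loss for the contrastive term, and a hypothetical mixing weight `alpha`; the function names, the `temperature` parameter, and the plain-list embeddings are all illustrative assumptions.

```python
import math


def mse_loss(pred, target):
    """Regression term: mean squared error over all feature values."""
    count = sum(len(row) for row in pred)
    total = sum((a - b) ** 2
                for p_row, t_row in zip(pred, target)
                for a, b in zip(p_row, t_row))
    return total / count


def _normalize(vec):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def contrastive_loss(text_emb, level_emb, temperature=0.1):
    """InfoNCE-style term (assumed form): command i should be closest
    to level-feature embedding i among all embeddings in the batch."""
    text = [_normalize(v) for v in text_emb]
    level = [_normalize(v) for v in level_emb]
    total = 0.0
    for i, t in enumerate(text):
        logits = [sum(a * b for a, b in zip(t, l)) / temperature
                  for l in level]
        peak = max(logits)  # subtract the max for numerical stability
        log_denom = peak + math.log(sum(math.exp(z - peak) for z in logits))
        total += log_denom - logits[i]
    return total / len(text)


def hybrid_loss(pred, target, text_emb, level_emb,
                alpha=0.5, temperature=0.1):
    """Hypothetical hybrid objective: weighted sum of both terms."""
    return (alpha * mse_loss(pred, target)
            + (1.0 - alpha) * contrastive_loss(text_emb, level_emb,
                                               temperature))
```

In this sketch, `alpha=1.0` reduces the objective to pure regression and `alpha=0.0` to pure contrastive alignment; the contrastive term is small when each command embedding points toward its paired level embedding and large when the pairs are mismatched.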


Cite This Study

Nekahdari et al. studied this question.

synapsesocial.com/papers/69e9b62685696592c86eaef2 https://doi.org/10.1038/s41598-026-48234-7
