What type of study is this?

This is a quantitative study.

September 24, 2025 | Open Access

Game Reasoning Arena: A Framework and Benchmark for Assessing Reasoning Capabilities of Large Language Models via Game Play

Key Points

This framework enables systematic comparisons of decision-making abilities in large language models using board games.
The library supports various agent types including random and reinforcement learning agents, offering diverse game scenarios.
The library integrates API access to models via liteLLM and local model deployment via vLLM, with distributed execution through Ray.
The work contributes to the empirical evaluation of reasoning in large language models and to the understanding of their game-theoretic behaviour.

Abstract

The Game Reasoning Arena library provides a framework for evaluating the decision-making abilities of large language models (LLMs) through strategic board games implemented in Google's OpenSpiel library. By wrapping multiple board and matrix games and supporting different agent types, the framework enables systematic comparisons between LLM-based agents and other agents (random, heuristic, reinforcement learning, etc.) across various game scenarios. It integrates API access to models via liteLLM and local model deployment via vLLM, and offers distributed execution through Ray. This paper summarises the library's structure, key characteristics, and motivation, highlighting how it contributes to the empirical evaluation of LLM reasoning and game-theoretic behaviour.
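To make the comparison setup described in the abstract concrete, the following is a minimal, dependency-free sketch of pitting a prompt-driven agent against a random agent in a matrix game. It is not the Game Reasoning Arena or OpenSpiel API: the names `RandomAgent`, `PromptAgent`, `play_matches`, and `always_defect`, as well as the Prisoner's Dilemma payoff values, are assumptions chosen for illustration. The `complete` callable stands in for whatever text-in/text-out model backend one would plug in.

```python
import random
from typing import Callable, List, Protocol, Tuple

# Hypothetical one-shot Prisoner's Dilemma payoffs:
# (row action, column action) -> (row payoff, column payoff)
ACTIONS: List[str] = ["cooperate", "defect"]
PAYOFFS = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}


class Agent(Protocol):
    """Anything that can pick one of the legal actions."""

    def act(self, legal_actions: List[str]) -> str: ...


class RandomAgent:
    """Baseline agent that picks uniformly at random."""

    def __init__(self, seed: int = 0) -> None:
        self._rng = random.Random(seed)

    def act(self, legal_actions: List[str]) -> str:
        return self._rng.choice(legal_actions)


class PromptAgent:
    """Stand-in for an LLM-backed agent (illustrative, not the library's class).

    `complete` is any callable mapping a prompt string to a reply string.
    """

    def __init__(self, complete: Callable[[str], str]) -> None:
        self._complete = complete

    def act(self, legal_actions: List[str]) -> str:
        prompt = "Choose exactly one action from: " + ", ".join(legal_actions)
        reply = self._complete(prompt).strip().lower()
        # Fall back to the first legal action when the reply is not legal.
        return reply if reply in legal_actions else legal_actions[0]


def play_matches(row: Agent, col: Agent, episodes: int) -> Tuple[float, float]:
    """Play repeated one-shot games and return each player's average payoff."""
    row_total, col_total = 0, 0
    for _ in range(episodes):
        outcome = PAYOFFS[(row.act(ACTIONS), col.act(ACTIONS))]
        row_total += outcome[0]
        col_total += outcome[1]
    return row_total / episodes, col_total / episodes


def always_defect(prompt: str) -> str:
    """Toy 'model' that ignores the prompt; replace with a real backend call."""
    return "defect"


if __name__ == "__main__":
    row_avg, col_avg = play_matches(PromptAgent(always_defect), RandomAgent(seed=0), 100)
    print(f"prompt agent: {row_avg:.2f}, random agent: {col_avg:.2f}")
```

Keeping the model behind a plain callable means the same match loop can compare a hosted model, a locally served model, or a scripted baseline without changing the game code.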

Cite This Study

Cipolina-Kun et al. studied this question.

synapsesocial.com/papers/68d6e16f8b2b6861e4c4004b https://doi.org/10.48550/arxiv.2508.03368
