What question did this study set out to answer?

To investigate the effectiveness of large language models in automating the reconstruction of domain models in legacy systems.

March 7, 2026 · Open Access

AI-Driven Refactoring: Semantic Reconstruction of Domain Models Using LLM Reasoning

Key Points

Investigated how effectively large language models can automate the reconstruction of domain models in legacy systems.
Started the reconstruction process from a deliberately disordered JSON representation of the legacy model.
Employed GPT-5.2 as the LLM for diagnosis and restructuring.
Organized the domain model according to Domain-Driven Design principles.
Compared candidate LLMs, informed by SWE-bench and LiveCodeBench, to select the primary model.
The LLM identified and corrected domain model issues such as misplaced methods and inconsistent naming.
Achieved semantically consistent refactoring, reducing the manual effort required.
Produced a categorized catalogue of corrections to support model refinement.

Abstract

This study examines the application of large language models (LLMs) to automating domain layer reconstruction in legacy systems, through a case study in water consumption management. The process begins with a deliberately disordered JSON representation that conflates domain, application, and infrastructure concerns. GPT-5.2 was used to identify misplaced methods, inconsistent naming, DTO misuse, incoherent aggregates, and unrelated modules, and then to reorganize the model into a structure aligned with Domain-Driven Design (DDD), comprising entities, value objects, aggregates, domain services, domain events, and repositories. The methodology involves encoding the legacy model as JSON, applying an LLM-based diagnosis and reconstruction pipeline, and producing both a refined domain model and a categorized catalogue of corrections. A comparative analysis of candidate LLMs, informed by recent code-centric benchmarks such as SWE-bench and LiveCodeBench, supports the selection of GPT-5.2 as the primary model for this study. The findings indicate that the LLM can quickly recover key domain concepts and achieve semantically consistent refactoring, a task that typically requires extensive manual effort. This suggests that LLM-assisted domain reconstruction is a promising adjunct to traditional refactoring practices and can facilitate continuous architectural improvement in organizations.
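To make the inputs and outputs of that pipeline concrete, the following is a minimal sketch, not the study's implementation. The JSON schema, the class and member names, the correction categories, and the helper functions are all hypothetical and chosen only for illustration; the LLM call itself is omitted, and the corrections are hard-coded in the shape a diagnosis step might plausibly return.

```python
import json
from collections import defaultdict

# Hypothetical disordered legacy model: one class mixes domain logic,
# persistence, and notification concerns. The schema is an assumption.
LEGACY_MODEL_JSON = """
{
  "classes": [
    {
      "name": "WaterMeterDTO",
      "fields": ["meter_id", "reading_m3", "db_connection"],
      "methods": ["calculate_bill", "save_to_database", "send_email"]
    }
  ]
}
"""

# Hypothetical corrections, standing in for the output of an LLM diagnosis step.
CORRECTIONS = [
    {"category": "misplaced_method",
     "target": "WaterMeterDTO.save_to_database",
     "action": "move to a repository"},
    {"category": "misplaced_method",
     "target": "WaterMeterDTO.send_email",
     "action": "move to an application-level notification service"},
    {"category": "dto_misuse",
     "target": "WaterMeterDTO",
     "action": "split into a domain entity and a transport-only DTO"},
    {"category": "inconsistent_naming",
     "target": "WaterMeterDTO.reading_m3",
     "action": "rename to match the naming used by the other fields"},
]


def build_catalogue(corrections):
    """Group corrections by category, keeping input order within each group."""
    catalogue = defaultdict(list)
    for correction in corrections:
        catalogue[correction["category"]].append(correction)
    return dict(catalogue)


def known_names(model_json):
    """Collect every class name and qualified member name in the legacy model."""
    model = json.loads(model_json)
    names = set()
    for cls in model["classes"]:
        names.add(cls["name"])
        for member in cls["fields"] + cls["methods"]:
            names.add(f'{cls["name"]}.{member}')
    return names


def unknown_targets(model_json, corrections):
    """Return correction targets that do not exist in the legacy model."""
    names = known_names(model_json)
    return [c["target"] for c in corrections if c["target"] not in names]


if __name__ == "__main__":
    for category, items in build_catalogue(CORRECTIONS).items():
        print(category)
        for item in items:
            print(f'  {item["target"]}: {item["action"]}')
```

Checking each correction target against the names in the input model, as `unknown_targets` does, is one simple guard against an LLM proposing changes to elements that do not exist in the legacy model.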
