October 11, 2025

Language Models as Architectural Gatekeepers: Automating Conformance Checking from Natural Language

Key Points

Of the 30 generated design tests, 96.67% executed successfully.
63.33% of the tests correctly asserted the expected behaviors, and 76.67% accurately reflected the intended architectural rules.
A preliminary empirical study explored whether LLMs can transform informal, natural-language design rules into executable design tests.
Simplifying conformance checking may reduce reliance on manually written formal specifications and speed up development.

Abstract

Ensuring that implemented code adheres to its intended architecture (architectural conformance) remains a critical challenge in software engineering. While formal verification tools exist, their use is hindered by the overhead of explicitly defining formal architectural specifications. At the same time, valuable architectural decisions and design constraints are often embedded in informal channels, such as pull request discussions and issue trackers, where they are expressed in natural language rather than formal specifications. In this paper, we explore the potential of Large Language Models (LLMs) to bridge this gap by investigating whether they can transform informal design rules, drawn from real development discussions on GitHub, into design tests, that is, executable conformance checkers. Rather than relying on formal models, our approach derives design tests from implicit architectural decisions discussed in natural language. To investigate this, we conducted a preliminary empirical study in which we generated 30 design tests representing rules for 6 design patterns. Our results show the potential of such a strategy: 96.67% of the generated design tests execute successfully, 63.33% correctly assert expected behaviors, and 76.67% accurately reflect the intended architectural rules. This approach has the potential to simplify conformance checking by reducing the need to manually write formal specifications and tests. By leveraging existing development discussions, it makes the process more accessible and less time-consuming, and it supports early identification of architectural violations during development.
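
To make the idea of a design test concrete, the following is a minimal sketch of an executable conformance check. It is not taken from the study: the rule ("every class whose name ends in Strategy must define an execute method"), the function name classes_missing_method, and the sample classes are all hypothetical, and Python with its standard ast module is assumed purely for illustration.

```python
import ast

# Hypothetical source under test; in practice this would be read from a project file.
SAMPLE_SOURCE = """
class DiscountStrategy:
    def execute(self, price):
        return price * 0.9

class BrokenStrategy:
    def apply(self, price):
        return price
"""


def classes_missing_method(source: str, suffix: str, method: str) -> list[str]:
    """Return names of classes ending in `suffix` that do not define `method`."""
    tree = ast.parse(source)
    offenders = []
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef) and node.name.endswith(suffix):
            # Collect the methods defined directly in the class body.
            defined = {
                item.name
                for item in node.body
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
            }
            if method not in defined:
                offenders.append(node.name)
    return sorted(offenders)


# Illustrative rule: each *Strategy class must define execute().
violations = classes_missing_method(SAMPLE_SOURCE, "Strategy", "execute")
```

In this sketch the natural-language rule becomes a check that can run alongside ordinary unit tests; an empty violations list would mean the code conforms to the rule. Only methods defined directly in the class body are considered, so inherited methods would need additional handling.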
