Identifying architecturally relevant entities in textual artifacts is crucial for Traceability Link Recovery (TLR) between Software Architecture Documentation (SAD) and source code. While Software Architecture Models (SAMs) can bridge the semantic gap between these artifacts, their manual creation is time-consuming. Large Language Models (LLMs) offer new capabilities for extracting architectural entities to construct SAMs automatically or establish direct trace links. This paper extends our ICSA 2025 paper, which introduced ExArch for LLM-based architecture component name extraction, by contributing the novel ArTEMiS approach, an extended evaluation, and a combined evaluation of both approaches. ExArch extracts component names as simple SAMs from SAD and source code, while ArTEMiS identifies architectural entities in documentation and matches them with SAM entities. Our evaluation compares both approaches against the state-of-the-art approaches SWATTR, TransArC, and ArDoCode. TransArC achieves strong performance (F1: 0.87) but requires manually created SAMs; ExArch achieves comparable results (F1: 0.86) using only SAD and code. ArTEMiS matches SWATTR (F1: 0.81) and can replace it when integrated with TransArC. The combination of ArTEMiS and ExArch outperforms ArDoCode, the best baseline without manual SAMs. Our results demonstrate that LLMs can effectively enable automated SAM generation and TLR, making architecture-code traceability more practical and accessible.
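The F1 scores reported above are the harmonic mean of precision and recall over recovered trace links. As a minimal sketch of that standard metric (not the authors' evaluation code), the snippet below scores a set of predicted links against a gold standard. The `TraceLink` representation, the function name `f1_score`, and the sample identifiers are all assumptions made for illustration.

```python
from typing import Set, Tuple

# Hypothetical representation: (documentation sentence id, code element).
TraceLink = Tuple[str, str]


def f1_score(predicted: Set[TraceLink], gold: Set[TraceLink]) -> float:
    """Harmonic mean of precision and recall over two sets of trace links."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        # Covers empty inputs and fully disjoint sets without dividing by zero.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Illustrative data only; these links are invented for the example.
gold = {("s1", "Auth.java"), ("s2", "Db.java"), ("s3", "Ui.java")}
predicted = {("s1", "Auth.java"), ("s2", "Db.java"), ("s4", "Log.java")}

score = f1_score(predicted, gold)
print(f"F1 = {score:.2f}")
```

Here two of the three predicted links are correct and two of the three gold links are found, so precision and recall are both 2/3.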
Dominik Fuchß
Haoyu Liu
Sophie Corallo
Karlsruhe Institute of Technology
Fuchß et al. studied this question.
synapsesocial.com/papers/69d896166c1944d70ce075fa — DOI: https://doi.org/10.5445/ir/1000191991