Abstract. In the digital age, the omnipresence of software drives mass production fueled by artificial intelligence. This dependence calls for rigorous methods to assess software quality. Inspired by Proficiency Tests (PT) used in other fields, this work proposes an innovative process for evaluating the competence of software testing teams and laboratories. Grounded in Action Design Research (ADR) and Business Process Management (BPM), the solution includes software that automates the process, provides a knowledge base, and generates performance indicators. We also explore the use of Large Language Models (LLMs) to extract insights from PT results. Quantitative experiments demonstrated the effectiveness of the proposed process, and a qualitative study validated the solution, indicating its feasibility and its potential to transform software quality assessment.
Dallilo et al. studied this question.