Retrieval-Augmented Generation (RAG) systems are increasingly deployed as creative assistants in collaborative, knowledge-intensive work, where they intervene directly in Human-Centered Design processes, particularly during ideation and evaluation. Existing evaluation approaches focus on system-level performance parameters and content-related quality metrics but neglect how these properties shape the experienced human agency of individuals and teams: their capacity to define problems, generate ideas, and retain decision-making authority. This paper proposes a three-layer evaluation framework linking (1) system-layer design decisions, (2) content-related quality metrics (Answer Relevance, Context Relevancy, Context Coverage, Faithfulness), and (3) a perception layer focusing on human agency. While the system and content layers are necessary conditions for functional AI support, they are not sufficient: only the perception layer reveals whether RAG systems genuinely expand creative space or gradually erode human authorship and decision-making power. An upcoming scenario-based pilot study is designed to empirically examine the relationships between system configuration, output profiles, and experienced human agency in collaborative design work.
Reichardt et al. (Thu,) studied this question.
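To make the three-layer structure concrete, the following minimal Python sketch shows one way a single evaluation record could be organized. It is an illustration under stated assumptions, not the framework's actual instrument: the class and field names, the 0 to 1 metric range, the 1 to 7 agency scale, and both thresholds are hypothetical. Only the four metric names and the three agency facets (defining problems, generating ideas, retaining decision-making authority) are taken from the abstract.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SystemConfig:
    """Layer 1: system-layer design decisions (hypothetical fields)."""
    retriever: str
    top_k: int


@dataclass(frozen=True)
class ContentScores:
    """Layer 2: the four content metrics, assumed to lie in [0, 1]."""
    answer_relevance: float
    context_relevancy: float
    context_coverage: float
    faithfulness: float

    def meets(self, threshold: float) -> bool:
        # Every metric must clear the threshold; one weak metric fails the layer.
        lowest = min(self.answer_relevance, self.context_relevancy,
                     self.context_coverage, self.faithfulness)
        return lowest >= threshold


@dataclass(frozen=True)
class AgencyRating:
    """Layer 3: experienced human agency (hypothetical 1-7 self-report scale)."""
    problem_definition: int
    idea_generation: int
    decision_authority: int

    def mean(self) -> float:
        return (self.problem_definition + self.idea_generation
                + self.decision_authority) / 3


@dataclass(frozen=True)
class EvaluationRecord:
    config: SystemConfig
    content: ContentScores
    agency: Optional[AgencyRating] = None  # None until perception data is collected


def assess(record: EvaluationRecord,
           content_threshold: float = 0.7,   # hypothetical cut-off
           agency_threshold: float = 4.0) -> str:  # hypothetical cut-off
    """Encode 'necessary but not sufficient': good content scores alone
    never yield a positive verdict without perception-layer data."""
    if not record.content.meets(content_threshold):
        return "content layer insufficient"
    if record.agency is None:
        return "functional, agency unknown"
    if record.agency.mean() >= agency_threshold:
        return "functional, agency preserved"
    return "functional, agency at risk"
```

In this sketch, a record with strong content scores but no agency rating is reported as "functional, agency unknown", which mirrors the claim that the system and content layers cannot by themselves show whether creative space is expanded or eroded.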