Abstract
The advent of generative artificial intelligence (AI), such as large language models (LLMs), brings a vast range of possibilities and concerns for engineering design. The speed and efficiency of generative AI software can tempt designers to use these tools for the steps of the design process that are most time- and resource-intensive, such as conducting thorough user interviews. This study presents results from an experiment comparing four sets of interview data: real human interviews, designer-filtered interview data, simulated interview data from LLMs without demographic information about the interviewee, and simulated interview data from LLMs with demographic information about the interviewee. All interviews (real and artificial) used the same set of interview questions. The interviews were subsequently assessed for themes using three AI tools: BERTopic, Latent Dirichlet Allocation, and ChatGPT. Human design experts then clustered these themes into user needs, both to generate a comprehensive list of user needs and to compare patterns of user needs between human and AI interviewees. We find that the AI interviews rely heavily on the questions being asked and are unable to convey to the designer whether certain topics are irrelevant. On the other hand, we also find that the AI interviews can bring to the designers' attention some areas that were not initially present in the human interviews but may be relevant to human users; these could be verified via follow-up interviews with users.
Das et al. studied this question.