Effective assessment development requires collaboration among multidisciplinary team members, and the process is often time-intensive. This study illustrates a framework for integrating generative artificial intelligence (GenAI) as a collaborator in assessment design rather than as a fully automated tool. The context was the development of a 12-item multiple-choice test for social work interns in a school-based training program, guided by design-based research (DBR) principles. Psychometricians used ChatGPT to generate draft items, refined the outputs through structured prompts, and then convened a panel of five subject matter experts to evaluate content validity. Results showed that while most AI-assisted items were relevant, 75% required modification, with revisions focused on response option clarity, alignment with learning objectives, and item stems. These findings provide initial evidence that GenAI can serve as a productive collaborator in assessment development when embedded in a human-in-the-loop process, while underscoring the need for continued expert oversight and further validation research.
May et al. studied this question.