March 11, 2024 | Open Access

Evaluation of a Generative Language Model Tool for Writing Examination Questions

Abstract

Objective: To describe an evaluation of a generative language model tool for writing examination questions for a new elective course on the interpretation of common clinical laboratory results, being developed for students in a Bachelor of Science in Pharmaceutical Sciences program.

Methods: One hundred multiple choice questions were generated using a publicly available large language model for a course dealing with common laboratory values. Two independent evaluators with extensive training and experience in writing multiple choice questions evaluated each question for appropriate formatting, clarity, correctness, relevancy, and difficulty. For each question, each reviewer assigned a final dichotomous judgement: usable as written or not usable as written.

Results: A generative language model (ChatGPT 3.5) could generate multiple choice questions for assessing common laboratory value information, but only about half of the questions (50% and 57% for the two evaluators, P=0.321) were deemed usable without modification. The evaluators' comments were in general agreement in most cases (62% of comments). The most common reason given for lack of usability was that a question had more than one correct answer (n=27).

Conclusion: The generally positive findings of this study suggest that the use of a generative language model tool for developing examination questions deserves further investigation.
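The abstract does not say which statistical test produced P=0.321. As a rough check, the sketch below assumes a pooled two-proportion z-test on the counts implied by the abstract (50 and 57 usable questions out of 100 for the two evaluators). The function name and the choice of test are assumptions for illustration, not the authors' stated method. Because both evaluators rated the same 100 questions, a paired test such as McNemar's may be more appropriate, but it requires the per-question cross-tabulation, which the abstract does not report.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Pooled two-sided z-test for a difference between two proportions.

    Assumption: the two samples are treated as independent, which is a
    simplification here because both evaluators rated the same questions.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the null hypothesis of no difference.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Counts implied by the abstract: 50% and 57% of 100 questions rated usable.
z, p_value = two_proportion_z_test(50, 100, 57, 100)
print(f"z = {z:.3f}, two-sided p = {p_value:.3f}")
```

Under these assumptions the p-value comes out close to the reported 0.321, which is consistent with the abstract's conclusion that the two evaluators' usability rates did not differ significantly.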
