What question did this study set out to answer?

This research aims to assess how effectively large language models can classify texts related to same-level falls in the workplace.

April 19, 2026 · Open Access

Automated Classification of Occupational Accident Texts Using Large Language Models: A Pilot Study

Key points

Analyzed 2619 same-level-fall-related injury cases.
Compared four large language models for text classification.
Used expert manual classification as the reference standard.
Evaluated accuracy and Cohen’s kappa for performance assessment.
The o4-mini model achieved 72.8% accuracy in classifying the “causal agent” category.
Top models reached accuracies of 82–92% in other classification tasks.
Cohen’s kappa coefficients were greater than 0.7, indicating substantial agreement with expert classification.
Findings highlight the potential of LLMs for effective analysis of occupational accident data.

Abstract

Same-level falls are the most frequent occupational accidents, yet traditional manual analysis of accident reports is labor-intensive and limits large-scale prevention strategies. In this pilot study, we aimed to evaluate the accuracy of using large language models (LLMs) to automate the classification of occupational accident text data without task-specific pretraining. We analyzed data from 2619 same-level-fall-related injury cases, using expert manual classification as the reference standard. Four models—GPT-4o mini, GPT-4.1 mini, GPT-4.1, and o4-mini—were compared using accuracy and Cohen’s kappa. The o4-mini model demonstrated the highest performance, showing statistical superiority in the complex “causal agent” category with 72.8% accuracy. For other classification tasks, the top models achieved accuracies of 82–92%, with Cohen’s kappa coefficients > 0.7, indicating substantial agreement with expert judgments. These findings suggest that LLMs can classify occupational accident text with substantial agreement with the expert-derived reference standard in this dataset. This automated approach enables efficient, high-frequency analysis of large datasets, offering a promising tool for large-scale occupational accident surveillance and screening.
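The two metrics named in the abstract can be illustrated with a minimal sketch. It assumes one expert label and one model label per case; the category names and the eight example cases are hypothetical and are not taken from the study's data, and the study's own evaluation code may differ.

```python
from collections import Counter


def accuracy(reference, predicted):
    """Fraction of cases where the model label equals the expert label."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(r == p for r, p in zip(reference, predicted))
    return matches / len(reference)


def cohens_kappa(reference, predicted):
    """Chance-corrected agreement: (p_o - p_e) / (1 - p_e)."""
    n = len(reference)
    p_o = accuracy(reference, predicted)  # observed agreement
    ref_counts = Counter(reference)
    pred_counts = Counter(predicted)
    # Expected agreement if both raters labeled independently
    # with their own marginal label frequencies.
    p_e = sum(ref_counts[c] * pred_counts[c] for c in ref_counts) / (n * n)
    if p_e == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (p_o - p_e) / (1 - p_e)


# Hypothetical labels for eight cases (illustration only).
expert = ["floor", "floor", "stairs", "floor", "object", "stairs", "object", "floor"]
model = ["floor", "stairs", "stairs", "floor", "object", "stairs", "floor", "floor"]

print(f"accuracy = {accuracy(expert, model):.3f}")      # accuracy = 0.750
print(f"kappa    = {cohens_kappa(expert, model):.3f}")  # kappa    = 0.600
```

In this toy example the model agrees with the expert on 6 of 8 cases (accuracy 0.75), but 0.375 of that agreement is expected by chance, so kappa is lower at 0.6. This gap is why the abstract reports kappa alongside accuracy.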
