The construction industry remains one of the most hazardous sectors, with a high incidence of injuries and fatalities, making accurate accident prediction essential for improving safety performance. Although machine learning and deep learning approaches have been widely applied to construction accident prediction, most prior studies have focused on optimizing predictive accuracy within structured modeling pipelines under internal validation settings. In contrast, the application of Generative Artificial Intelligence (Generative AI) to accident prediction remains relatively underexplored, and systematic comparisons between Generative AI and Automated Machine Learning (AutoML), particularly under standardized and external validation conditions, are limited. To address this research gap, this study provides a structured comparative evaluation of AutoML and a fine-tuned Generative Pre-trained Transformer (GPT) model in terms of predictive performance, training efficiency, robustness under external validation, and operational usability. A dataset of construction accident cases obtained from Korea’s Construction Safety Management Integrated Information (CSI) was used. AutoML was employed to evaluate multiple machine learning classifiers, while a GPT-based model was fine-tuned to classify accident severity. Model performance was assessed using accuracy, precision, recall, and F1-score. The results indicate that AutoML achieved higher predictive accuracy (97.48%) under controlled training conditions, whereas the GPT-based model achieved 75.6%. However, AutoML required substantial preprocessing and optimization effort, whereas the GPT-based model offered greater deployment flexibility with minimal data preparation. External validation on newly observed, imbalanced data revealed that AutoML's performance degraded, whereas the GPT-based model's performance remained relatively stable.
These findings suggest that Generative AI may serve as a complementary and deployment-friendly alternative in construction accident prediction contexts where adaptability, external validation robustness, and usability are prioritized.
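As a minimal illustration of the four evaluation metrics named above, the following sketch computes accuracy and macro-averaged precision, recall, and F1-score for a severity classifier. The abstract does not state which averaging scheme was used, so macro averaging is an assumption here. The function name `classification_metrics`, the labels `"minor"` and `"severe"`, and the toy predictions are hypothetical and are not drawn from the CSI dataset.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro averaging is an assumption made for this sketch; the source
    does not say how per-class scores were aggregated.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()

    # Count per-class true positives, false positives, and false negatives.
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    precisions, recalls, f1s = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n_labels = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n_labels,
        "recall": sum(recalls) / n_labels,
        "f1": sum(f1s) / n_labels,
    }


# Hypothetical, deliberately imbalanced toy labels (not real accident records).
y_true = ["minor", "minor", "minor", "severe"]
y_pred = ["minor", "minor", "severe", "severe"]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

On imbalanced data such as the external validation set described above, accuracy alone can look favorable while minority-class recall is poor, which is one reason to report all four metrics together.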
Seo et al. studied this question.