Traffic crashes involving pedestrians tend to result in the most casualties (minor, serious, or fatal). Therefore, accurately determining the level of responsibility in a pedestrian crash is crucial, as liability can lead to civil, administrative, or criminal consequences. Despite its importance, the scientific literature contains very few studies focused specifically on the attribution of responsibility in traffic accidents, and even fewer on pedestrian collisions. This study evaluated different supervised classification models using Machine Learning (ML) techniques to classify the levels of responsibility of both drivers and pedestrians using real crash data. In this evaluation, 14 binary variables were considered, based on four subsystems: human, technological, structural, and normative. The goal is to help judicial and police authorities attribute responsibility more efficiently and objectively, which involves analyzing the most influential variables after the classification process. Policymakers can then use these assessments to develop new strategies for improving road safety. The dataset consists of 510 pedestrian crashes extracted from reports by the Local Police of Badajoz (LPB) in Spain and judicial decisions of the Spanish Judiciary (SJ). Of the models analyzed, the Decision Tree (DT), Naïve Bayes (NB), and Support Vector Machine (SVM) models produced the best initial performance. These three models were then compared, and the metrics showed that the DT model is the best option. Furthermore, the feature importance analysis of the 14 variables revealed that possessing a driver's license is the most influential factor in determining responsibility (47.26%). The next most influential factors were the pedestrian's location (15.35%); driver under the influence of alcohol/drugs (7.24%); and distracted driving, e.g., using a mobile phone (7.04%).
Moreno-Sanfélix et al. studied this question.
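To make the feature importance percentages above more concrete, the following is a minimal sketch of one common way such figures can arise: the Gini impurity decrease produced by splitting on each binary variable, normalized so that the values sum to 100. It is an illustration under stated assumptions, not the study's method. The abstract does not say how importance was computed, the data here are synthetic placeholders rather than the LPB/SJ records, the responsibility label is simplified to two classes, and all function names are hypothetical.

```python
import random

N_CRASHES = 510      # dataset size reported in the abstract
N_VARIABLES = 14     # number of binary variables reported in the abstract


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def impurity_decrease(rows, labels, j):
    """Weighted Gini decrease obtained by splitting once on binary variable j."""
    left = [y for x, y in zip(rows, labels) if x[j] == 0]
    right = [y for x, y in zip(rows, labels) if x[j] == 1]
    n = len(labels)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted


def importance_percentages(rows, labels):
    """Normalize the per-variable impurity decreases so they sum to 100."""
    raw = [impurity_decrease(rows, labels, j) for j in range(len(rows[0]))]
    total = sum(raw)
    return [100.0 * r / total for r in raw]


def make_synthetic_data(seed=0):
    """Placeholder data (assumption): variable 0 drives the label, with 10% noise."""
    rng = random.Random(seed)
    rows = [[rng.randint(0, 1) for _ in range(N_VARIABLES)]
            for _ in range(N_CRASHES)]
    labels = [x[0] if rng.random() > 0.1 else 1 - x[0] for x in rows]
    return rows, labels


rows, labels = make_synthetic_data()
percentages = importance_percentages(rows, labels)

# Rank variables from most to least influential and show the top four.
ranking = sorted(range(N_VARIABLES), key=lambda j: percentages[j], reverse=True)
for j in ranking[:4]:
    print(f"variable {j}: {percentages[j]:.2f}%")
```

A fitted decision tree aggregates such decreases over every split in the tree rather than a single root-level split, so this sketch only conveys the idea behind the ranking and does not reproduce the reported values.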