In this paper, machine learning techniques and risk factor analyses are applied to construct a marine hazard potential map of the Taiwan Strait. The waters surrounding Taiwan carry dense maritime traffic, including commercial cargo transportation and fishing operations. Marine accidents caused by severe weather are frequently reported and lead to irreversible loss of life and property. To mitigate these risks, this study combines the XGBoost machine learning model with oceanic parameters and historical accident statistics to map the monthly distribution of maritime accident risk potential across the Taiwan Strait. Because historical accident data are limited, a TVAE (Tabular Variational Autoencoder) is used to generate synthetic maritime accident data. The quality of the synthetic data is evaluated by comparing the probability distributions of the original and synthetic datasets. The resulting risk potential maps indicate that risk levels are significantly higher during the winter and lower during the summer. Furthermore, the SHAP (SHapley Additive exPlanations) method is applied to analyze key risk factors; it identifies wave height as the primary driver, followed by meridional (north–south) wind speed and the primary spatial modes of wave height. These findings are validated using data from the National Ocean Database and Sharing System (NODASS), which also supports an explanation of the underlying physical mechanisms. Together, the XGBoost model and the TVAE generative technique yield monthly marine hazard potential distribution maps for the Taiwan Strait, and the research workflow developed here can be applied to many other marine problems.
Su et al. (Fri,) studied this question.
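The abstract states that synthetic data quality is judged by how closely the synthetic distributions match the original ones, but it does not name the similarity measure. As a minimal sketch only, the following assumes a two-sample Kolmogorov-Smirnov statistic (the largest gap between two empirical cumulative distributions) on a single feature. The function name `ks_statistic` and the wave-height samples are hypothetical illustrations, not the study's data or method.

```python
import random


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the maximum absolute difference between the empirical CDFs
    of samples a and b (0.0 = identical, 1.0 = fully separated).
    """
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both pointers past every value <= x so ties are handled.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


# Hypothetical wave-height samples in metres (illustrative values only).
random.seed(0)
original = [random.gauss(2.0, 0.5) for _ in range(500)]
synthetic_close = [random.gauss(2.0, 0.5) for _ in range(500)]    # same distribution
synthetic_shifted = [random.gauss(3.0, 0.5) for _ in range(500)]  # biased generator

ks_close = ks_statistic(original, synthetic_close)
ks_shifted = ks_statistic(original, synthetic_shifted)
print(f"KS vs. well-matched synthetic data: {ks_close:.3f}")
print(f"KS vs. shifted synthetic data:      {ks_shifted:.3f}")
```

A smaller statistic indicates that the synthetic feature follows the original distribution more closely. In practice such a check would be repeated per feature, and other measures (for example, divergence-based scores) could be substituted.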