Accurate forecasts of tropical cyclone (TC) track and intensity with sufficient lead time are critical for disaster preparedness and risk mitigation. Traditional numerical weather prediction models, while fundamental to operational forecasting, often exhibit systematic errors due to limitations in observations, physical parameterizations, and model resolution. In recent years, machine learning (ML) and deep learning (DL) approaches have emerged as promising data-driven alternatives for improving TC forecasts. This study presents a comparative evaluation of six ML and DL models for forecasting TC track and intensity in the North Atlantic basin: Random Forest (RF), Extreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), Categorical Boosting (CatBoost), Artificial Neural Network (ANN), and Convolutional Neural Network (CNN). The models are trained on the National Hurricane Center’s (NHC) HURDAT2 best-track dataset for storms from 1990 to 2019 and evaluated on an independent test set from the 2020 season. Performance is compared across the six models and benchmarked against the 2020 mean Decay-SHIFOR5 intensity errors, CLIPER5 track errors, and NHC official forecast (OFCL) errors. Forecast skill is assessed using the mean absolute error (MAE) with 95% bootstrap confidence intervals and the coefficient of determination (R²) at lead times of 6, 12, 18, 24, 48, and 72 h. The results show that: (1) several ML and DL models achieve intensity forecast performance that is broadly comparable in magnitude to the 2020 mean OFCL benchmarks, with an average error reduction of 5–11% at the 24 h lead time; (2) among the ML models, XGBoost and CatBoost slightly outperform LightGBM and RF in accuracy, while LightGBM demonstrates the highest computational efficiency; and (3) among the DL models, CNNs outperform ANNs in predictive accuracy and in intensity forecasting efficiency, while ANNs exhibit lower computational cost for track forecasting.
Bootstrap confidence intervals indicate relatively low variability in model errors, supporting the statistical stability of the results within the 2020 season. However, these intervals reflect within-season variability only, so the results do not necessarily generalize to other years or climatological conditions. Overall, the findings demonstrate the potential of ML/DL-based approaches to complement existing operational forecast systems and enhance TC track and intensity forecasting in the North Atlantic basin.
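The evaluation metrics named above (MAE with a 95% bootstrap confidence interval, and R²) can be illustrated with a minimal sketch. It assumes a percentile bootstrap over forecast cases, which the text does not specify; the function names and the sample intensity values (in knots) are hypothetical and are not taken from the study.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error between observed and forecast values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def bootstrap_mae_ci(y_true, y_pred, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the MAE (assumed variant).

    Forecast cases are resampled with replacement n_boot times; the
    interval is the alpha/2 and 1 - alpha/2 quantiles of the resampled MAEs.
    """
    rng = np.random.default_rng(seed)
    abs_err = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    n = abs_err.size
    # Each row holds the indices of one bootstrap resample of the test cases.
    idx = rng.integers(0, n, size=(n_boot, n))
    boot_mae = abs_err[idx].mean(axis=1)
    lower, upper = np.quantile(boot_mae, [alpha / 2, 1 - alpha / 2])
    return float(lower), float(upper)


# Hypothetical 24 h intensity forecasts (kt), for illustration only.
observed = [30, 45, 60, 80, 100]
forecast = [35, 40, 65, 75, 110]

point_mae = mae(observed, forecast)
ci_low, ci_high = bootstrap_mae_ci(observed, forecast)
fit_r2 = r2(observed, forecast)
print(f"MAE = {point_mae:.1f} kt, 95% CI = [{ci_low:.1f}, {ci_high:.1f}], R2 = {fit_r2:.3f}")
```

Resampling individual forecast cases treats them as independent. Forecasts issued for the same storm are typically correlated, so resampling whole storms would be a more conservative alternative under the same sketch.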
Ogu et al. (Mon,) studied this question.