What question did this study set out to answer?

December 20, 2025

Comparative Evaluation of Ensemble and Tree-Based Machine Learning Algorithms for Network Intrusion Detection

Key Points

To compare the performance of ensemble and tree-based machine learning algorithms for network intrusion detection.
Applied four ML algorithms: Random Forest, Decision Tree, XGBoost, and LightGBM.
Utilized the CIC-IDS-2017 benchmark dataset and implemented a preprocessing framework.
Evaluated model performance using classification metrics, focusing on macro-averaged F1-score for class imbalance.
XGBoost achieved the highest overall accuracy of 99.89% and a macro-averaged F1-score of 0.8903.
Random Forest, Decision Tree, and LightGBM also exceeded 99.8% overall accuracy.
Substantial performance differences were found among the algorithms, especially in detecting minority attack classes.

Abstract

The increasing sophistication and scale of malicious network activities demand a fundamental shift from traditional signature-based intrusion detection systems toward adaptive, data-driven security architectures. Machine learning (ML) provides an effective paradigm for addressing this challenge by identifying intricate and non-linear patterns associated with cyber threats within complex, high-dimensional network data. This study presents a comprehensive comparative analysis of four widely used ensemble and tree-based ML algorithms, namely Random Forest (RF), Decision Tree (DT), XGBoost, and LightGBM, applied to the multi-class classification of contemporary network intrusions. Using the benchmark CIC-IDS-2017 dataset, a meticulous preprocessing framework was implemented to ensure data integrity, reproducibility, and methodological rigor. Model performance was evaluated through standard classification metrics, with macro-averaged F1-score prioritized to provide an equitable assessment across highly imbalanced class distributions. Experimental findings reveal substantial differences in performance among the examined algorithms. Although RF, DT, and LightGBM achieved overall accuracy levels exceeding 99.8%, XGBoost consistently demonstrated superior capability in identifying minority attack categories, achieving the highest overall accuracy of 99.89% and a macro-averaged F1-score of 0.8903. These results highlight XGBoost’s enhanced generalization capacity and resilience to class imbalance, confirming its suitability for deployment in real-time cybersecurity environments. In conclusion, this research establishes a consistent methodological benchmark for evaluating ensemble-based intrusion detection algorithms. It underscores the critical importance of balanced model assessment in the context of skewed network traffic distributions.
The findings suggest that XGBoost offers the most reliable and balanced performance profile for practical implementation within modern Security Operations Centers (SOCs), providing a strong foundation for adaptive and intelligent intrusion detection frameworks.
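The gap between the reported accuracy (99.89%) and macro-averaged F1-score (0.8903) follows from how the two metrics weight classes. The sketch below is an illustration only, not the authors' evaluation code: it implements accuracy and macro-averaged F1 in plain Python and applies them to a made-up, heavily imbalanced label set. The class names BENIGN and RARE_ATTACK, the class sizes, and the predictions are all assumptions chosen for the example and are not taken from CIC-IDS-2017.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1}, treating each label one-vs-rest."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted label gains a false positive
            fn[t] += 1  # true label gains a false negative
    scores = {}
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: 990 benign flows and 10 flows of a rare attack class.
y_true = ["BENIGN"] * 990 + ["RARE_ATTACK"] * 10
# Hypothetical classifier: every benign flow correct, only 2 of 10 attacks caught.
y_pred = ["BENIGN"] * 990 + ["RARE_ATTACK"] * 2 + ["BENIGN"] * 8

if __name__ == "__main__":
    print(f"accuracy = {accuracy(y_true, y_pred):.4f}")
    print(f"macro F1 = {macro_f1(y_true, y_pred):.4f}")
    print(per_class_f1(y_true, y_pred))
```

In this toy case the classifier misses 8 of 10 attack flows yet still reaches 99.2% accuracy, while the macro-averaged F1 falls to roughly 0.66 because the rare class contributes as much to the average as the majority class. This is the sense in which the study treats macro-averaged F1 as the fairer measure under class imbalance.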

Cite This Study

Özçelebi et al. studied this question.

synapsesocial.com/papers/6945e9325151ab1219e4d6ce https://doi.org/10.30564/jeis.v7i2.12299
