An intrusion detection system (IDS) that uses a machine learning (ML) algorithm distinguishes attack flows from normal ones using supervised, semi-supervised, or unsupervised techniques. Supervised ML (SML) IDS achieves the best detection rate when historical data are available. Therefore, various SML-IDS techniques have been proposed, combining classification algorithms with preprocessing and normalization steps. This research has two aims: (1) to review the components of the SML-IDS and (2) to evaluate the alternative techniques using a multi-criteria decision-making approach with reference to positive and negative ideal alternatives. The review focuses on the algorithms, datasets, and metrics used with the SML-IDS. The proposed evaluation framework uses the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) and accommodates the preferences of different evaluators. First, the algorithms and the datasets are gathered. Next, the evaluation criteria are identified and each is typed as a cost or a benefit criterion. The weights for these criteria are then initialized to reflect the various evaluator preferences. Finally, the alternative algorithms are evaluated, and they are ranked by their distances to the positive and negative ideal alternatives under the initialized weights. Three datasets are used in the evaluation process: KDD, NSL-KDD, and CICIDS2017. The results indicate that, within the experimental setup and the utilized datasets, tree-based methods (Random Tree, C4.5, and Random Forest) frequently achieved top rankings across multiple evaluator preferences. Naïve Bayes classifiers performed consistently worse across the experiments, likely reflecting their sensitivity to feature dependencies and high-dimensional distributions in the selected datasets.
Abu-Shareha et al. (Sun,) studied this question.
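The TOPSIS ranking step described in the abstract can be illustrated with a minimal sketch. It assumes the standard textbook formulation (vector normalization, weighted Euclidean distances to the positive and negative ideal alternatives), which may differ in detail from the framework evaluated here. The function name `topsis`, the alternative labels, the two criteria, and all numeric values are hypothetical placeholders, not results from this study.

```python
import math


def topsis(matrix, weights, benefit):
    """Score alternatives with a textbook TOPSIS (illustrative sketch only).

    matrix  : list of rows, one row per alternative, one column per criterion
    weights : one weight per criterion (rescaled here to sum to 1)
    benefit : one bool per criterion; True = benefit, False = cost
    Returns one closeness score in [0, 1] per alternative (higher is better).
    """
    n_crit = len(weights)
    total = sum(weights)
    w = [x / total for x in weights]

    # Vector-normalize each criterion column, then apply its weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0
             for j in range(n_crit)]
    v = [[w[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]

    # Positive ideal: best value per criterion; negative ideal: worst value.
    cols = [[row[j] for row in v] for j in range(n_crit)]
    pos = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    neg = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]

    scores = []
    for row in v:
        d_pos = math.dist(row, pos)  # distance to the positive ideal
        d_neg = math.dist(row, neg)  # distance to the negative ideal
        denom = d_pos + d_neg
        scores.append(d_neg / denom if denom else 0.0)
    return scores


# Hypothetical example: three classifiers, two criteria.
# Column 0 = detection accuracy (benefit), column 1 = training time (cost).
names = ["A", "B", "C"]            # placeholder labels
matrix = [[0.95, 12.0],
          [0.90, 3.0],
          [0.80, 1.0]]             # made-up values for illustration
weights = [0.7, 0.3]               # one evaluator's assumed preference
benefit = [True, False]

scores = topsis(matrix, weights, benefit)
ranking = sorted(zip(names, scores), key=lambda p: p[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Rerunning the same call with a different `weights` vector mimics a different evaluator preference, which is how a single decision matrix can yield several rankings.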