In today’s digital era, cyberattacks pose a critical threat to networks of all scales, from local systems to global infrastructures. Intrusion detection systems (IDSs) are essential for identifying and mitigating such threats. However, existing machine learning-based IDSs often suffer from low detection accuracy, heavy reliance on manual feature extraction, and limited coverage of attack categories. To address these limitations, we propose a modular, deployment-ready intrusion detection framework that integrates multiple heterogeneous datasets through a hybrid transformer–multilayer perceptron (Transformer–MLP) architecture. The system employs three parallel Transformer–MLP models, each specialized for a distinct dataset, whose probabilistic outputs are fused using a weighted decision-level strategy. Within each model, the Transformer module captures complex contextual dependencies, while the MLP performs the final classification. Unlike traditional feature-level fusion, decision-level fusion keeps the modules independent, so new components can be added without retraining the whole system. The framework identifies twenty-one traffic categories (one benign and twenty attack classes), derived from a unified mapping across the source datasets to ensure a consistent cross-dataset taxonomy. Class imbalance is mitigated via adaptive synthetic sampling (ADASYN), the synthetic minority over-sampling technique (SMOTE), edited nearest neighbor (ENN), and class weight adjustments. By combining contextual representation learning with ensemble-based probabilistic fusion, the framework achieves high detection accuracy and practical applicability in real-world network environments.
Empirical evaluation demonstrates the framework’s effectiveness. On CICIDS2017, NSL-KDD, and NF-BoT-IoT-v2, respectively, it achieves 99.98%, 99.19%, and 99.98% for binary classification; 99.56%, 99.55%, and 97.75% for two-stage multi-class classification; and 99.73%, 99.07%, and 98.23% for one-phase multi-class classification. Moreover, the framework supports real-time deployment with 4.8–6.9 ms latency, 9,800–14,200 fps throughput, and 412–458 MB memory usage. These results surpass those of existing multi-dataset IDS approaches, highlighting the architectural effectiveness, robustness, and practical applicability of the proposed framework.
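The weighted decision-level fusion described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`to_unified`, `fuse`), the fusion weights, the index maps, and the toy probabilities are all hypothetical, and the abstract does not specify how each model's local classes are projected onto the unified 21-class taxonomy. The sketch assumes each model emits per-sample probabilities over its own label set, scatters them into the unified label space, and then takes a weighted average.

```python
import numpy as np

N_UNIFIED = 21  # one benign + twenty attack classes, as stated in the abstract


def to_unified(probs, index_map, n_unified=N_UNIFIED):
    """Scatter local class probabilities into the unified label space.

    index_map[i] is the unified class index of local class i (hypothetical mapping).
    """
    probs = np.asarray(probs, dtype=float)
    unified = np.zeros((probs.shape[0], n_unified))
    for local_idx, unified_idx in enumerate(index_map):
        # Accumulate, so several local classes may share one unified class.
        unified[:, unified_idx] += probs[:, local_idx]
    return unified


def fuse(prob_list, weights):
    """Weighted average of per-model probabilities (decision-level fusion)."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()               # normalize the fusion weights
    stacked = np.stack(prob_list)                   # (n_models, n_samples, n_unified)
    fused = np.tensordot(weights, stacked, axes=1)  # (n_samples, n_unified)
    return fused / fused.sum(axis=1, keepdims=True)


# Toy outputs for two flows from three models with different local label sets.
p_a = to_unified([[0.9, 0.1], [0.1, 0.9]], index_map=[0, 1])
p_b = to_unified([[0.8, 0.1, 0.1], [0.3, 0.6, 0.1]], index_map=[0, 2, 3])
p_c = to_unified([[0.7, 0.3], [0.6, 0.4]], index_map=[0, 4])

fused = fuse([p_a, p_b, p_c], weights=[0.4, 0.3, 0.3])  # assumed weights
predicted = fused.argmax(axis=1)  # unified class index per flow
```

Because fusion operates only on output probabilities, adding a fourth model in this sketch means appending one more array and one more weight; none of the existing models needs retraining, which is the modularity property the abstract attributes to decision-level fusion.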
Kamal et al. studied this question.