ABSTRACT Advanced cyber threats such as zero‐day exploits and sophisticated evasion techniques challenge Network Intrusion Detection Systems (NIDS). To address these threats, we propose a robust machine learning framework that integrates multi‐source data fusion, protocol‐aware preprocessing, and ensemble learning. Our study uses a comprehensive dataset of 12.7 million real‐world network flows (10.1M benign, 2.6M malicious) collected from enterprise environments. Our key innovation is a weighted voting ensemble that combines Logistic Regression, Decision Trees, and a 1D‐CNN; it achieves 99.8% detection accuracy while reducing false positives by 4.9% compared to the individual models. The system also incorporates a lightweight adversarial aligner to counter evasion techniques (e.g., IP fragmentation, MAC spoofing), recovering up to 95% of baseline recall. Notably, under extreme class imbalance (1:99), our framework maintains 80.1% recall with only 8.2 false positives per million packets, outperforming deep learning models such as LSTM and 1D‐CNN while using 100 times fewer parameters. These results demonstrate the framework's practicality for efficient, high‐throughput NIDS deployments in real‐world settings.
Shafin et al. studied this question.
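The weighted voting ensemble named in the abstract can be illustrated with a minimal sketch. The abstract does not say whether votes are combined as hard labels or as probabilities, nor how the weights are chosen, so the soft-voting rule, the weights, the 0.5 threshold, the function name `weighted_vote`, and the per-model scores below are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence, Tuple


def weighted_vote(
    probas: Sequence[Sequence[float]],
    weights: Sequence[float],
    threshold: float = 0.5,
) -> Tuple[List[float], List[int]]:
    """Fuse per-model malicious-class probabilities by weighted averaging.

    probas[m][i] is model m's probability that flow i is malicious.
    Returns the fused scores and the thresholded labels (1 = malicious).
    """
    if not probas or len(probas) != len(weights):
        raise ValueError("need one weight per model")
    n_flows = len(probas[0])
    if any(len(p) != n_flows for p in probas):
        raise ValueError("all models must score the same flows")
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")

    total = sum(weights)
    # Weighted mean of the model scores, computed flow by flow.
    fused = [
        sum(w * p[i] for w, p in zip(weights, probas)) / total
        for i in range(n_flows)
    ]
    labels = [int(score >= threshold) for score in fused]
    return fused, labels


# Hypothetical scores for three flows from the three base models.
lr_scores = [0.10, 0.60, 0.90]   # Logistic Regression (assumed values)
dt_scores = [0.00, 1.00, 0.00]   # Decision Tree (assumed values)
cnn_scores = [0.20, 0.30, 0.95]  # 1D-CNN (assumed values)

# Assumed weights: the 1D-CNN counts twice as much as each other model.
fused, labels = weighted_vote(
    [lr_scores, dt_scores, cnn_scores], weights=[1.0, 1.0, 2.0]
)
print(labels)
```

With these assumed inputs the second flow is flagged even though the 1D-CNN alone scores it below the threshold, and the third flow is flagged even though the decision tree votes benign. This shows how weighting lets a stronger model dominate without silencing the others.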