This article sits at the intersection of machine learning, cybersecurity, and nature-inspired computing. Its main objective is to take the eXtended Classifier System (XCS), an established adaptive Reinforcement Learning (RL) algorithm, and replace its original Genetic Algorithm (GA) component with the Bacterial Foraging Optimization Algorithm (BFOA). This modification turns XCS into a multi-criteria optimization system (BFOA-XCS): classifier fitness is evaluated across accuracy, stability, and variance reduction, and the three criteria are combined through weighted-sum scalarization. In this way, the method leverages BFOA’s chemotactic search and population dynamics. The proposed BFOA-XCS integration was validated in two experimental phases. First, evaluations on 19 benchmark machine learning datasets showed that Improved BFOA (IBFOA)-XCS achieves the best Friedman ranking among all XCS variants (marginally significant at α = 0.10, supported by medium-to-large effect sizes), with a notable 15.2% variance reduction relative to standard GA-XCS. Second, in a dynamic cybersecurity simulation environment with six attack scenarios, all XCS variants significantly outperformed three of the five deep RL baselines (Deep Q-Network (DQN), Q-Learning, and Policy Gradient (REINFORCE)) with large statistical effect sizes. Proximal Policy Optimization (PPO) and Soft Actor–Critic (SAC) achieved higher overall rewards, but at substantially greater computational cost: PPO required 5.3× and SAC 26.1× the XCS compute time per run (2 min 8 s and 10 min 26 s, respectively, vs. 24 s for XCS). The results indicate that rule-based XCS with BFOA optimization offers a compelling alternative to neural approaches for cybersecurity defense, combining competitive performance with interpretable policies and substantially lower computational requirements.
Novak et al. studied this question.
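The weighted-sum scalarization mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `scalarized_fitness`, the default weights, and the assumption that each criterion is already normalized to [0, 1] (higher is better) are all hypothetical choices made for illustration.

```python
def scalarized_fitness(accuracy, stability, variance_reduction,
                       weights=(0.5, 0.3, 0.2)):
    """Collapse three classifier criteria into one score by a weighted sum.

    Assumptions (not from the source): every criterion lies in [0, 1] with
    higher meaning better, and the default weights are placeholders.
    """
    criteria = (accuracy, stability, variance_reduction)
    if len(weights) != len(criteria):
        raise ValueError("expected one weight per criterion")
    if any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative and not all zero")

    # Normalize by the weight total so the score stays in [0, 1]
    # even when the weights do not sum to 1.
    total = sum(weights)
    return sum(w * c for w, c in zip(weights, criteria)) / total


# Hypothetical classifiers: (accuracy, stability, variance_reduction).
candidates = {
    "rule_a": (0.90, 0.40, 0.30),
    "rule_b": (0.80, 0.70, 0.60),
}

# Rank the candidates by their scalarized fitness, best first.
ranking = sorted(candidates,
                 key=lambda name: scalarized_fitness(*candidates[name]),
                 reverse=True)
print(ranking)
```

A weighted sum keeps the optimizer single-objective, so a search procedure such as BFOA can compare candidates with one number. The trade-off is that the weights fix the relative priority of the criteria in advance.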