• Proposes AC-MADQN, a privacy-preserving federated RL framework for autonomous IoT intrusion response.
• Introduces dynamic state representation with temporal attention for improved adaptability to evolving attacks.
• Ensures strong privacy and security via differential privacy, homomorphic encryption, and Byzantine-resilient aggregation.
• Achieves 47.8% fewer false positives, 58.3% better threat mitigation, and 99.7% privacy preservation.
• Demonstrates scalability and robustness, maintaining sub-second response and stability with 40% of nodes compromised.

The exponential growth of Internet of Things (IoT) devices has intensified cybersecurity challenges that demand autonomous, adaptive, and privacy-preserving intrusion response mechanisms. Traditional centralized solutions suffer from scalability bottlenecks, high communication costs, and privacy violations, making them unsuitable for modern heterogeneous IoT networks. To overcome these limitations, this paper introduces a Privacy-Preserving Federated Reinforcement Learning (PP-FRL) framework named Adaptive Contextual Multi-Agent Deep Q-Network (AC-MADQN). The framework enables distributed IoT edge devices to collaboratively learn optimal security policies without sharing raw data. It combines hierarchical policy optimization with federated aggregation enhanced by quantum-resilient cryptography, differential privacy, and Byzantine fault-tolerant consensus. AC-MADQN incorporates four key innovations: (1) dynamic state representation learning using temporal attention, (2) multi-scale threat-intelligence fusion, (3) adaptive resource-aware policy optimization, and (4) cryptographically secure experience-replay sharing.
Comprehensive experiments on a 2,000-device heterogeneous IoT testbed across 15 attack scenarios demonstrate that AC-MADQN achieves a 47.8% reduction in false positives, a 58.3% improvement in threat-mitigation effectiveness, and 99.7% privacy preservation under ε = 0.01 differential privacy, while maintaining sub-second latency even with 40% Byzantine node compromise. The theoretical analysis establishes formal convergence guarantees and a logarithmic communication complexity of O(N log T), an improvement over existing federated RL methods. These results establish AC-MADQN as a scalable, resilient, and practical solution for autonomous IoT security deployments demanding both high performance and strict privacy protection.
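To make the aggregation idea concrete, the following is a minimal sketch of one common way to combine Byzantine resilience with differential-privacy-style noise in a federated round. It is not the AC-MADQN aggregation rule, which the abstract does not specify: the coordinate-wise median, the L2 clipping step, and the names `clip_update`, `robust_private_aggregate`, `max_norm`, and `noise_std` are all assumptions chosen for illustration, and the noise level is not calibrated to any particular ε.

```python
import random
import statistics
from typing import List, Optional, Sequence


def clip_update(update: Sequence[float], max_norm: float) -> List[float]:
    """Scale an update so that its L2 norm does not exceed max_norm."""
    norm = sum(x * x for x in update) ** 0.5
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def robust_private_aggregate(
    updates: Sequence[Sequence[float]],
    max_norm: float = 1.0,
    noise_std: float = 0.0,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Coordinate-wise median of clipped updates plus Gaussian noise.

    Hypothetical helper: the median stands in for a Byzantine-resilient
    rule, and noise_std is an illustrative, uncalibrated noise scale.
    """
    rng = rng or random.Random()
    clipped = [clip_update(u, max_norm) for u in updates]
    # zip(*clipped) groups the i-th coordinate of every client's update.
    medians = [statistics.median(coord) for coord in zip(*clipped)]
    return [m + rng.gauss(0.0, noise_std) for m in medians]


# Five clients, two of them (40%) sending adversarial updates.
honest = [[0.09, -0.21], [0.10, -0.20], [0.11, -0.19]]
byzantine = [[100.0, 100.0], [100.0, 100.0]]
aggregate = robust_private_aggregate(honest + byzantine, noise_std=0.0)
print(aggregate)
```

With noise disabled, the aggregate stays inside the range of the honest updates even though two of the five clients are adversarial, because a minority of outliers cannot move a coordinate-wise median beyond the honest values. Setting `noise_std` above zero perturbs each coordinate. Choosing that value to meet a target privacy budget requires a sensitivity analysis that this sketch does not attempt.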
Yaseen et al. (Thu,) studied this question.