March 3, 2026 | Open Access

Alarm-Budgeted Event-Level Evaluation of ICS Anomaly Detection: Lessons from SWaT and WADI

Key Points

Under matched alert budgets, no anomaly detector consistently dominates the rankings.
On the SWaT benchmark, detector Miss rates ranged from 0.62 to 0.97 under matched alert budgets (B = 0.5).
Six unsupervised and weakly supervised detectors were evaluated on two water-sector benchmarks.
The findings caution against over-interpreting marginal performance differences, particularly in small-attack or intermediate-budget regimes.

Abstract

Industrial control systems (ICS) operate critical infrastructure, yet anomaly detectors for these environments are typically reported using unconstrained, threshold-free metrics. In deployment, operators have limited capacity to review alarms, so a detector that appears strong under such metrics can be unusable if it generates too many alarm events. This study examines budgeted, event-level evaluation of industrial anomaly detection. Each detector is calibrated on the nominal stream to satisfy a target alarm event rate (events/h) with a duty-cycle guardrail. It is then evaluated on an attacked stream, reporting the event-level missed-incident rate (Miss) and detection delay using attack-interval uncertainty. Six representative unsupervised and weakly supervised detectors are compared on two water-sector benchmarks: SWaT, a water-treatment testbed, and WADI, a water-distribution testbed. Under matched alert budgets, no detector dominated consistently: on SWaT, Miss rates ranged from 0.62 to 0.97 at B = 0.5, while on WADI some methods achieved near-zero Miss but incurred over 10× the intended alert workload under attack. Accordingly, rankings that appear clear under unconstrained metrics can be reversed once workload constraints are enforced. The study further provides a Pareto trade-off analysis, a stakeholder-weighted decision layer for selecting actionable operating points, and sensitivity checks for common alarm post-processing choices. These analyses caution against over-interpreting marginal performance differences, particularly in small-attack or intermediate-budget regimes.
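
The calibrate-then-evaluate protocol described in the abstract can be illustrated with a minimal Python sketch. It is not the study's implementation. Every name here (`alarm_events`, `calibrate_threshold`, `evaluate`, `max_duty_cycle`, `tolerance`) is hypothetical, and three choices are assumptions: an alarm event is a run of consecutive samples whose score exceeds the threshold, the calibrated threshold is the lowest one that satisfies both constraints on nominal data, and attack-interval uncertainty is modeled as a fixed slack, in samples, around each labeled interval.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]  # (start, end) sample indices, end exclusive


def alarm_events(scores: Sequence[float], threshold: float) -> List[Interval]:
    """Group consecutive samples with score > threshold into alarm events."""
    events, start = [], None
    for i, s in enumerate(scores):
        if s > threshold:
            if start is None:
                start = i
        elif start is not None:
            events.append((start, i))
            start = None
    if start is not None:
        events.append((start, len(scores)))
    return events


def events_per_hour(scores, threshold, samples_per_hour):
    """Alarm workload of a stream, in events per hour."""
    hours = len(scores) / samples_per_hour
    return len(alarm_events(scores, threshold)) / hours


def calibrate_threshold(nominal_scores, budget_per_hour, samples_per_hour,
                        max_duty_cycle):
    """Lowest threshold meeting the budget and the guardrail on nominal data.

    Assumption: candidate thresholds are the observed nominal scores. The
    largest score always qualifies because it produces no alarms at all.
    """
    for thr in sorted(set(nominal_scores)):
        events = alarm_events(nominal_scores, thr)
        rate = len(events) / (len(nominal_scores) / samples_per_hour)
        # Duty cycle: fraction of nominal samples spent in the alarm state.
        duty = sum(end - start for start, end in events) / len(nominal_scores)
        if rate <= budget_per_hour and duty <= max_duty_cycle:
            return thr
    raise ValueError("nominal_scores must not be empty")


def evaluate(attack_scores, threshold, attacks: List[Interval], tolerance=0):
    """Return (miss_rate, delays) at the event level.

    An attack counts as detected if any alarm event overlaps its interval
    widened by `tolerance` samples on each side (assumed uncertainty model).
    Delay is measured in samples from the labeled attack start.
    """
    events = alarm_events(attack_scores, threshold)
    missed, delays = 0, []
    for a_start, a_end in attacks:
        lo, hi = a_start - tolerance, a_end + tolerance
        first = None
        for e_start, e_end in events:  # events are ordered by start index
            if e_end > lo and e_start < hi:
                first = max(e_start, lo)
                break
        if first is None:
            missed += 1
        else:
            delays.append(max(0, first - a_start))
    return missed / len(attacks), delays


# Toy data: 10 samples per hour, 2 hours of nominal scores with three spikes.
nominal = [0.1] * 20
nominal[3], nominal[10], nominal[15] = 0.9, 0.8, 0.7
thr = calibrate_threshold(nominal, budget_per_hour=0.5, samples_per_hour=10,
                          max_duty_cycle=0.2)

attacked = [0.1] * 30
attacked[7], attacked[27] = 0.95, 0.95
attacks = [(5, 10), (20, 25)]
miss_strict, delays_strict = evaluate(attacked, thr, attacks, tolerance=0)
miss_slack, delays_slack = evaluate(attacked, thr, attacks, tolerance=3)
attacked_workload = events_per_hour(attacked, thr, samples_per_hour=10)
```

In this sketch the duty-cycle guardrail matters because event counts are not monotone in the threshold: a very low threshold can merge nearly the whole stream into one long alarm, which satisfies an events/h budget while leaving the detector in a permanent alarm state. Comparing `attacked_workload` with the calibrated budget is one way to surface the kind of workload overrun under attack that the abstract reports for WADI.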
