Industrial control systems (ICS) operate critical infrastructure, yet anomaly detectors for these environments are typically reported using unconstrained, threshold-free metrics. In deployment, operators have limited capacity to triage alarms, so a detector that appears strong under such metrics can be unusable if it generates too many alarm events. This study examines budgeted, event-level evaluation of industrial anomaly detection. Each detector is calibrated on the nominal stream to satisfy a target alarm event rate (events/h), subject to a duty-cycle guardrail. The calibrated detector is then evaluated on an attacked stream, reporting the event-level missed-incident rate (Miss) and detection delay while accounting for attack-interval uncertainty. Six representative unsupervised and weakly supervised detectors are compared on two water-sector benchmarks: SWaT, a water-treatment testbed, and WADI, a water-distribution testbed. Under matched alert budgets, no detector dominated consistently: on SWaT, Miss ranged from 0.62 to 0.97 at a budget of B = 0.5, while on WADI some methods achieved near-zero Miss but incurred over 10× the intended alert workload under attack. Accordingly, rankings that appear clear under unconstrained metrics can be reversed once workload constraints are enforced. The study further provides a Pareto trade-off analysis, a stakeholder-weighted decision layer for selecting actionable operating points, and sensitivity checks for common alarm post-processing choices. These analyses caution against over-interpreting marginal performance differences, particularly in small-attack or intermediate-budget regimes.
Heydari et al. (Thu,) studied this question.