Evaluating intrusion detection methods at the level of individual MITRE Adversarial Tactics, Techniques, and Common Knowledge (ATT&CK) for Industrial Control Systems techniques requires Operational Technology traffic in which each attack sequence carries its MITRE technique identifier as ground truth. Publicly available Industrial Control System datasets either provide coarse attack-versus-benign labels (SWaT, WADI, CIC-APT-IIoT) or require ex-post technique reconstruction from CALDERA operation logs, and therefore do not support per-technique benchmarking. We describe one primary contribution and two supporting contributions, demonstrated on a single use case that combines Modbus, a Raspberry Pi programmable logic controller, CALDERA, and a convolutional bidirectional Long Short-Term Memory autoencoder (CNN-BiLSTM-AE). The primary contribution is an in-orchestrator labelling methodology for capturing Industrial Control System attacks with per-technique labels. Its single load-bearing property is that the campaign orchestrator owns the label primitive and writes each per-sequence technique identifier into the capture artefact at injection time, eliminating ex-post log-to-packet alignment. The first supporting contribution is a protocol-aware detection pipeline. Its load-bearing architectural choice is a priority-ordered protocol router that dispatches each labelled flow to a per-protocol detector plug-in (protocol-aware features here, with generic-flow features admissible as an alternative plug-in policy on the same router). The second supporting contribution is a suite of four reproducible CALDERA chains (three Information-Technology-to-Operational-Technology kill chains plus one enterprise-side control) that exercise the labelling methodology end-to-end and the detection pipeline along complementary detection paths.
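The two architectural ideas above (labels written at injection time, and a priority-ordered router in front of per-protocol detectors) can be illustrated with a minimal sketch. Every name here (`LabelledSequence`, `inject`, `ROUTES`, `route`) is hypothetical and not taken from the described system; the only external fact assumed is that Modbus/TCP conventionally uses port 502.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class LabelledSequence:
    """One captured sequence plus the label written at injection time."""
    payload: bytes
    dst_port: int
    technique_id: Optional[str]  # e.g. "T0831"; None for benign traffic


def inject(payload: bytes, dst_port: int, technique_id: str) -> LabelledSequence:
    """Hypothetical orchestrator hook: the technique identifier is attached
    when the attack sequence is emitted, so no log-to-packet alignment is
    needed afterwards."""
    return LabelledSequence(payload, dst_port, technique_id)


# A route is (priority, detector name, predicate); lower values are tried first.
Route = Tuple[int, str, Callable[[LabelledSequence], bool]]

ROUTES: List[Route] = [
    (0, "modbus", lambda s: s.dst_port == 502),  # protocol-aware plug-in
    (99, "generic-flow", lambda s: True),        # catch-all fallback (assumed)
]


def route(seq: LabelledSequence, routes: List[Route] = ROUTES) -> str:
    """Return the name of the first matching detector in priority order."""
    for _, name, matches in sorted(routes, key=lambda r: r[0]):
        if matches(seq):
            return name
    raise LookupError("no detector registered for this sequence")
```

In this sketch the label travels with the sequence from the moment of injection, and adding a detector for another fieldbus protocol amounts to registering one more route with a suitable priority.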
All three contributions are platform-independent: any ATT&CK-aligned emulator and any fieldbus protocol can host the labelling methodology, and any detector trained on an admissible feature space can plug into the router. The dataset contains 40,000 benign and 9,997 attack Modbus sequences spanning four ATT&CK techniques (T0802 Automated Collection, T0831 Manipulation of Control, T0836 Modify Parameter, T0846 Remote System Discovery). On this dataset, the CNN-BiLSTM-AE reaches a 100% true-positive rate (TPR) at the 98th-percentile benign threshold across all four techniques and a 99.7% overall TPR at the stricter 99.5th-percentile threshold, with per-technique TPR between 96.1% (T0836 Modify Parameter) and 100% (T0802 Automated Collection, T0846 Remote System Discovery). Across the four CALDERA chains, the Modbus autoencoder produces 234 protocol-layer detections and the Security Information and Event Management (SIEM) rule set produces 30 alerts, with per-chain tactic coverage between 0.714 and 0.786 and CALDERA-ability success rates between 0.800 and 0.857.
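The per-technique evaluation described above can be sketched as follows: a detection threshold is set at a percentile of the benign reconstruction errors, and the TPR is then computed separately for each technique identifier. This is a sketch under stated assumptions, not the evaluation code behind the reported figures: the nearest-rank percentile definition, the strict `>` comparison, and the function names are all illustrative choices.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def percentile_threshold(benign_errors: List[float], q: float) -> float:
    """Nearest-rank q-th percentile (0 < q <= 100) of benign errors."""
    if not benign_errors:
        raise ValueError("need at least one benign reconstruction error")
    ordered = sorted(benign_errors)
    rank = max(1, math.ceil(q * len(ordered) / 100.0))
    return ordered[rank - 1]


def per_technique_tpr(
    attacks: Iterable[Tuple[str, float]], threshold: float
) -> Dict[str, float]:
    """Fraction of attack sequences per technique whose reconstruction
    error exceeds the threshold (assumed detection rule)."""
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for technique_id, error in attacks:
        totals[technique_id] += 1
        if error > threshold:
            hits[technique_id] += 1
    return {t: hits[t] / totals[t] for t in totals}
```

Raising the percentile from 98 to 99.5 raises the threshold, which lowers the benign false-positive rate and can lower the TPR for techniques whose reconstruction errors sit close to benign behaviour.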