Security Operations Center (SOC) platforms are racing toward an "AI-native" model in which a Large Language Model (LLM) reads alerts and logs, decides, and increasingly acts: it blocks IP addresses, isolates hosts, and pushes firewall rules to endpoint agents without a human in the loop. Existing work on prompt injection against SOCs stops at the decision/output layer: the attacks it studies make the LLM mislabel, under-summarize, or misadvise. We argue that the higher-consequence and largely unstudied threat lies one layer deeper, at enforcement: attacker-controlled log content can induce an autonomous SOC to execute a harmful privileged action. We name this attack class actuator hijacking and define a threat model in which the attacker has no SOC access whatsoever, only the ability to generate logged activity on a monitored endpoint, a capability inherent in carrying out any attack. We propose a five-category taxonomy (detection suppression, adversarial enforcement / self-DoS, blast-radius leverage, incident-integrity manipulation, and context bleed) and present, to our knowledge, the first end-to-end demonstration of actuator hijacking on a real deployed detection-and-response system, where the log → LLM → firewall loop is closed. As a constructive defense we introduce provenance-bound actuator gating: every argument of every privileged action must derive from a trusted, provenance-tagged parsed field, never from free-text reasoning contaminated by log content. On an automated adversarial corpus of eight scenarios across four attack categories, an undefended configuration is hijacked in 3/8 cases (including a self-DoS that blocks the network gateway), whereas the full defense reduces post-enforcement hijacks to 0/8. We report these results with the caveats of a single-model, single-deployment case study, and release our defense pattern while withholding weaponizable payload detail.
Xuan Dong Nguyen studied this question.
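The provenance-bound actuator gating rule stated in the abstract (every argument of a privileged action must come from a trusted, provenance-tagged parsed field) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Field` type, the `"parser"` and `"llm_text"` provenance labels, and the `gate_action` function are hypothetical names chosen for illustration, and the check shown is only one plausible reading of the rule.

```python
from dataclasses import dataclass

# Hypothetical provenance labels: "parser" marks values extracted by a
# structured log parser; anything else (e.g. "llm_text") is untrusted.
TRUSTED_SOURCES = {"parser"}


@dataclass(frozen=True)
class Field:
    value: str
    source: str  # provenance tag attached when the value was produced


def gate_action(args: dict[str, Field], parsed_fields: dict[str, Field]) -> bool:
    """Return True only if every argument is a trusted parsed field.

    An argument passes when (a) its provenance tag is trusted and
    (b) it equals one of the fields the parser extracted for this alert.
    Values that appear only in the model's free-text reasoning fail (a),
    and values carrying a forged tag fail (b).
    """
    allowed = set(parsed_fields.values())
    for arg in args.values():
        if arg.source not in TRUSTED_SOURCES:
            return False
        if arg not in allowed:
            return False
    return True


# Usage: the parser extracted one source address from the alert.
parsed = {"src_ip": Field("203.0.113.7", "parser")}

# Argument taken directly from the parsed field: permitted.
ok = gate_action({"ip": parsed["src_ip"]}, parsed)

# Argument that the model produced in free text (for example, an address
# named inside attacker-controlled log content): refused.
hijack = gate_action({"ip": Field("192.168.1.1", "llm_text")}, parsed)

# Argument with a trusted-looking tag but no matching parsed field: refused.
forged = gate_action({"ip": Field("192.168.1.1", "parser")}, parsed)
```

The design choice in this sketch is that the gate never inspects the model's reasoning; it only checks where each argument came from, so the decision does not depend on how persuasive the injected log text is.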