What question did this study set out to answer?

The central aim is to improve the efficiency and accuracy of intrusion detection systems using provenance graphs.

March 2, 2026 | Open Access

Hybrid Time–Position Embedding for Provenance-Based Intrusion Detection

Key Points

Developed a provenance graph construction technique to generate meaningful vector representations from system logs.
Implemented a hybrid time-position embedding technique to capture causal relationships between security events.
Introduced an iterative refinement learning strategy tailored for system log data characteristics.
Demonstrated improved detection accuracy compared to existing methods.
Showed significant acceleration in convergence during the iterative training process.
Effectively captured long dwell times, a key feature of advanced persistent threat attacks.

Abstract

Provenance-based Intrusion Detection Systems (IDSs) model the causal relationships between security events through a provenance graph and learn contextual information to detect Advanced Persistent Threats (APTs) effectively. However, existing provenance graph representation methods fail to fully reflect the characteristics of security domain data and the semantic information embedded in system logs, resulting in limited learning efficiency and detection accuracy. This paper proposes a provenance representation method that effectively captures security context from system log data. The proposed method improves the performance of provenance-based IDSs by combining (1) a provenance graph construction technique that transforms meaningful string attributes—such as command lines, process names, and file paths—into vector representations to extract semantic information in the security context, (2) a hybrid time–position embedding technique for capturing causal relationships between events, and (3) an iterative refinement learning strategy tailored to the characteristics of system log data. Experimental results using the DARPA Transparent Computing Engagement 3 (E3) benchmark dataset for APT detection demonstrate that our method achieves improved accuracy compared to existing approaches while significantly accelerating convergence during iterative training. These results suggest that the proposed embedding technique can more effectively capture abnormal temporal patterns, such as the long dwell times characteristic of APT attacks.
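
The abstract does not spell out how the hybrid time–position embedding is computed, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes that each event's vector concatenates a standard sinusoidal encoding of its position in the sequence with a sinusoidal encoding of the log-scaled time gap since the previous event. The function names `sinusoidal` and `hybrid_time_position`, the `dim` parameter, and the sample timestamps are all hypothetical.

```python
import math


def sinusoidal(value, dim):
    """Encode a scalar as `dim` sine/cosine features (dim must be even)."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    features = []
    for i in range(0, dim, 2):
        freq = 1.0 / (10000 ** (i / dim))
        features.append(math.sin(value * freq))
        features.append(math.cos(value * freq))
    return features


def hybrid_time_position(timestamps, dim=8):
    """Assumed hybrid scheme: position encoding + time-gap encoding, concatenated.

    `timestamps` is a list of event times in seconds, in log order.
    Returns one vector of length 2 * dim per event.
    """
    embeddings = []
    previous = None
    for position, ts in enumerate(timestamps):
        # Gap since the previous event; the first event has no predecessor.
        gap = 0.0 if previous is None else max(ts - previous, 0.0)
        previous = ts
        position_part = sinusoidal(position, dim)
        # log1p compresses very long gaps (hours, days) into a usable range.
        time_part = sinusoidal(math.log1p(gap), dim)
        embeddings.append(position_part + time_part)
    return embeddings


# Hypothetical event times: three quick events, then one a day later.
events = [0.0, 1.5, 2.0, 86400.0]
vectors = hybrid_time_position(events, dim=8)
```

In this sketch, two events that are adjacent in the log but separated by a long pause get neighbouring position features and clearly different time features, which is one way a model could be exposed to long dwell times.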


Cite This Study

Gong et al. studied this question.

synapsesocial.com/papers/69a52dd3f1e85e5c73bf0f19 https://doi.org/10.3390/electronics15051004
