Abstract
Insider threat incidents have grown more frequent in recent years, causing severe data breaches and substantial economic losses. Most existing insider threat detection methods rely primarily on single-modal features, such as system logs and registry data, and fail to fully exploit the rich semantic information embedded in users’ instant messaging and email content. To address this limitation, we propose FusionITD, a cross-modal insider threat perception enhancement framework based on the fusion of behavioral and semantic features. The framework combines users’ temporal behavioral characteristics, such as file operations and login device patterns, with semantic information derived from web browsing and email content. By modeling user behavior baselines along multiple dimensions, FusionITD detects anomalies more accurately when behavior deviates from the baseline. First, based on the temporal distribution of user behaviors, we segment and aggregate the behavior data by time window to form a user behavior graph. We propose WR-GNN, a model based on graph representation learning, to capture temporal behavioral features, and introduce the Focal MSE loss function to address the data imbalance caused by sparse abnormal behavior data. Second, we propose a semantic analysis algorithm based on retrieval-augmented generation. It uses cosine similarity to semantically match and rank behavioral content against historical behaviors, and extracts features such as emotion, intention, and focus to achieve fine-grained anomaly detection of user behavior. Finally, we design an adaptive weighting mechanism based on logistic regression that dynamically integrates the outputs of the two preceding components, improving generalization across different threat scenarios.
Experiments on the CERT datasets show that FusionITD outperforms other methods, achieving a 5% increase in AUC, a higher true positive rate (TPR), and a lower false positive rate (FPR).
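To make the last two stages of the pipeline concrete, the following is a minimal sketch of cosine-similarity ranking against historical behaviors, followed by a logistic-regression fusion of a behavioral score and a semantic score. It is not the FusionITD implementation. The embedding vectors, the definition of the semantic anomaly score (one minus the mean similarity to the nearest historical items), the plain gradient-descent training loop, and all function names are illustrative assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_history(query, history, top_k=3):
    """Return (index, similarity) pairs for the top_k most similar history vectors."""
    scored = [(i, cosine_similarity(query, h)) for i, h in enumerate(history)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def semantic_anomaly_score(query, history, top_k=3):
    """Assumed score: 1 - mean similarity to the top_k nearest historical items."""
    top = rank_history(query, history, top_k)
    if not top:
        return 1.0
    return 1.0 - sum(sim for _, sim in top) / len(top)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_fusion(samples, labels, lr=0.5, epochs=2000):
    """Fit logistic regression over (behavior_score, semantic_score) pairs
    with batch gradient descent. Returns the two weights and the bias."""
    w = [0.0, 0.0]
    b = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for (x1, x2), y in zip(samples, labels):
            err = sigmoid(w[0] * x1 + w[1] * x2 + b) - y
            grad_w[0] += err * x1
            grad_w[1] += err * x2
            grad_b += err
        w[0] -= lr * grad_w[0] / n
        w[1] -= lr * grad_w[1] / n
        b -= lr * grad_b / n
    return w, b


def fuse(w, b, behavior_score, semantic_score):
    """Fused threat probability from the two per-modality scores."""
    return sigmoid(w[0] * behavior_score + w[1] * semantic_score + b)


if __name__ == "__main__":
    # Toy embeddings standing in for historical email or web content (assumed data).
    history = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1], [0.7, 0.3, 0.0]]
    new_item = [0.0, 0.2, 0.9]
    sem = semantic_anomaly_score(new_item, history)

    # Toy labeled score pairs: 0 = benign, 1 = malicious (assumed data).
    samples = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.3),
               (0.9, 0.8), (0.8, 0.95), (0.7, 0.9)]
    labels = [0, 0, 0, 1, 1, 1]
    w, b = fit_fusion(samples, labels)
    print("semantic score:", sem)
    print("fused probability:", fuse(w, b, 0.85, sem))
```

The learned weights play the role of the adaptive weighting described above: they set how much each modality contributes to the final decision. In practice a library implementation of logistic regression would replace the hand-written training loop.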
Yuan et al. (Mon,) studied this question.