What type of study is this?

This is a quantitative study.

October 22, 2025

An Efficient Insider Threat Detection Framework Using Bayesian‐Optimized XGBoost

Key Points

The proposed framework achieves 99.0% accuracy on the r4.2 dataset, indicating high detection capability.
Bayesian optimization tunes the XGBoost model, while SMOTEENN resampling addresses class imbalance in the dataset.
Feature engineering captures behavioral and temporal patterns, leading to improved recall and precision metrics.
Strong performance across multiple datasets supports the robustness and reliability of the insider threat detection method.

Abstract

Insider threats remain one of the most challenging issues in cybersecurity, as malicious activities are carried out by legitimate users and are difficult to distinguish from normal behavior. The rarity of insider events further leads to highly imbalanced datasets, reducing the effectiveness of conventional rule‐based, machine learning, and deep learning approaches, which often suffer from low precision and high false positive rates. This work proposes an insider threat detection framework based on Extreme Gradient Boosting (XGBoost) optimized with Bayesian Optimization (BO). Class imbalance is addressed using the Synthetic Minority Oversampling Technique with Edited Nearest Neighbors (SMOTEENN). The framework is further strengthened through feature engineering to capture behavioral and temporal patterns of user activity. The proposed methodology is assessed on Carnegie Mellon University's (CMU) CERT r4.2 synthetic dataset, where single‐day sequential activity logs are processed to obtain numerical feature vectors. The model is trained on r4.2, evaluated on r4.2, and then tested for generalization on the newer r5.2 and r6.2 datasets. Performance is measured under both balanced and imbalanced configurations across different data ratios. The results consistently demonstrate that feature engineering significantly improves detection capability. In particular, when evaluated on r4.2, the model achieves 99.0% accuracy, 96.2% precision, 96.9% recall, 96.6% F1‐score, and a ROC‐AUC of 99.7%. Comparable robustness is observed on r5.2 and r6.2, confirming the reliability and transferability of the approach across datasets. These findings establish the clear advantage of the proposed framework over current baseline models.
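
The abstract reports accuracy alongside precision, recall, and F1 because accuracy alone can mislead on highly imbalanced data. The following is a minimal sketch of how these metrics relate, not the authors' evaluation code. The function names and the confusion-matrix counts are hypothetical and chosen only for illustration; the only figures taken from the abstract are the reported precision (96.2%) and recall (96.9%).

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all samples classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def precision(tp, fp):
    """Fraction of flagged samples that are truly malicious.
    Returns 0.0 when nothing is flagged (a convention assumed here)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Fraction of truly malicious samples that are flagged."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(p, r):
    """Harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical imbalanced case: 1000 user-days, 10 of them malicious,
# and a classifier that labels every sample as benign.
all_benign = dict(tp=0, tn=990, fp=0, fn=10)
acc_naive = accuracy(**all_benign)
rec_naive = recall(all_benign["tp"], all_benign["fn"])

# F1 recomputed from the rounded precision and recall in the abstract.
f1_reported_inputs = f1_score(0.962, 0.969)

if __name__ == "__main__":
    print(f"all-benign accuracy: {acc_naive:.3f}, recall: {rec_naive:.3f}")
    print(f"F1 from reported precision/recall: {f1_reported_inputs:.4f}")
```

In the hypothetical all-benign case the accuracy is high even though no malicious sample is detected, which is why precision and recall are the more informative figures here. Recomputing F1 from the rounded precision and recall gives roughly 96.5%, which agrees with the reported 96.6% to within the rounding of the inputs.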


Authors

Ambairam Muthu Sivakrishna

R. Mohan

Valaparla Rohini

Journals

Security and Privacy

Institutions

National Institute of Technology Tiruchirappalli
