What question did this study set out to answer?

The aim is to design and implement an AI-based cybersecurity system that proactively detects and prevents cyber threats.

May 4, 2026 · Open Access

Cyber Attack Prediction From Traditional Machine Learning to Generative Artificial Intelligence

Key Points

The aim is to design and implement an AI-based cybersecurity system that proactively detects and prevents cyber threats.
Trained and evaluated models using the CICIDS2017 dataset.
Incorporated various classifiers: Decision Trees, Random Forest, and hybrid Voting Classifier.
Utilized generative AI techniques like Variational Autoencoder and GAN.
Voting Classifier achieved 99.6% accuracy; LSTM model reached 99.3% accuracy.
Both models effectively identified attack types like DoS and DDoS.
Framework made interpretable with SHAP and LIME.
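
To illustrate how a hybrid Voting Classifier can combine several base models, here is a minimal sketch of hard (majority) and soft (probability-averaging) voting. It is not the paper's implementation: the summary does not state which voting scheme was used, and the per-class probabilities below are hypothetical values standing in for the outputs of RF, LightGBM, and XGBoost on a single network flow.

```python
from collections import Counter


def hard_vote(labels):
    """Majority vote over the labels predicted by each base model."""
    return Counter(labels).most_common(1)[0][0]


def soft_vote(probabilities, weights=None):
    """Weighted average of per-class probabilities.

    Returns the winning label and the averaged probabilities.
    """
    if weights is None:
        weights = [1.0] * len(probabilities)
    total = sum(weights)
    classes = probabilities[0].keys()
    averaged = {
        c: sum(w * p[c] for w, p in zip(weights, probabilities)) / total
        for c in classes
    }
    return max(averaged, key=averaged.get), averaged


# Hypothetical outputs of three base models for one flow (assumed values).
model_outputs = [
    {"BENIGN": 0.6, "DoS": 0.3, "PortScan": 0.1},  # stand-in for RF
    {"BENIGN": 0.2, "DoS": 0.7, "PortScan": 0.1},  # stand-in for LightGBM
    {"BENIGN": 0.3, "DoS": 0.6, "PortScan": 0.1},  # stand-in for XGBoost
]

soft_label, averaged = soft_vote(model_outputs)
hard_label = hard_vote([max(p, key=p.get) for p in model_outputs])
print(soft_label, hard_label)
```

Soft voting uses each model's confidence, so one very confident model can outweigh two weakly confident ones. Hard voting only counts the predicted labels.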

Abstract

Cyber threats are becoming increasingly sophisticated and widespread, so intelligent and proactive security systems are needed that can continually identify and anticipate network attacks. This paper presents an AI-based cyber attack prediction framework, trained and evaluated on the CICIDS2017 dataset, that combines Machine Learning (ML), Deep Learning (DL), generative AI, and explainable AI. Extensive preprocessing is applied to make learning more effective: removal of missing and duplicated records, label encoding, standardization, and Principal Component Analysis (PCA) for dimensionality reduction. We examine several ML classifiers, namely Decision Tree, Random Forest (RF), Extra Trees Classifier, Logistic Regression (LR), Gaussian Naive Bayes, and a hybrid Voting Classifier that combines RF, LightGBM, and XGBoost. We also examine DL networks such as CNN, LSTM, CNN-LSTM, and CNN-LSTM-GRU. Generative models such as the Variational Autoencoder, Generative Adversarial Network, and DistilGPT2 are used to improve the generation of synthetic attack patterns. The Voting Classifier performs best in testing, with the highest accuracy of 99.6%. The LSTM is second, with 99.3% accuracy. These results indicate that both models can identify attacks of the following types: DoS, DDoS, PortScan, Bot, and Infiltration. LIME and SHAP are used to make the models interpretable. The framework is deployed with Flask and supports user login, data processing, monitoring, and classification of network traffic as benign or malicious.

Keywords: Cyber attack prediction, Machine Learning, Deep Learning, Generative AI, Explainable AI, LSTM, Ensemble Learning, Intrusion Detection

1. Introduction

The rapid development of digital technology and the growing interconnection of different sectors have made strong cybersecurity a necessity.
The severity and intensity of cyber attacks have increased manifold as more individuals, companies, and governments conduct their important activities online. Ransomware, phishing, Denial of Service attacks, and data leaks not only disrupt critical services but are also costly and expose personal data [1]. Traditional defensive measures tend to be reactive, which makes them poorly suited to fast-moving cyber attacks. This underlines the value of intelligent, proactive, and adaptive security solutions [2].

As a disruptive technology, Artificial Intelligence (AI) and its subfields, including ML, DL, Natural Language Processing (NLP), and Generative AI (GenAI), can be used to improve cybersecurity [3,4]. These technologies can be applied to predictive threat intelligence, real-time anomaly detection, and automated mitigation, which allows cybersecurity to be proactive rather than reactive. ML models can identify subtle changes in network data and track trends. NLP-based methods can detect and classify phishing and other spam messages [3]. CNNs, LSTM networks, hybrid CNN-LSTM networks, and Transformer architectures are advanced deep learning models that have performed well at discovering complex attack patterns [4]. Generative AI models can also produce new threat scenarios, which helps prepare security systems for new attack vectors and makes them more robust against adversaries who change their tactics [5].

Despite these advances, challenges remain in data quality and size, model interpretability, and interoperability with existing systems [6,7]. To make AI-based cybersecurity systems more transparent and reliable, explainable AI (XAI) tools such as SHAP and LIME are increasingly used. These techniques help analysts interpret automated predictions [8,9].
These AI techniques must be implemented in a live, real-time system to produce a comprehensive cybersecurity solution that is useful in complex, high-traffic network scenarios [10]. The purpose of this project is to design and implement a complete AI-based cybersecurity system, built on ML, DL, NLP, and GenAI algorithms, that detects and prevents threats before they occur. The system uses preprocessing, dimensionality reduction, and real-time inference to improve detection accuracy, support better decisions, and strengthen digital systems across different infrastructures. The intent is to make these systems more resilient to emerging cyber threats.

2. Related Work

Cybersecurity is an increasingly important research topic as more and more people depend on digital systems. AI has become a reliable instrument for building defense systems that can detect impending threats and react to them. Some studies emphasize that cybersecurity frameworks should be based on international standards such as ISO/IEC 27000, 27001, and 27002. These standards provide systematic rules for securing information and reducing risk [11]. Each of them stresses the need for systematic approaches to securing digital assets, which is in line with AI-based solutions for locating and preventing threats.

The use of AI in cybersecurity has been explored through different ML and DL methods for enhancing threat detection, anomaly detection, and predictive analytics [12]. Research indicates that conventional cybersecurity measures, such as signature-based and heuristic approaches, cannot keep pace with the increasing sophistication of cyberattacks. Supervised and unsupervised machine learning architectures have proved highly effective at monitoring network data, uncovering harmful patterns, and detecting zero-day attacks [13].
When trained on large datasets, these models become more accurate and produce fewer false positives, which makes more proactive defensive measures possible. More broadly, AI is significant in cybersecurity because of its capabilities in anomaly detection and prediction. ML algorithms such as Decision Trees, RF, and SVM can distinguish normal from abnormal network behavior, which makes it easier to detect threats at an early stage [14]. These methods are the foundation of predictive cybersecurity, in which an attack is forecast on the basis of what has already been observed. Hybrid designs that combine ML and DL models, such as CNNs and LSTM networks, can also detect complex attack signatures, particularly in large IoT and enterprise networks [16,17].

XAI has emerged as a critical part of AI-based cybersecurity because it addresses the problem of opaque black-box models that are hard to comprehend and interpret. SHAP and LIME are two methods that help analysts determine how a model reached its conclusion, show the significance of each feature, and build confidence in automated threat predictions [15]. However, XAI methods have been shown to be vulnerable to black-box attacks in adversarial settings, which may make model explanations inaccurate and reduce the effectiveness of cybersecurity systems [20]. These findings indicate that XAI makes models easier to understand and more credible, although additional security controls are necessary to ensure that it is not misused by malicious actors.

DL has recently been applied to network anomaly detection because it can capture patterns and temporal relationships in network traffic data [18].
Neural networks such as CNNs, LSTMs, and hybrid CNN-LSTMs have been shown to detect advanced persistent threats (APTs) and botnet operations with good accuracy compared to other common machine learning models. They are also more efficient and can handle large volumes of data. These models are well suited to real-time data processing pipelines, where new cyber threats can be identified and responded to immediately.

Researchers have also examined cloud security, since cloud usage opens new avenues of attack and raises privacy concerns [19]. AI-based monitoring and predictive analytics in cloud environments have been shown to identify policy breaches, unauthorized access, and other suspicious activity, which makes cloud services more resilient. By integrating AI models with continuously learning systems, defense systems can remain effective and responsive to new and changing methods of attack.

3. Materials And Methods

The proposed system relies on predictive intelligence based on ML (DT, RF, LR, Naive Bayes, Voting Classifier), Deep Learning (DNN, CNN, LSTM, CNN+LSTM, CNN+LSTM+GRU), and Generative AI (VAE, GAN, DistilGPT2). The system takes the CICIDS2017 dataset and applies data preparation, normalization, and PCA-based dimensionality reduction, which simplifies computation and reduces the number of features used.

Fig. 1: System Architecture

Explainability tools such as LIME and SHAP are integrated to make the models easier to understand and explain. They allow users to view the importance of each feature and the decisions the model makes [22,25]. Relatively subtle and evolving attack patterns are easier to detect with DL architectures, in particular CNN and LSTM hybrids.

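
The data preparation steps described for the CICIDS2017 pipeline (removing missing and duplicated records, encoding labels, standardizing features) can be sketched with the Python standard library alone. This is an illustrative sketch under stated assumptions, not the authors' code: the function names, the two-feature rows, and the treatment of non-finite values as missing are all hypothetical, and the PCA step is omitted (it would be applied to the standardized matrix).

```python
import math
from statistics import mean, pstdev


def clean(rows):
    """Drop rows with missing or non-finite features, then exact duplicates."""
    seen, kept = set(), []
    for features, label in rows:
        if any(v is None or not math.isfinite(v) for v in features):
            continue  # assumption: non-finite values are treated as missing
        key = (tuple(features), label)
        if key in seen:
            continue
        seen.add(key)
        kept.append((list(features), label))
    return kept


def encode_labels(labels):
    """Map each class name to an integer, in sorted class order."""
    mapping = {c: i for i, c in enumerate(sorted(set(labels)))}
    return [mapping[label] for label in labels], mapping


def standardize(matrix):
    """Scale each column to zero mean and unit (population) variance."""
    columns = list(zip(*matrix))
    means = [mean(c) for c in columns]
    stds = [pstdev(c) or 1.0 for c in columns]  # guard constant columns
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in matrix]


# Hypothetical toy rows: (features, label). Real flows have many more features.
rows = [
    ([10.0, 200.0], "BENIGN"),
    ([10.0, 200.0], "BENIGN"),           # duplicate, dropped
    ([None, 50.0], "DoS"),               # missing value, dropped
    ([30.0, 400.0], "DoS"),
    ([20.0, float("inf")], "PortScan"),  # non-finite value, dropped
    ([50.0, 600.0], "PortScan"),
]

cleaned = clean(rows)
features = [f for f, _ in cleaned]
encoded, mapping = encode_labels([label for _, label in cleaned])
scaled = standardize(features)
print(len(cleaned), encoded, mapping)
```

In practice the scaling statistics would be computed on the training split only and reused for the test split, so that no information leaks from test data into training.
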
Generative AI, in turn, creates new threat scenarios to help predict intrusions before they occur [21]. The system includes a web interface built with Flask that allows users to enter data, run predictions, and view the results immediately. This cybersecurity system is an end-to-end design that is highly scalable, versatile, and easy to use. It is capab
