Data scarcity and class imbalance pose persistent challenges in cybersecurity AI, particularly for intrusion detection systems, where real-world malicious network traffic is rare and sensitive. To address this, the present study explores the generation of synthetic network traffic using deep generative models, focusing on both Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs). Building upon recent advances in data synthesis, we introduce a systematic framework for Data Quality Assessment (DQA) to evaluate the realism and utility of generated malicious traffic. Our approach compares the outputs of GANs and VAEs not only in terms of statistical similarity to real attack patterns, but also by measuring their effect on the performance of supervised/unsupervised Intrusion Detection Models. By embedding synthetic samples into the training process, we quantify improvements in classification accuracy, recall, and robustness under various threat scenarios. The outcomes of this work aim to enhance trust in synthetic data generation techniques, offering reliable augmentation strategies for cybersecurity applications under data-limited conditions.
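
The abstract names two evaluation axes, statistical similarity to real attack traffic and the effect on detection performance, without stating which metrics are used. As a minimal sketch only, assuming a per-feature two-sample Kolmogorov-Smirnov statistic for similarity and plain recall for detection utility (both metric choices, the function names, and the toy values are illustrative assumptions, not the authors' method), the comparison could look like this:

```python
from bisect import bisect_right


def ks_statistic(real, synthetic):
    """Two-sample Kolmogorov-Smirnov statistic for one feature.

    Returns the largest gap between the two empirical CDFs:
    0.0 means identical distributions, 1.0 means no overlap.
    (Assumed similarity metric; the abstract does not name one.)
    """
    real_sorted = sorted(real)
    synth_sorted = sorted(synthetic)
    n, m = len(real_sorted), len(synth_sorted)
    if n == 0 or m == 0:
        raise ValueError("both samples must be non-empty")
    gap = 0.0
    # The CDF difference can only change at observed sample values.
    for x in real_sorted + synth_sorted:
        cdf_real = bisect_right(real_sorted, x) / n
        cdf_synth = bisect_right(synth_sorted, x) / m
        gap = max(gap, abs(cdf_real - cdf_synth))
    return gap


def recall(y_true, y_pred, positive=1):
    """Fraction of true attack samples that the detector flagged."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


if __name__ == "__main__":
    # Hypothetical toy values for a single feature (e.g. packet size).
    real_attack = [60, 64, 64, 70, 1500]
    gan_samples = [61, 63, 66, 72, 1480]
    vae_samples = [400, 420, 450, 500, 520]
    print("GAN vs real KS:", ks_statistic(real_attack, gan_samples))
    print("VAE vs real KS:", ks_statistic(real_attack, vae_samples))

    # Hypothetical detector outputs before and after augmentation.
    labels = [1, 1, 0, 1, 0]
    print("baseline recall: ", recall(labels, [1, 0, 0, 0, 0]))
    print("augmented recall:", recall(labels, [1, 1, 0, 0, 0]))
```

A lower KS value indicates that a generator's samples track the real feature distribution more closely, while the recall comparison reflects whether adding those samples to training helps the detector catch more attacks on held-out real traffic.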
Νικόλαος Πεππές
National Technical University of Athens
Theodoros Alexakis
National Technical University of Athens
Emmanouil Daskalakis
National Technical University of Athens
Πεππές et al. studied this question.
synapsesocial.com/papers/689a0945e6551bb0af8cefe0 — DOI: https://doi.org/10.20944/preprints202507.2103.v1