GANs in the Panorama of Synthetic Data Generation Methods

Abstract

This paper focuses on the creation and evaluation of synthetic data to address the challenges of imbalanced datasets in machine learning (ML) applications, using fake news detection as a case study. We conducted a thorough literature review on generative adversarial networks (GANs) for tabular data, synthetic data generation methods, and synthetic data quality assessment. By augmenting a public news dataset with synthetic data generated by different GAN architectures, we demonstrate the potential of synthetic data to improve ML models’ performance in fake news detection. Our results show a significant improvement in classification performance, especially in the underrepresented class. We also modify and extend a data usage approach to evaluate the quality of synthetic data and investigate the relationship between synthetic data quality and data augmentation performance in classification tasks. We found a positive correlation between synthetic data quality and performance in the underrepresented class, highlighting the importance of high-quality synthetic data for effective data augmentation.

Cite This Study

Vaz et al. studied this question.

synapsesocial.com/papers/68e6f97db6db6435876744eb https://doi.org/10.1145/3657294
