Abstract
Training Machine Learning and Deep Learning models effectively requires datasets that provide sufficient patterns and contextual information, a requirement that is particularly critical in IoT networks. Imbalanced datasets, however, significantly degrade the performance of autonomous Intrusion Detection Systems (IDS), leading to suboptimal detection rates for minority classes. In this paper, we address this issue by using several Generative Adversarial Network (GAN) models, including WGANGP, CGAN, CTGAN, and CWGANGP, to generate synthetic data that balances these datasets. We compare IDS models trained on GAN-augmented datasets with those trained on the original imbalanced datasets, considering fitting duration, generation duration, accuracy, precision, recall, and F1 score. Our findings reveal substantial improvements in IDS performance when GANs are applied, across binary, general, and specific attack classification tasks. Additionally, we compare the effectiveness of GANs with classical sampling algorithms such as SMOTE and Random Oversampling. This evaluation underscores the potential of GANs for improving IDS accuracy and reliability on complex and highly imbalanced datasets.