This study proposes a methodology for anomaly detection in host-based intrusion detection systems (HIDS) that combines supervised and semi-supervised anomaly detection with GAN (Generative Adversarial Network) based data augmentation. An anomaly-based intrusion detection system detects abnormal patterns as deviations from expected normal behavior; however, such a system has a low detection rate. Its detection accuracy may also vary depending on whether abnormal samples are used during training and on the degree of class imbalance, that is, how unevenly the samples are distributed across classes. To mitigate this problem and improve predictive accuracy, the minority classes may need to be augmented with newly generated samples. Several recent studies have therefore developed intrusion detection models based on machine learning and deep learning algorithms, both to overcome the limitations of existing anomaly-based intrusion detection methodologies and to avoid class imbalance problems. In a similar vein, this study proposes a method for improving the classification of normal and abnormal data in anomaly-based intrusion detection systems by applying GAN-based data augmentation. To verify the effectiveness of the proposed method, we use the ADFA-LD dataset, which consists of system call traces of attacks on recent operating systems. Experiments were performed using SVM (Support Vector Machine) and CNN (Convolutional Neural Network) for classification, and GAN and SMOTE for data augmentation. The experimental results indicate that the GAN-based approach is a slightly more reliable means of data augmentation than SMOTE. The results also confirm that classification performance can improve as the number of samples in each imbalanced class increases.
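To make the idea of augmenting a minority class concrete, the following is a minimal sketch of SMOTE-style oversampling, the baseline augmentation technique named above. It is not the implementation used in this study and it does not show the GAN-based approach. The function name `smote_like_oversample`, its parameters, and the toy two-dimensional feature vectors are all hypothetical; real system call traces would first have to be encoded as numeric feature vectors.

```python
import random
from typing import List, Sequence


def smote_like_oversample(
    minority: Sequence[Sequence[float]],
    n_new: int,
    k: int = 3,
    seed: int = 0,
) -> List[List[float]]:
    """Create n_new synthetic minority samples (hypothetical helper).

    Each synthetic sample lies on the line segment between a randomly
    chosen minority sample and one of its k nearest minority neighbours,
    which is the core idea of SMOTE.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(
            key=lambda j: sum((a - b) ** 2 for a, b in zip(base, minority[j]))
        )
        neighbour = minority[rng.choice(others[:k])]
        # Interpolate at a random position between base and neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy minority class: four points at the corners of the unit square.
minority_samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote_like_oversample(minority_samples, n_new=6)
```

Because every synthetic point is an interpolation between two existing minority samples, the new samples stay within the region already covered by that class. A GAN-based generator instead learns the minority distribution and samples from it.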
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT: Ministry of Science and ICT) (No. NRF-2019R1F1A1059036).