Abstract
System call information is an important data source for detecting host anomalies. Because anomalies occur relatively rarely, the collected system call data is often imbalanced. With few abnormal system call samples, a detection model cannot adequately learn the abnormal behavior patterns of a program, which leads to low accuracy and a high false positive rate in intrusion detection. To address this problem, a host intrusion detection method for system calls based on generative adversarial networks is proposed, which alleviates the data imbalance by augmenting the abnormal system call data. First, the system call trace of a program is divided into fixed-length N-Gram sequences. Second, SeqGAN is used to generate synthetic N-Gram sequences from the N-Gram sequences of the abnormal data. The generated abnormal data is then combined with the original dataset to train the intrusion detection model. Experiments are carried out on ADFA-LD, a host system call dataset, and Drebin, an Android system call dataset. The proposed method achieves detection accuracies of 0.986 and 0.989 and false positive rates of 0.011 and 0, respectively, outperforming existing intrusion detection methods based on a hybrid neural network model, WaveNet, Relaxed-SVM, and RNN-VED.
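The preprocessing and data-combination steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `to_ngrams` and `combine_for_training`, the sliding-window `stride` parameter, the 0/1 label encoding, and the sample call IDs are all assumptions made for illustration, and the SeqGAN generator itself is left out (its output is represented by a placeholder list).

```python
from typing import List, Sequence, Tuple

NGram = Tuple[int, ...]


def to_ngrams(trace: Sequence[int], n: int, stride: int = 1) -> List[NGram]:
    """Split one system call trace into fixed-length windows of n call IDs.

    stride=1 gives overlapping sliding windows; stride=n gives disjoint
    chunks. Which variant the paper uses is not stated in the abstract.
    """
    if n <= 0 or stride <= 0:
        raise ValueError("n and stride must be positive")
    # A trace shorter than n yields an empty list.
    return [tuple(trace[i:i + n]) for i in range(0, len(trace) - n + 1, stride)]


def combine_for_training(
    normal: List[NGram], abnormal: List[NGram], synthetic: List[NGram]
) -> List[Tuple[NGram, int]]:
    """Merge real and generated N-Grams into one labeled training set.

    Labels are an assumption: 0 = normal, 1 = abnormal (real or synthetic).
    """
    labeled = [(g, 0) for g in normal]
    labeled += [(g, 1) for g in abnormal]
    labeled += [(g, 1) for g in synthetic]
    return labeled


# Made-up system call IDs, for illustration only.
trace = [6, 11, 45, 33, 192, 33, 5]
grams = to_ngrams(trace, n=5)

# Placeholder for sequences that a trained generator would produce.
synthetic_abnormal: List[NGram] = [(11, 45, 33, 5, 6)]
dataset = combine_for_training(normal=grams, abnormal=[], synthetic=synthetic_abnormal)
```

In this sketch the synthetic sequences only add to the abnormal class, so the ratio of abnormal to normal samples can be raised without altering the original data.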
Authors
樊燚
胡涛
伊鹏
FAN Yi; HU Tao; YI Peng (Information Technology Institute, Information Engineering University, Zhengzhou 450002, China)
Source
《计算机科学》 (Computer Science)
CSCD
PKU Core Journal (北大核心)
2024, No. 10, pp. 408-415 (8 pages)
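Funding
Major Science and Technology Project of Henan Province (221100240100)
Major Science and Technology Innovation Project of Zhengzhou (2021KJZX0060-3).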
Keywords
Host intrusion detection
System call
Generative adversarial network
Deep learning
Data imbalance