Application Research of Deep Auto Encoder in Data Anomaly Detection
Cited by: 5
Abstract An Auto Encoder (AE) network usually requires normal data for training, which limits its use in data anomaly detection. This paper proposes an unsupervised data anomaly detection method based on a Deep Auto Encoder (DAE) network model. Drawing on Principal Component Analysis (PCA), the model takes the difference between each AE reconstruction output and the input data to isolate the anomalous part. That is, the input data is split into a normal part and an anomalous part: the normal part is reconstructed by the AE, and the anomalous part is optimized by a proximal method before output. The whole model is trained with the Alternating Direction Method of Multipliers (ADMM), and results are output once the predetermined number of training iterations is reached. The DAE model is compared with eight machine learning models and the AE model on seven real datasets. The results show that the DAE model can be trained effectively without being given normal data and can prevent overfitting. Its overall performance is better than that of the traditional machine learning models and the AE model, and its AUC is the best on four of the datasets. On the MNIST dataset, the AUC of the DAE model is 10.93% higher than that of the Isolation Forest (IF) method.
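The alternating scheme described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It makes several assumptions that are not in the source: the anomalous part is penalized with an l1 norm, so the proximal step is element-wise soft thresholding; the AE is a tiny one-hidden-layer tanh network trained by plain gradient descent; and the outer loop is a simple alternating update rather than a full ADMM formulation. The names `TinyAE` and `robust_ae_scores` and the hyperparameters `lam`, `n_hidden`, `steps`, `lr` and `outer_iters` are hypothetical.

```python
import numpy as np


def soft_threshold(R, lam):
    """Proximal operator of lam * ||S||_1: shrink each entry toward zero."""
    return np.sign(R) * np.maximum(np.abs(R) - lam, 0.0)


class TinyAE:
    """Hypothetical one-hidden-layer autoencoder (tanh encoder, linear decoder)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.1, (n_hidden, n_in))
        self.b2 = np.zeros(n_in)

    def forward(self, X):
        H = np.tanh(X @ self.W1 + self.b1)
        return H, H @ self.W2 + self.b2

    def fit(self, X, steps=200, lr=0.05):
        """Minimize the mean squared reconstruction error by gradient descent."""
        n = X.shape[0]
        for _ in range(steps):
            H, Y = self.forward(X)
            dY = 2.0 * (Y - X) / n                      # gradient at the output
            dH = (dY @ self.W2.T) * (1.0 - H ** 2)      # backprop through tanh
            self.W2 -= lr * (H.T @ dY)
            self.b2 -= lr * dY.sum(axis=0)
            self.W1 -= lr * (X.T @ dH)
            self.b1 -= lr * dH.sum(axis=0)

    def reconstruct(self, X):
        return self.forward(X)[1]


def robust_ae_scores(X, n_hidden=2, lam=0.5, outer_iters=10):
    """Split X into a normal part L and an anomalous part S (X = L + S).

    Each outer iteration trains the AE on L = X - S, then updates S by
    soft-thresholding the difference between X and the AE reconstruction.
    Returns one anomaly score per row: the Euclidean norm of that row of S.
    """
    S = np.zeros_like(X)
    ae = TinyAE(X.shape[1], n_hidden)
    for _ in range(outer_iters):
        L = X - S                                        # current normal part
        ae.fit(L)                                        # AE models the normal part
        S = soft_threshold(X - ae.reconstruct(L), lam)   # proximal update of S
    return np.linalg.norm(S, axis=1)
```

Under these assumptions, rows with larger scores are the ones the sketch treats as more anomalous. The value of `lam` controls how much residual is absorbed into `S` and would need tuning for each dataset.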
Authors ZHANG Changhua (张常华); ZHOU Xiongtu (周雄图); ZHANG Yong'ai (张永爱); YAO Jianmin (姚剑敏); GUO Tailiang (郭太良); YAN Qun (严群) (College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China; RichSense Electronic Technology Co., Ltd., Jinjiang, Fujian 362200, China)
Source Computer Engineering and Applications (《计算机工程与应用》), CSCD, PKU Core, 2020, No. 17, pp. 93-99 (7 pages)
Funding National Natural Science Foundation of China (No. 61775038); Natural Science Foundation of Fujian Province (No. 2017J01758, No. 2019J01221); Program for New Century Excellent Talents in Fujian Province Universities.
Keywords data anomaly detection; auto encoder network; Deep Auto Encoder network (DAE); Area Under the Curve (AUC)