A Malicious Code Detection Method Based on Data Mining and Machine Learning

Cited by: 12

Abstract: In recent years, malicious code has used junk instructions (花指令, literally "flower instructions"), packing, and similar techniques to evade antivirus detection, and existing methods cannot accurately identify malicious code variants. Given the threat that malicious code poses to computer security, its rapid spread, and its wide variety, this paper applies data mining and machine learning to identify and detect it. First, it proposes a malicious code detection framework based on data mining and machine learning and extracts code features at three levels: the text structure layer, the byte layer, and the code layer. Next, it applies principal component analysis (PCA) to reduce the dimensionality of the combined features from the three layers. Finally, it uses several classification methods to identify and classify malicious code. The results show that every classification method built on the combined features achieves a recognition accuracy above 90%, with the decision tree performing best. The approach can therefore detect malicious code variants effectively and offers an effective method for malware removal.
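The pipeline in the abstract (per-layer features, combination, PCA, classification) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which byte-layer features are used, so a byte-value histogram and Shannon entropy are assumed here; the function names `byte_histogram`, `byte_entropy`, `combine`, and `pca_project` are hypothetical; and PCA is computed with plain power iteration only to keep the example dependency-free. The classification step is omitted.

```python
import math
import random
from collections import Counter


def byte_histogram(data: bytes) -> list[float]:
    """Assumed byte-layer feature: relative frequency of each byte value 0..255."""
    counts = Counter(data)
    total = len(data) or 1
    return [counts.get(b, 0) / total for b in range(256)]


def byte_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (assumed feature; packed code tends to score high)."""
    total = len(data)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in Counter(data).values())


def combine(*feature_blocks: list[float]) -> list[float]:
    """Concatenate per-layer feature vectors into one combined feature vector."""
    return [x for block in feature_blocks for x in block]


def pca_project(rows, k, iters=200, seed=0):
    """Project rows onto the top-k principal components (power iteration with deflation)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    rng = random.Random(seed)
    components = []
    for _ in range(k):
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(iters):
            # w = X^T (X v): the covariance matrix times v, up to a constant factor.
            xv = [sum(a * b for a, b in zip(row, v)) for row in centered]
            w = [sum(centered[i][j] * xv[i] for i in range(n)) for j in range(d)]
            # Deflation: remove the directions of components already found.
            for c in components:
                dot = sum(a * b for a, b in zip(w, c))
                w = [a - dot * b for a, b in zip(w, c)]
            norm = math.sqrt(sum(a * a for a in w))
            if norm == 0:
                v = [0.0] * d  # no variance left in any remaining direction
                break
            v = [a / norm for a in w]
        components.append(v)
    return [[sum(a * b for a, b in zip(row, c)) for c in components] for row in centered]


if __name__ == "__main__":
    # Toy byte strings standing in for program samples (not real malware data).
    samples = [b"\x90" * 64 + b"\xcc" * 8, b"\x90" * 50 + b"\xcc" * 22,
               bytes(range(256)), bytes(range(0, 256, 2)) * 2]
    combined = [combine(byte_histogram(s), [byte_entropy(s)]) for s in samples]
    reduced = pca_project(combined, k=2)
    print(len(combined[0]), "features reduced to", len(reduced[0]))
```

In a full pipeline the text-structure-layer and code-layer feature vectors would be passed to `combine` alongside the byte-layer ones, and the reduced vectors would then be fed to a classifier such as a decision tree.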

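Authors: 廖国辉 (Liao Guohui), 刘嘉勇 (Liu Jiayong)

Affiliation: College of Electronics and Information Engineering, Sichuan University

Source: Journal of Information Security Research (《信息安全研究》), 2016, No. 1, pp. 74-79 (6 pages)

Keywords: malicious code; multidimensional feature; data mining; machine learning; code detection

Classification: TP309 [Automation and Computer Technology: Computer System Architecture]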

