Abstract
This paper describes a static method for tagging malicious code families based on multiple features. To address the limitation that existing techniques extract only a single kind of feature, the method uses malicious code visualization to render each sample as an image and extracts features from both image and text sources, at both the bytecode level and the opcode level, so that features are drawn from multiple sources and multiple levels. To make better use of these features, a 3-layer multi-classifier joint framework is designed for feature learning; it consists of a feature combination layer, a classification layer, and a union layer. The learned model can then tag malicious code automatically. To verify the method, a family tagging experiment was carried out on the 9 classes of malicious code provided by Microsoft. The results show that accuracy, precision, recall, and F1-score all exceed 90% for every sample family except Simda, which supports the effectiveness and reliability of the method.
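The abstract does not say how the malicious code image is drawn or how the union layer combines classifier outputs. As a minimal sketch only, the Python below assumes a common convention for the first step (one byte per grayscale pixel, fixed row width, zero-padded tail) and a simple majority vote for the last step. The function names, the default width of 16, and the voting rule are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter


def bytes_to_gray_image(data: bytes, width: int = 16) -> list[list[int]]:
    """Lay out raw bytes as rows of a grayscale image (one byte = one pixel, 0-255).

    Assumption: fixed row width with a zero-padded final row; the paper may
    use a different layout.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    # Pad with zero bytes so the last row is complete.
    padding = (-len(data)) % width
    padded = data + bytes(padding)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def majority_vote(predictions: list[str]) -> str:
    """Combine per-classifier family labels by majority vote.

    Assumption: ties are resolved in favour of the label that appears
    first in the input list.
    """
    if not predictions:
        raise ValueError("predictions must not be empty")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label
    raise AssertionError("unreachable: some label always has the top count")


if __name__ == "__main__":
    # Five bytes laid out four per row give two rows, the second zero-padded.
    print(bytes_to_gray_image(b"\x00\x7f\xff\x10\x20", width=4))
    # Hypothetical outputs of three classifiers for one sample.
    print(majority_vote(["Ramnit", "Simda", "Ramnit"]))
```

In a real pipeline the pixel rows would feed an image feature extractor, and the vote would run over the outputs of the classification layer; both stages are left out here because the abstract gives no detail on them.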
Authors
刘亮
刘露平
何帅
刘嘉勇
Liu Liang; Liu Luping; He Shuai; Liu Jiayong (College of Cybersecurity, Sichuan University, Chengdu 610065; College of Electronics and Information Engineering, Sichuan University, Chengdu 610065)
Source
《信息安全研究》
2018, No. 4, pp. 322-328 (7 pages)
Journal of Information Security Research
Funding
CCF-Venustech Hongyan Research Program fund project (CCF-VenustechR2017002)
Keywords
malicious code family
multi-feature
malicious code image
machine learning
multi-classifier joint framework