
Detection of Embedded Malware Based on C4.5 Decision Tree (cited by: 7)
Abstract: Embedded malware has become a new threat to computer security because it is highly concealed and hard to detect. Existing statistical analysis methods perform poorly because they do not fully account for two characteristics of embedded malware: the malicious code occupies only a small number of bytes, yet it carries high information gain. To address this, the paper proposes a detection method based on the C4.5 decision tree: the 500 3-grams with the highest information gain are extracted from the training samples and used as attribute features to build a decision tree, which is then used to detect unknown embedded malware. Experimental results show that the proposed method has a clear advantage over existing methods in both detection rate and classification accuracy, achieving a detection rate of 99.80% on Word documents infected with embedded malware.
Source: Journal of South China University of Technology (Natural Science Edition), 2011, No. 5, pp. 68-72 (5 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding: National Technology Innovation Fund (08C26214411198); Guangdong-Hong Kong Key Field Breakthrough Project (2008A011400010)
Keywords: embedded malware; malware detection; C4.5 decision tree; Boosting algorithm
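
The feature-selection step described in the abstract (ranking byte 3-grams by information gain and keeping the best ones as attributes) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each attribute is binary (whether a 3-gram occurs in a sample), which the abstract does not specify, and the function names `ngrams`, `entropy`, `top_ngrams`, and `to_features` as well as the toy samples are hypothetical. Building the C4.5 tree itself and any Boosting step are left out.

```python
import math
from collections import Counter


def ngrams(data: bytes, n: int = 3) -> set:
    """Return the set of distinct byte n-grams that occur in data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def entropy(pos: int, neg: int) -> float:
    """Entropy in bits of a group with pos malicious and neg benign samples."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:  # skip empty classes (and empty groups) to avoid log(0)
            p = count / total
            h -= p * math.log2(p)
    return h


def top_ngrams(samples, labels, k=500, n=3):
    """Rank n-grams by information gain; return the k best as (gram, gain).

    Assumption: the attribute is binary, i.e. "gram occurs in the sample".
    labels use 1 for malicious and 0 for benign.
    """
    total_pos = sum(labels)
    total_neg = len(labels) - total_pos
    base = entropy(total_pos, total_neg)

    # For every n-gram, count the malicious / benign samples containing it.
    pos_with, neg_with = Counter(), Counter()
    for sample, label in zip(samples, labels):
        (pos_with if label else neg_with).update(ngrams(sample, n))

    gains = {}
    for gram in set(pos_with) | set(neg_with):
        p1, n1 = pos_with[gram], neg_with[gram]      # samples with the gram
        p0, n0 = total_pos - p1, total_neg - n1      # samples without it
        conditional = ((p1 + n1) * entropy(p1, n1)
                       + (p0 + n0) * entropy(p0, n0)) / len(labels)
        gains[gram] = base - conditional

    # Highest gain first; ties are broken by byte order for determinism.
    return sorted(gains.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


def to_features(data: bytes, selected, n: int = 3):
    """Encode a sample as a 0/1 vector over the selected n-grams."""
    present = ngrams(data, n)
    return [1 if gram in present else 0 for gram in selected]


# Toy data (hypothetical): two "infected" and two clean byte strings.
samples = [b"AAAAEVILAAAA", b"BBBBEVILBBBB", b"AAAAAAAA", b"BBBBBBBB"]
labels = [1, 1, 0, 0]
selected = [gram for gram, _ in top_ngrams(samples, labels, k=2)]
vector = to_features(b"CCCCEVILCCCC", selected)
```

The resulting 0/1 vectors could then be passed to a decision-tree learner. Note that C4.5 chooses its splits by gain ratio rather than raw information gain, so the ranking above covers only the attribute pre-selection step named in the abstract.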


