A Machine Learning-Based Method for Detecting Malicious Web Page Code (cited by: 5)

Malicious Web Pages Detection Based on Machine Learning

Abstract: The large number of malicious web pages on the Internet has become a major security threat to network users. This paper proposes a method for analyzing malicious JavaScript code in web pages based on a machine learning classifier. A classification model is built by training on a set of training samples and is then used to detect malicious code in test samples. Experiments show that the method effectively detects most malicious JavaScript code in web pages, with a detection accuracy of 88.5%.
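The train-then-test workflow described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the paper's features or classifier, so the token list, the `extract_features` helper, and the nearest-centroid classifier used here are all hypothetical stand-ins, and the toy samples are invented for the example.

```python
import math
from collections import Counter

# Hypothetical indicator tokens; the abstract does not list the paper's features.
SUSPICIOUS_TOKENS = ("eval", "unescape", "document.write", "fromCharCode", "iframe")


def extract_features(js_source):
    """Map a JavaScript snippet to a small numeric feature vector."""
    counts = [float(js_source.count(tok)) for tok in SUSPICIOUS_TOKENS]
    # Longest whitespace-free run, a rough proxy for an obfuscated payload.
    longest = max((len(word) for word in js_source.split()), default=0)
    return counts + [math.log1p(longest)]


def train_centroids(samples):
    """Build the model: one mean feature vector per label.

    samples is a list of (js_source, label) pairs.
    """
    sums, totals = {}, Counter()
    for source, label in samples:
        vec = extract_features(source)
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        totals[label] += 1
    return {label: [v / totals[label] for v in acc] for label, acc in sums.items()}


def predict(model, js_source):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    vec = extract_features(js_source)

    def distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(vec, centroid))

    return min(model, key=lambda label: distance(model[label]))


def accuracy(model, samples):
    """Fraction of labelled samples that the model classifies correctly."""
    correct = sum(predict(model, source) == label for source, label in samples)
    return correct / len(samples)


# Invented toy data, only to exercise the pipeline.
TRAIN = [
    ("function add(a, b) { return a + b; }", "benign"),
    ("var el = document.getElementById('menu'); el.className = 'open';", "benign"),
    ("eval(unescape('%75%6e%65%73%63%61%70%65%28%27%25%75%39%30%39%30%27%29'));", "malicious"),
    ("document.write(unescape('%3Ciframe%20src%3Dhttp%3A%2F%2Fexample.invalid%3E'));", "malicious"),
]
TEST = [
    ("function greet(name) { return 'Hello ' + name; }", "benign"),
    ("eval(String.fromCharCode(100,111,99,117,109,101,110,116));", "malicious"),
]

if __name__ == "__main__":
    model = train_centroids(TRAIN)
    print("test accuracy:", accuracy(model, TEST))
```

A real evaluation would need a far larger labelled corpus and a held-out test set; the accuracy on a handful of toy snippets says nothing about the 88.5% figure reported in the abstract.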

Authors: Li Yang (李洋), Liu Biao (刘飚), Feng Huamin (封化民)

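Affiliations: School of Telecommunications Engineering, Xidian University; Beijing Electronic Science and Technology Institute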

Source: Journal of Beijing Electronic Science and Technology Institute (《北京电子科技学院学报》), 2012, No. 4, pp. 36-40, 12 (6 pages in total)

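Funding: Supported by the National Natural Science Foundation of China project "Research on Key Theories and Technologies of Multimedia Semantic Analysis Based on Multimodal Features" (No. 60972139) and the Beijing Natural Science Foundation project "Research on Online Public Opinion Analysis Based on the Semantics of Network Multimedia Information" (No. 4092041).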

Keywords: malicious web page code; JavaScript; feature extraction

Classification: TP393.081 [Automation and Computer Technology: Computer Application Technology]
