Tamper detection of numerous web pages based on machine learning
Abstract To address web page tampering, a machine-learning-based method for detecting tampering across a large number of web pages was designed. All registered websites of a comprehensive university were studied: all information on each site's home page was crawled, the crawled data were classified, and corresponding detection rules were built so that an overall judgment could be made on whether a page had been tampered with. The method consists of a learning phase and a detection phase. In the learning phase, the standard value of each detector is obtained from the page's historical information. In the detection phase, each parameter of the page under test is checked, the outputs of the detectors are combined, and the result is reported. When a page is flagged as tampered with, the website administrator is notified to confirm immediately, and if the result turns out to be a false positive, the system retrains to correct its parameters. Tampering was simulated on the basis of real tampering cases, and the false positive and false negative rates were analyzed. The results show that a detection data set window size of 11 with an alarm threshold of 2 gives the best performance, with a false positive rate of 1.183% and a false negative rate of 0.878%.
Authors Lai Qingnan, Chen Shiyang, Ma Hao, Zhang Bei (Computer Center School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China)
Source Journal of Huazhong University of Science and Technology (Natural Science Edition), 2016, No. 11, pp. 16-20 (5 pages). Indexed in EI, CAS, CSCD, and the Peking University Core Journals list.
Funding National High Technology Research and Development Program of China (2015AA01A202)
Keywords machine learning; web page tamper resistance; tamper detection; network security; system design
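
The two-phase scheme described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which page parameters are measured, how a detector's standard value is computed, or what exactly the alarm threshold counts, so everything below is an assumption made for illustration: the feature names (`links`, `scripts`, `length`), the use of the median of the last 11 snapshots as the standard value, the relative `tolerance`, and the reading of the alarm threshold as the number of detectors that must fire. It is not the authors' implementation.

```python
from collections import deque
from statistics import median

WINDOW_SIZE = 11      # window size reported in the abstract
ALARM_THRESHOLD = 2   # alarm threshold reported in the abstract


class Detector:
    """Tracks one numeric page parameter over a sliding window of history."""

    def __init__(self, name, tolerance=0.5, window_size=WINDOW_SIZE):
        self.name = name
        self.tolerance = tolerance  # hypothetical relative tolerance
        self.history = deque(maxlen=window_size)

    def learn(self, value):
        """Learning phase: add one historical observation to the window."""
        self.history.append(value)

    def standard(self):
        """Standard value, assumed here to be the median of the window."""
        return median(self.history)

    def is_anomalous(self, value):
        """Detection phase: flag a value that is far from the standard value."""
        std = self.standard()
        return abs(value - std) > self.tolerance * max(abs(std), 1)


class PageMonitor:
    """Combines several detectors for one web page."""

    def __init__(self, feature_names, alarm_threshold=ALARM_THRESHOLD):
        self.detectors = {name: Detector(name) for name in feature_names}
        self.alarm_threshold = alarm_threshold

    def learn(self, snapshot):
        """Feed one historical snapshot (dict of feature -> value) to all detectors."""
        for name, detector in self.detectors.items():
            detector.learn(snapshot[name])

    def check(self, snapshot):
        """Return (alarm, fired): alarm is True when enough detectors fire."""
        fired = [name for name, detector in self.detectors.items()
                 if detector.is_anomalous(snapshot[name])]
        return len(fired) >= self.alarm_threshold, fired

    def report_false_positive(self, snapshot):
        """Retrain on a snapshot the administrator confirmed as legitimate."""
        self.learn(snapshot)
```

In this sketch a page is first trained by calling `learn` on its historical snapshots, then each new crawl is passed to `check`. Requiring at least two detectors to fire before raising an alarm is one plausible way to trade false positives against false negatives. A snapshot that the administrator confirms as legitimate is fed back through `report_false_positive`, so the window gradually adapts to genuine changes such as a site redesign.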