Abstract
To address web page tampering, a machine-learning-based method was designed for detecting tampering across a large number of web pages. All registered websites of a comprehensive university were studied: all information on each site's home page was crawled, the crawled data were classified, corresponding detection rules were built, and an overall judgment was made as to whether each page had been tampered with. The method consists of a learning phase and a detection phase. In the learning phase, the standard value of each detector is obtained from the page's historical information. In the detection phase, each parameter of the page under test is checked, the outputs of the detectors are combined, and the result is reported. If a page is flagged as tampered, the website administrator is notified to confirm immediately, and if the result proves to be a false positive, the system retrains to correct its parameters. Tampering was simulated on the basis of real tampering cases, and the false positive and false negative rates were analyzed. The results show that the best performance is obtained with a detection data set window size of 11 and an alarm threshold of 2, giving a false positive rate of 1.183% and a false negative rate of 0.878%.
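The two-phase scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which page features are monitored, how a detector's standard value is computed, or how the alarm threshold is applied. The sketch assumes that each detector tracks one numeric feature (the feature names are hypothetical), that the standard value is the mean over the history window, that a fixed tolerance decides when a value deviates, and that the alarm threshold counts how many detectors must fire. Only the window size of 11 and the threshold of 2 come from the abstract.

```python
from collections import deque
from statistics import mean, pstdev

WINDOW_SIZE = 11     # history window size reported in the abstract
ALARM_THRESHOLD = 2  # alarm threshold reported in the abstract


class Detector:
    """Tracks one numeric page feature (hypothetical, e.g. a link count)."""

    def __init__(self, name, tolerance=3.0):
        self.name = name
        self.tolerance = tolerance  # assumed deviation multiplier
        self.history = deque(maxlen=WINDOW_SIZE)  # keeps only the newest values

    def learn(self, value):
        """Learning phase: add one historical observation to the window."""
        self.history.append(value)

    def is_anomalous(self, value):
        """Detection phase: compare a value with the learned standard value."""
        if len(self.history) < 2:
            return False  # too little history to judge
        standard = mean(self.history)
        spread = max(pstdev(self.history), 1.0)  # floor avoids a zero-width band
        return abs(value - standard) > self.tolerance * spread


def check_page(detectors, features):
    """Combine detector outputs; alarm when enough detectors fire (assumption)."""
    fired = [d.name for d in detectors if d.is_anomalous(features[d.name])]
    return len(fired) >= ALARM_THRESHOLD, fired


def retrain(detectors, features):
    """After a confirmed false positive, absorb the page into the history."""
    for d in detectors:
        d.learn(features[d.name])
```

In this sketch a page triggers an alarm only when at least two detectors disagree with their learned values, and calling retrain after a confirmed false positive shifts the window toward the new legitimate page content.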
Authors
赖清楠
陈诗洋
马皓
张蓓
Lai Qingnan, Chen Shiyang, Ma Hao, Zhang Bei (Computer Center School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China)
Source
Journal of Huazhong University of Science and Technology (Natural Science Edition) (《华中科技大学学报(自然科学版)》)
Indexed in: EI, CAS, CSCD, Peking University Core Journals
2016, No. 11, pp. 16-20 (5 pages)
Funding
National High Technology Research and Development Program of China (2015AA01A202)
Keywords
machine learning
web page tamper resistance
tamper detection
network security
system design