Research on Hadoop-based Massive Security Log Clustering Algorithm (基于Hadoop的海量安全日志聚类算法研究) Cited by: 6
Abstract In the big-data environment, network security incidents occur constantly, and network security has become a focus of attention across many fields. Security logs, a form of dark data in this environment, record important information about the operating state of devices. Analyzing them gives a real-time view of the network security situation and serves as a security audit tool for protection before an incident and accountability after it, making it possible to trace abnormal events back to their source. Given the importance of log auditing and the role of data mining in log analysis, and given that processing massive data on a single machine is comparatively slow, this paper proposes a Hadoop-based clustering algorithm for massive security logs. First, the K-means clustering algorithm is improved using the maximum-minimum distance (MMD) method and a mean-value idea, which overcomes the randomness with which traditional K-means selects its initial cluster centers. Second, to process massive data effectively and to improve clustering efficiency and speed, the improved K-means algorithm is deployed on Map/Reduce for iterative computation. Experiments show that the improved algorithm is more accurate than other typical algorithms, that its clustering results are stable, and that it achieves good running speed and speedup ratio on the cluster.
Authors LU Xie (陆勰); LUO Shoushan (罗守山); ZHANG Yumei (张玉梅) (Information Security Center, Beijing University of Posts and Telecommunications, Beijing 100816, China; Chinese People's Armed Police Force Corps of Tianjin, Tianjin 300001, China)
Source Netinfo Security (《信息网络安全》), CSCD, Peking University Core Journal, 2018, No. 8, pp. 56-63 (8 pages)
Funding National High Technology Research and Development Program of China (863 Program) [2015AA016005]
Keywords security log; clustering; K-means; Map/Reduce; Hadoop
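
The abstract names two ideas: seeding K-means with maximum-minimum distance (MMD) instead of random centers, and running the iterations as Map/Reduce jobs. The sketch below illustrates both in plain Python on a single machine. It is not the authors' implementation: the paper's exact use of the mean is not given in this record, so starting from the point nearest the data mean is an assumption made here to keep the seeding deterministic, and the names `max_min_centers`, `map_assign`, `reduce_mean`, and `kmeans` are hypothetical. A dictionary stands in for Hadoop's shuffle phase.

```python
import math
from collections import defaultdict


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def max_min_centers(points, k):
    """Max-min distance seeding (generic form, not the paper's exact variant).

    Assumption: the first seed is the point nearest the data mean. Each
    further seed is the point whose distance to its nearest seed is largest.
    """
    dim = len(points[0])
    mean = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    centers = [min(points, key=lambda p: dist(p, mean))]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(dist(p, c) for c in centers))
        )
    return centers


def map_assign(point, centers):
    """Map step: emit (index of nearest center, (point, 1))."""
    idx = min(range(len(centers)), key=lambda i: dist(point, centers[i]))
    return idx, (point, 1)


def reduce_mean(values):
    """Reduce step: average all points that share one center index."""
    total = [0.0] * len(values[0][0])
    count = 0
    for point, n in values:
        total = [t + x for t, x in zip(total, point)]
        count += n
    return [t / count for t in total]


def kmeans(points, k, max_iter=20, tol=1e-9):
    """Iterate map -> group by key -> reduce until the centers stop moving."""
    centers = max_min_centers(points, k)
    for _ in range(max_iter):
        groups = defaultdict(list)  # stands in for the shuffle phase
        for p in points:
            key, value = map_assign(p, centers)
            groups[key].append(value)
        # An empty cluster keeps its previous center.
        new_centers = [
            reduce_mean(groups[i]) if i in groups else centers[i]
            for i in range(k)
        ]
        shift = max(dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers
```

Calling `kmeans(points, k)` on a list of numeric tuples returns `k` centers as lists of floats. On a real cluster each iteration would be one Map/Reduce job, with the current centers distributed to every mapper before the job starts.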