
Optimization of the log pattern extraction algorithm for large-scale syslog files (cited by 6)
Abstract: LARGE is a log analysis framework deployed in the supercomputing environment of the Chinese Academy of Sciences. It monitors and analyzes the various log files in the environment through log collection, centralized analysis, and result feedback. When monitoring system logs, maintenance personnel rely on a log pattern extraction algorithm to reduce the large volume of historical log records to a small set of log patterns. However, as log volume grows, and given the peculiarities of messages log files, the original log pattern extraction algorithm can no longer process large-scale logs quickly enough. This paper presents an optimization of the log pattern extraction algorithm that introduces the MapReduce mechanism to accelerate log processing and pattern extraction when there are multiple input log files. Experiments show that, when the number of input files is large, the optimization significantly increases the speed of the vocabulary consistency algorithm and greatly reduces its running time. The running time and extraction quality of the algorithm when the vocabulary conversion function is used are also evaluated.
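Source: Computer Engineering & Science (《计算机工程与科学》), CSCD, Peking University Core Journal, 2017, No. 5, pp. 821-828 (8 pages)
Funding: National Key Research and Development Program of China (2016YFB0201404); Major Project of the National High-Tech R&D Program (863 Program) under the 12th Five-Year Plan (2014AA01A302)
Keywords: log processing; MapReduce; big-data analysis; grid environment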
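
The MapReduce idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the vocabulary consistency algorithm or the vocabulary conversion function, so the code below substitutes a much simpler stand-in that masks variable-looking tokens and counts the resulting line templates. All names (`convert_word`, `line_to_pattern`, `map_lines`, `reduce_counts`, `extract_patterns`) are hypothetical, and each input "file" is assumed to be a list of log lines already read into memory.

```python
import re
from collections import Counter
from functools import reduce

# Hypothetical word conversion function (an assumption, not the paper's):
# tokens that look variable (hex values, IPv4 addresses, plain integers)
# are replaced by "*" so that similar log lines collapse into one pattern.
_VARIABLE = re.compile(r"^(0x[0-9a-fA-F]+|\d+(\.\d+){3}|\d+)$")


def convert_word(word):
    return "*" if _VARIABLE.match(word) else word


def line_to_pattern(line):
    # Stand-in for pattern extraction: mask variable tokens word by word.
    return " ".join(convert_word(w) for w in line.split())


def map_lines(lines):
    """Map step: count the patterns found in one input file."""
    return Counter(line_to_pattern(line) for line in lines if line.strip())


def reduce_counts(left, right):
    """Reduce step: merge two partial pattern counts."""
    return left + right


def extract_patterns(files):
    # Each map task depends only on its own file, so the map step could be
    # distributed across processes or nodes; here it runs sequentially.
    partials = map(map_lines, files)
    return reduce(reduce_counts, partials, Counter())


if __name__ == "__main__":
    files = [
        ["sshd 1234 accepted from 10.0.0.1", "sshd 99 accepted from 10.0.0.2"],
        ["sshd 7 accepted from 192.168.1.5", "kernel panic"],
    ]
    for pattern, count in extract_patterns(files).most_common():
        print(count, pattern)
```

Because the map step touches one file at a time and the reduce step only merges counters, adding input files adds independent map tasks, which is the property the abstract relies on when it reports larger gains for larger numbers of input files.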