Abstract
Reviewing network logs is an important way to investigate system failures and monitor a system's operating status. Administrators can examine the events that occurred within a given period, and can also analyze the individual log files to extract knowledge. Because logs are large in volume and hard to read, the useful information they contain is difficult to discover if administrators rely on manual inspection of log records alone. Distributed computing technology is well suited to this problem. This paper describes the syslog log collection process, introduces the Hadoop distributed computing framework in detail, and presents the design and implementation of a Hadoop-based network log analysis system. Experiments show that the system is effective and practical.
Authors
胡光民
周亮
柯立新
HU Guang-min, ZHOU Liang, KE Li-xin (1. Modern Information and Education Technology Center, Shanghai Ocean University, Shanghai 201306, China; 2. College of Information Science, Shanghai Ocean University, Shanghai 201306, China)
Source
《电脑知识与技术》
2010, No. 8, pp. 6163-6164, 6185 (3 pages in total)
Computer Knowledge and Technology