Abstract
Data preprocessing of Web logs is a basic yet critical step in Web log mining, and it strongly influences the subsequent pattern recognition and pattern analysis. To process the data effectively, the five steps of this preprocessing procedure are analyzed one by one, and two commonly used algorithms are compared for the transaction identification step. Finally, based on these algorithms, Web log preprocessing is implemented in Java on the Windows platform. Experimental results show that the approach is effective.
Source
《现代电子技术》
2010, No. 18, pp. 97-100 (4 pages)
Modern Electronics Technique
Keywords
data preprocessing
Web mining
user identification
path completion