Abstract
The data analysis system is an important component of a Web log mining (Web usage mining) system and is the step that precedes pattern analysis. It comprises two processes: data preprocessing and pattern mining. Data preprocessing covers data cleaning, user identification, session identification, and path completion; pattern mining covers transaction identification, association rule analysis, sequential pattern analysis, classification, and clustering. Building on traditional analysis methods, the system adopts an improved path completion algorithm that looks for the nearest common ancestor page. Verification in application shows that analysis efficiency improves significantly.
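The abstract does not spell out the path completion algorithm, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' improved method. It assumes a hypothetical site map (`SITE_LINKS`, page to linked pages) is available. When a logged request cannot have come from the current page, the sketch walks back along the forward path to the nearest earlier page that links to it and re-inserts the pages passed on the way back, since those were probably served from the browser cache and never reached the server log.

```python
from typing import Dict, List, Set

# Hypothetical site structure: page -> pages it links to (an assumption
# for illustration; a real system would derive this from the site itself).
SITE_LINKS: Dict[str, Set[str]] = {
    "A": {"B", "C"},
    "B": {"D"},
    "C": {"E"},
}


def complete_path(session: List[str], links: Dict[str, Set[str]]) -> List[str]:
    """Return the session with probable cached back-navigation re-inserted.

    session: pages in the order the server logged them.
    links:   site map used to decide which page could have referred a request.
    """
    if not session:
        return []

    completed = [session[0]]
    forward_path = [session[0]]  # current click path from the entry page

    for page in session[1:]:
        # Search backwards for the nearest page on the path that links to `page`.
        referrer_index = None
        for i in range(len(forward_path) - 1, -1, -1):
            if page in links.get(forward_path[i], set()):
                referrer_index = i
                break

        if referrer_index is None:
            # No known referrer: treat the request as a new entry point.
            forward_path = [page]
            completed.append(page)
            continue

        # Re-insert the pages the user stepped back through (cache hits).
        for back in range(len(forward_path) - 2, referrer_index - 1, -1):
            completed.append(forward_path[back])

        # Truncate the path at the referrer and continue forward from it.
        del forward_path[referrer_index + 1:]
        forward_path.append(page)
        completed.append(page)

    return completed


if __name__ == "__main__":
    # Logged requests A, B, D, C: page C is only reachable from A, so the
    # user must have gone back D -> B -> A before clicking C.
    print(complete_path(["A", "B", "D", "C"], SITE_LINKS))
```

Searching from the end of the forward path means the closest possible referrer is chosen first, which keeps the number of inserted pages as small as the link structure allows.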
Source
Computer Technology and Development (《计算机技术与发展》)
2007, No. 1, pp. 239-241, 244 (4 pages in total)
Funding
Provincial Natural Science Research Project of Anhui Higher Education Institutions (2005KJ065)
Keywords
data preprocessing
user session
transaction
association rules
pattern mining