Abstract
Data preprocessing plays a vital role in Web log mining and directly affects the quality and results of the mining. This paper analyzes the main steps of data preprocessing and improves session identification by combining the site home page with a dynamic time threshold. Experimental results show that the improved method identifies users' real sessions more effectively.
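The abstract does not give the details of the improved session identification, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that a single user's requests are already isolated, that a return to the home page marks the start of a new session, and that the time threshold adapts to the average gap between requests in the current session. The names `Request`, `HOME_PAGES`, `dynamic_threshold`, `split_sessions`, and all numeric defaults are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    timestamp: float  # seconds; any monotonic time unit works
    url: str


# Hypothetical set of URLs treated as the site home page.
HOME_PAGES = {"/", "/index.html"}


def dynamic_threshold(gaps: List[float], base: float = 600.0,
                      factor: float = 3.0, lo: float = 60.0,
                      hi: float = 1800.0) -> float:
    """Assumed rule: scale the mean gap seen so far in the session,
    clamped to [lo, hi]; fall back to `base` when no gap is known yet."""
    if not gaps:
        return base
    mean_gap = sum(gaps) / len(gaps)
    return max(lo, min(hi, factor * mean_gap))


def split_sessions(requests: List[Request]) -> List[List[Request]]:
    """Split one user's requests into sessions.

    A new session starts when the gap to the previous request exceeds
    the dynamic threshold, or when the user requests the home page again.
    """
    sessions: List[List[Request]] = []
    current: List[Request] = []
    gaps: List[float] = []
    for req in sorted(requests, key=lambda r: r.timestamp):
        if current:
            gap = req.timestamp - current[-1].timestamp
            timed_out = gap > dynamic_threshold(gaps)
            back_home = req.url in HOME_PAGES
            if timed_out or back_home:
                sessions.append(current)
                current, gaps = [], []
            else:
                gaps.append(gap)
        current.append(req)
    if current:
        sessions.append(current)
    return sessions
```

A fixed timeout (often 30 minutes in the literature) treats fast and slow readers alike; the clamped, gap-based threshold here is one possible way to make the cut-off follow the user's own browsing pace, while the home-page rule separates visits that a timeout alone would merge.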
Source
《科学技术与工程》
PKU Core Journal (北大核心)
2012, Issue 8, pp. 1928-1930, 1935 (4 pages)
Science Technology and Engineering