
Research on Preprocessing Techniques for Web Access Mining (cited by: 19)

Research on Data Preprocessing Technology in Web Log Mining
Abstract: Web log mining is the process of applying data mining techniques to discover and extract information from Web logs. Data preprocessing is a key step in Web log mining. This paper studies each stage of data preprocessing and introduces some special handling methods used in those stages. Based on an analysis of the Web server log format, it gives a formal description of the concept of a session. Then, building on an analysis of current session construction algorithms, it proposes a time- and referrer-based heuristic method for constructing sessions.
Source: Computer Technology and Development (《计算机技术与发展》), 2007, No. 8, pp. 11-14, 18 (5 pages)
Keywords: Web mining; Web log mining; data preprocessing; user session; session identification
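The abstract names a time- and referrer-based heuristic for session construction but does not give its details. The sketch below is therefore only an illustration of how two well-known heuristics could be combined, not the paper's algorithm. The `Request` record, the `build_sessions` function, and the 600-second `max_gap` threshold are all assumptions introduced here. A request joins the most recent session that both contains its referrer and whose last request is within `max_gap`; otherwise it starts a new session.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Request:
    """One log entry for a single user (hypothetical record layout)."""
    timestamp: float          # seconds since some epoch
    url: str                  # requested page
    referrer: Optional[str]   # referring page; None or "-" when missing


def build_sessions(requests: List[Request],
                   max_gap: float = 600.0) -> List[List[Request]]:
    """Split one user's requests into sessions.

    Assumed rule: a request is appended to the most recent session in
    which (a) the gap to that session's last request is at most max_gap
    seconds and (b) the request's referrer was already visited in that
    session. If no session qualifies, a new session is opened.
    """
    sessions: List[List[Request]] = []
    for req in sorted(requests, key=lambda r: r.timestamp):
        target = None
        # Check the most recently opened sessions first.
        for session in reversed(sessions):
            within_gap = req.timestamp - session[-1].timestamp <= max_gap
            # A missing referrer (None or "-") never matches a visited URL.
            referred = req.referrer is not None and any(
                page.url == req.referrer for page in session
            )
            if within_gap and referred:
                target = session
                break
        if target is None:
            sessions.append([req])
        else:
            target.append(req)
    return sessions


example = [
    Request(0, "/a", None),
    Request(30, "/b", "/a"),
    Request(60, "/c", "/b"),
    Request(2000, "/d", "/c"),   # referrer matches, but the gap is too long
    Request(2010, "/e", "/d"),
]
sessions = build_sessions(example)
```

In this example the first three requests form one session; `/d` opens a second session because 1940 seconds have passed since `/c`, and `/e` joins it. Requiring both conditions is a deliberately strict choice for the sketch; a looser variant could accept a request on the time condition alone when the referrer is missing.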


