Research and Build of Data Preprocessing Model in Web Data Mining
(Web日志数据挖掘中数据预处理模型的研究与建立)
Cited by: 9
Abstract: The quality of the data directly affects the results of data mining, so data preprocessing is the first step in Web log mining. Building on several previously proposed data preprocessing methods, this article proposes a general data preprocessing model for Web log mining. The model divides the preprocessing of Web log data into five steps: data cleaning, user identification, session identification, path completion, and formatting. The model was validated on a tourism website; the results show that it is feasible and that it offers good generality and extensibility.
Source: Modern Electronics Technique (《现代电子技术》), 2007, No. 4, pp. 103-105 (3 pages)
Keywords: Web log mining; data mining; data preprocessing; user identification; session identification
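
The five steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, which the abstract does not describe. It assumes the Apache combined log format, identifies users by the (IP address, user agent) pair, uses a 30-minute session timeout, and completes paths by replaying "back" navigation inferred from the referrer. All names (`clean`, `identify_users`, `identify_sessions`, `complete_path`, `preprocess`, `SAMPLE`) and the `example.com` host are hypothetical.

```python
import re
from datetime import datetime, timedelta
from urllib.parse import urlsplit

# Assumption: Apache combined log format (the abstract names no format).
LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] "(?P<method>\S+) (?P<url>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<ref>[^"]*)" "(?P<agent>[^"]*)"'
)
NOISE = (".gif", ".jpg", ".png", ".css", ".js", ".ico")  # assumed non-page suffixes
TIMEOUT = timedelta(minutes=30)  # assumed session timeout


def clean(lines):
    """Step 1, data cleaning: keep successful GET requests for pages only."""
    for line in lines:
        m = LINE.match(line)
        if not m or m["method"] != "GET" or m["status"] != "200":
            continue
        url = m["url"].split("?")[0]
        if url.lower().endswith(NOISE):
            continue
        ts = datetime.strptime(m["ts"], "%d/%b/%Y:%H:%M:%S %z")
        # Reduce the referrer to its path so it is comparable with request URLs.
        yield {"ip": m["ip"], "agent": m["agent"], "ts": ts,
               "url": url, "ref": urlsplit(m["ref"]).path}


def identify_users(records):
    """Step 2, user identification: group by the (IP, user agent) heuristic."""
    users = {}
    for r in records:
        users.setdefault((r["ip"], r["agent"]), []).append(r)
    return users


def identify_sessions(records):
    """Step 3, session identification: split on gaps longer than TIMEOUT."""
    sessions, current = [], []
    for r in sorted(records, key=lambda r: r["ts"]):
        if current and r["ts"] - current[-1]["ts"] > TIMEOUT:
            sessions.append(current)
            current = []
        current.append(r)
    if current:
        sessions.append(current)
    return sessions


def complete_path(session):
    """Step 4, path completion: re-insert pages probably served from cache."""
    path = []
    for r in session:
        ref = r["ref"]
        if path and ref in path and path[-1] != ref:
            # The user probably navigated back to ref; replay the backtrack.
            idx = len(path) - 1 - path[::-1].index(ref)
            path.extend(reversed(path[idx:-1]))
        path.append(r["url"])
    return path


def preprocess(lines):
    """Step 5, formatting: one (user_id, session_id, path) row per session."""
    rows = []
    for uid, recs in enumerate(identify_users(clean(lines)).values(), start=1):
        for sid, session in enumerate(identify_sessions(recs), start=1):
            rows.append((uid, sid, complete_path(session)))
    return rows


# Hypothetical sample log for a tourism site.
SAMPLE = [
    '1.2.3.4 - - [10/Oct/2006:10:00:00 +0800] "GET /index.html HTTP/1.1" 200 100 "-" "Mozilla"',
    '1.2.3.4 - - [10/Oct/2006:10:00:01 +0800] "GET /logo.gif HTTP/1.1" 200 100 "http://example.com/index.html" "Mozilla"',
    '1.2.3.4 - - [10/Oct/2006:10:01:00 +0800] "GET /tours.html HTTP/1.1" 200 100 "http://example.com/index.html" "Mozilla"',
    '1.2.3.4 - - [10/Oct/2006:10:02:00 +0800] "GET /hotels.html HTTP/1.1" 200 100 "http://example.com/index.html" "Mozilla"',
    '1.2.3.4 - - [10/Oct/2006:12:00:00 +0800] "GET /index.html HTTP/1.1" 200 100 "-" "Mozilla"',
]
```

Calling `preprocess(SAMPLE)` drops the image request, splits the visitor's requests into two sessions at the two-hour gap, and inserts a second `/index.html` before `/hotels.html`, because that request's referrer indicates the visitor returned to the index page from a cached copy. Real logs may call for different heuristics, such as cookies or login names for user identification.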
