Research on a new clustering algorithm of Web user communities and Web site's URLs

Cited by: 11
Abstract: A clustering algorithm for Web user communities and Web site URLs, based on Web logs, is proposed. A user transaction matrix for the Web site is built from a characterization of users' browsing behavior and a discretization of their viewing time, and Web user communities and site URLs are then clustered on this matrix. Because the clustering considers both the time users spend viewing each URL and the number of times they access it, the accuracy and efficiency of the algorithm are greatly improved. The algorithm also handles partial overlap between clusters well, which makes it more practical. Finally, the effectiveness and scalability of the algorithm are studied through experiments.
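The pipeline summarized in the abstract (build a user-by-URL matrix from the log, then cluster its rows and columns) can be illustrated with a rough sketch. The abstract does not give the paper's discretization rule, weighting scheme, or clustering criterion, so everything below is an assumption made for illustration: the sample `LOG` records, the time thresholds in `discretize`, the choice to sum one discretized level per hit, cosine similarity, and the similarity `threshold`. It is not the authors' algorithm.

```python
from collections import defaultdict

# Hypothetical log records (user, url, viewing time in seconds); not from the paper.
LOG = [
    ("u1", "/a", 5), ("u1", "/a", 40), ("u1", "/b", 120),
    ("u2", "/a", 35), ("u2", "/b", 90),
    ("u3", "/c", 200), ("u3", "/c", 15),
]


def discretize(seconds, bounds=(10, 60)):
    """Map a viewing time to a level 1..len(bounds)+1 (assumed thresholds)."""
    return 1 + sum(1 for b in bounds if seconds >= b)


def build_matrix(log):
    """Return {user: {url: weight}}.

    Each hit adds its discretized viewing-time level, so the weight reflects
    both viewing time and hit count (an assumed way to combine the two).
    """
    matrix = defaultdict(lambda: defaultdict(int))
    for user, url, seconds in log:
        matrix[user][url] += discretize(seconds)
    return {user: dict(row) for user, row in matrix.items()}


def transpose(matrix):
    """Turn {user: {url: w}} into {url: {user: w}} so URLs can be clustered."""
    result = defaultdict(dict)
    for user, row in matrix.items():
        for url, weight in row.items():
            result[url][user] = weight
    return dict(result)


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(key, 0) for key, w in a.items())
    norm_a = sum(w * w for w in a.values()) ** 0.5
    norm_b = sum(w * w for w in b.values()) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def overlapping_clusters(vectors, threshold=0.5):
    """Group each item with every item at least `threshold` similar to it.

    An item may appear in several groups, so clusters are allowed to overlap.
    """
    clusters = []
    for vec in vectors.values():
        group = frozenset(
            other for other, other_vec in vectors.items()
            if cosine(vec, other_vec) >= threshold
        )
        if group not in clusters:
            clusters.append(group)
    return clusters


if __name__ == "__main__":
    user_matrix = build_matrix(LOG)
    print("user clusters:", overlapping_clusters(user_matrix))
    print("URL clusters:", overlapping_clusters(transpose(user_matrix)))
```

Clustering the rows groups users with similar browsing behavior, and clustering the transposed matrix groups URLs visited by similar users. The neighborhood-style grouping was chosen here only because it makes overlapping clusters easy to see.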
Source: Control and Decision (《控制与决策》), indexed in EI, CSCD, and the Peking University Core Journals list; 2007, No. 3, pp. 284-288 (5 pages)
Funding: National Natural Science Foundation of China (60173058)
Keywords: Web usage mining; user browsing pattern; user access matrix; user session clustering; Web site URL clustering


