Research on a Clustering-Based Algorithm for Web User Access Patterns

Published English title: Research of the Cluster Algorithm based on Web Customer Access Model


Abstract: A user's visits to a Web site reflect that user's interest in the site's pages, and the degree of interest can be inferred from the order in which pages are browsed and the time spent on each page. By analyzing the access paths of Web users, the paper proposes a similarity measure based on browsing path and browsing time. The concept of rough approximation from rough set theory is then introduced into the Leader clustering algorithm, yielding a rough Leader clustering algorithm. Finally, experiments on a standard data set show that the rough Leader clustering algorithm, applied with this similarity measure, is effective for clustering Web users.
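The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a session is an ordered list of (page, seconds) pairs, uses a hypothetical order- and time-aware similarity (a time-weighted longest common subsequence), and assumes the rough variant uses two thresholds, `t_lower` and `t_upper`, to build lower and upper approximations around each leader. All names and threshold values are illustrative.

```python
from typing import Dict, List, Tuple

Session = List[Tuple[str, float]]  # (page, seconds spent), in visit order


def similarity(a: Session, b: Session) -> float:
    """Assumed measure in [0, 1]: time-weighted longest common subsequence.

    Pages must match in the same relative order; each matched page scores
    min(t_a, t_b) / max(t_a, t_b). The total is divided by the longer length.
    """
    if not a or not b:
        return 0.0
    n, m = len(a), len(b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best = max(score[i - 1][j], score[i][j - 1])
            (pa, ta), (pb, tb) = a[i - 1], b[j - 1]
            if pa == pb and max(ta, tb) > 0:
                best = max(best, score[i - 1][j - 1] + min(ta, tb) / max(ta, tb))
            score[i][j] = best
    return score[n][m] / max(n, m)


def rough_leader(sessions: List[Session], t_lower: float, t_upper: float):
    """Single-pass Leader clustering with rough approximations (assumed form).

    Requires t_upper <= t_lower. A session joins the lower approximation of
    its most similar leader if that similarity reaches t_lower, and the upper
    approximation of every leader whose similarity reaches t_upper. If no
    leader reaches t_upper, the session becomes a new leader.
    """
    leaders: List[int] = []
    lower: Dict[int, List[int]] = {}  # leader index -> certain members
    upper: Dict[int, List[int]] = {}  # leader index -> possible members
    for idx, s in enumerate(sessions):
        sims = {l: similarity(s, sessions[l]) for l in leaders}
        best = max(sims, key=sims.get, default=None)
        if best is None or sims[best] < t_upper:
            leaders.append(idx)
            lower[idx] = [idx]
            upper[idx] = [idx]
            continue
        if sims[best] >= t_lower:
            lower[best].append(idx)
        for l, v in sims.items():
            if v >= t_upper:
                upper[l].append(idx)
    return lower, upper


# Made-up sessions for illustration only.
sessions = [
    [("/", 10), ("/a", 30), ("/b", 20)],
    [("/", 12), ("/a", 30), ("/b", 18)],
    [("/x", 5), ("/y", 40)],
    [("/", 10), ("/a", 30), ("/y", 40)],
]
lower, upper = rough_leader(sessions, t_lower=0.8, t_upper=0.3)
print(lower)  # {0: [0, 1], 2: [2]}
print(upper)  # {0: [0, 1, 3], 2: [2, 3]}
```

In this toy run the fourth session shares an ordered prefix with the first leader and one page with the second, so it lands in both upper approximations but in neither lower approximation. That is the kind of boundary case a rough clustering is meant to expose. Like any Leader-style method, the result depends on the order in which sessions are processed.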

Authors: Guo Shuhong (郭淑红), Lei Liang (雷梁)

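Affiliations: Department of Control Science and Engineering, Huazhong University of Science and Technology; Department of Computer Science, Xinyang Agricultural College; College of Computer and Information Technology, Xinyang Normal University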

Source: Journal of Xinyang Normal University (Natural Science Edition) (《信阳师范学院学报（自然科学版）》), CAS, 2009, No. 1, pp. 137-141 (5 pages)

Funding: Science and Technology Program of the Henan Provincial Department of Education (2006520011)

Keywords: clustering; similarity; Leader algorithm; user access pattern

Classification number: TP301.66 [Automation and Computer Technology: Computer System Architecture]


References (8)

[1] Foss A, Wang W, Zaiane O R. A non-parametric approach to web log analysis [C]// Proceedings of Workshop on Web Mining in First International SIAM Conference on Data Mining, 2001.
[2] Asharaf S. A rough fuzzy approach to web usage categorization [J]. Fuzzy Sets and Systems (S 0165-0114), 2004, 148(1): 119-129.
[3] Han J W. Extensions to the k-means algorithm for clustering large data sets with categorical values [J]. Data Mining and Knowledge Discovery (S 1384-5810), 1998, 2(1): 283-304.
[4] Wang Shi, Gao Wen, Li Jintao, Xie Hui. Path clustering: knowledge discovery in Web sites [J]. Journal of Computer Research and Development (计算机研究与发展), 2001, 38(4): 482-486.
[5] Luotonen A. The common log file format [EB/OL]. [2008-05-15]. http://www.w3.org/pub/WWW/.
[6] Zhang Qiong, Zhang Ying, Bai Qingyuan, Xie Licong, Xie Huosheng. A new rough-set-based Leader clustering algorithm [J]. Computer Science (计算机科学), 2008, 35(3): 177-179.
[7] Lingras P. Interval set clustering of web users with Rough k-Means [J]. Journal of Intelligent Information Systems (S 0925-9902), 2004, 23(1): 5-16.
[8] Ma Li, Jiao Licheng, Liu Guoying. A path-clustering-based algorithm for discovering Web user access patterns [J]. Computer Science (计算机科学), 2004, 31(8): 140-141.

