Abstract
This paper proposes a Web user clustering algorithm that integrates multiple evaluation factors. Starting from the mathematical characteristics of the evaluation factors, it first introduces the concepts of Web resource preference and Web resource relation. It then applies the basic principle of Kruskal's algorithm to search for frequent paths in the undirected graph formed by Web resources and Web access behavior, and clusters Web users according to the frequent paths and the thresholds on preference and relation. The algorithm improves, to some extent, the accuracy and execution efficiency of Web user clustering.
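The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes that edge weight is the number of times two resources are visited consecutively, that a "frequent" edge is one whose count reaches `min_count`, and that a user's share of visits inside one connected group of resources (`min_share`) stands in for the preference threshold. The relation threshold is not modelled. All function and parameter names are hypothetical.

```python
from collections import Counter, defaultdict


def find(parent, x):
    """Union-find lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def frequent_forest(sessions, min_count):
    """Kruskal-style pass: take edges in descending order of frequency and
    keep an edge only if it is frequent and joins two separate components."""
    # Assumption: an undirected edge links resources visited consecutively.
    edge_count = Counter()
    for pages in sessions.values():
        for a, b in zip(pages, pages[1:]):
            if a != b:
                edge_count[frozenset((a, b))] += 1

    parent = {p: p for pages in sessions.values() for p in pages}
    kept = []
    ordered = sorted(edge_count.items(), key=lambda kv: (-kv[1], sorted(kv[0])))
    for edge, count in ordered:
        if count < min_count:
            break  # every remaining edge is below the frequency threshold
        a, b = sorted(edge)
        ra, rb = find(parent, a), find(parent, b)
        if ra != rb:  # skip edges that would close a cycle
            parent[ra] = rb
            kept.append((a, b, count))
    return parent, kept


def cluster_users(sessions, min_count, min_share):
    """Assign each user to the resource component that holds the largest
    share of their visits, if that share reaches min_share."""
    parent, _ = frequent_forest(sessions, min_count)
    clusters = defaultdict(list)
    for user, pages in sessions.items():
        if not pages:
            continue
        roots = Counter(find(parent, p) for p in pages)
        root, hits = roots.most_common(1)[0]
        if hits / len(pages) >= min_share:
            clusters[root].append(user)
    return sorted(sorted(users) for users in clusters.values())


# Hypothetical access log: user -> ordered list of visited resources.
sessions = {
    "u1": ["a", "b", "c"],
    "u2": ["a", "b"],
    "u3": ["x", "y"],
    "u4": ["x", "y", "x"],
}
print(cluster_users(sessions, min_count=2, min_share=0.6))
```

Sorting edges in descending order yields a maximum spanning forest over the frequent edges, which is the mirror image of Kruskal's usual minimum spanning tree. Whether the paper uses this ordering is an assumption.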
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
Peking University Core Journal (北大核心)
2006, Issue 28, pp. 147-149, 210 (4 pages)
Keywords
evaluation factor, preference, relation, frequent path, Web user clustering