Abstract
The web pages a user visits, together with the time spent on each page, reveal that user's interests. To make the similarity or dissimilarity between any two access patterns easier to measure, each user access pattern extracted from the web logs is transformed into a fuzzy vector of equal length. Each element is either 0 or a fuzzy linguistic variable, indicating whether the user visited the corresponding page and how long the user stayed on it. Because cluster boundaries may be fuzzy rather than crisp, a rough k-means algorithm is used to cluster the fuzzy vectors that represent users' browsing behavior. Finally, the Davies-Bouldin index is used to measure clustering quality.
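The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. Several details are assumptions, since the abstract does not specify them: the linguistic labels are replaced by the hypothetical numeric stand-ins in `LABELS`, the assignment rule and the weights `w_lower` and `ratio` follow a common rough k-means formulation, and the Davies-Bouldin scatter is computed over each cluster's upper approximation.

```python
import math

# Hypothetical numeric stand-ins for linguistic duration labels (assumption).
LABELS = {"short": 1.0, "medium": 2.0, "long": 3.0}

def to_vector(session, pages):
    """Map {page: label} to a fixed-length vector; 0 means 'not visited'."""
    return [LABELS[session[p]] if p in session else 0.0 for p in pages]

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def assign(vectors, centroids, ratio):
    """Return per-cluster index lists: (lower approximation, boundary region)."""
    k = len(centroids)
    lower = [[] for _ in range(k)]
    boundary = [[] for _ in range(k)]
    for i, v in enumerate(vectors):
        d = [dist(v, c) for c in centroids]
        best = min(range(k), key=d.__getitem__)
        # Clusters almost as close as the nearest one share the object.
        near = [j for j in range(k) if j != best and d[j] <= ratio * d[best]]
        if near:
            for j in [best] + near:
                boundary[j].append(i)
        else:
            lower[best].append(i)
    return lower, boundary

def rough_k_means(vectors, centroids, w_lower=0.7, ratio=1.3, iters=50):
    for _ in range(iters):
        lower, boundary = assign(vectors, centroids, ratio)
        updated = []
        for j, old in enumerate(centroids):
            lo = [vectors[i] for i in lower[j]]
            bd = [vectors[i] for i in boundary[j]]
            if lo and bd:
                # Weighted mix of the certain members and the boundary members.
                updated.append([w_lower * a + (1.0 - w_lower) * b
                                for a, b in zip(mean(lo), mean(bd))])
            elif lo or bd:
                updated.append(mean(lo or bd))
            else:
                updated.append(old)  # empty cluster: keep the old centroid
        if updated == centroids:
            break
        centroids = updated
    lower, boundary = assign(vectors, centroids, ratio)
    return centroids, lower, boundary

def davies_bouldin(vectors, centroids, lower, boundary):
    """DB index; scatter uses the upper approximation (assumption).

    Assumes every cluster is non-empty and no two centroids coincide.
    """
    k = len(centroids)
    scatter = []
    for j in range(k):
        members = lower[j] + boundary[j]
        scatter.append(sum(dist(vectors[i], centroids[j]) for i in members)
                       / len(members))
    worst = [max((scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
                 for j in range(k) if j != i) for i in range(k)]
    return sum(worst) / k

# Toy sessions over four hypothetical pages.
pages = ["a", "b", "c", "d"]
sessions = [
    {"a": "long", "b": "medium"},
    {"a": "long", "b": "long"},
    {"c": "long", "d": "medium"},
    {"c": "medium", "d": "long"},
    {"b": "medium", "c": "medium"},
]
vectors = [to_vector(s, pages) for s in sessions]
centroids, lower, boundary = rough_k_means(vectors, [vectors[0], vectors[2]])
db = davies_bouldin(vectors, centroids, lower, boundary)
print(lower, boundary, db)
```

In this toy data the last session overlaps both groups of pages, so it lands in the boundary region of both clusters rather than being forced into one. A lower Davies-Bouldin value indicates more compact, better separated clusters.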
Source
《系统工程理论与实践》
EI
CSCD
Peking University Core Journals (北大核心)
2007, No. 7, pp. 116-121 (6 pages)
Systems Engineering-Theory & Practice
Funding
Natural Science Foundation of Shanxi Province (2006011039)