Abstract
To obtain accurate and effective user clusters, a keyword-based user clustering algorithm is presented. The algorithm improves on the traditional ROCK algorithm by introducing the concepts of similarity weight and average neighbors, and it defines the average number of neighbors of the users' keyword transaction sets as the criterion for the similarity of user access patterns. Without producing outlier user points, the scope of each user cluster is narrowed, so that one large user cluster is divided more precisely into several smaller ones. A similarity threshold between users is used to filter the data, which reduces the computation required for clustering. Experiments show that the algorithm effectively improves both the accuracy and the running efficiency of similar-user clustering.
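The abstract does not give the paper's formulas for similarity weight or average neighbors, so the following is only a minimal sketch of the standard ROCK notions it builds on: Jaccard similarity between keyword sets, neighbors defined by a similarity threshold, and the mean neighbor count. The function names, the sample users, and the threshold value 0.5 are all illustrative assumptions, not the authors' method.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two keyword sets (the similarity used by classic ROCK)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def neighbor_sets(users, theta):
    """Map each user to the set of other users whose similarity is at least theta.

    Assumption: a user is not counted as its own neighbor in this sketch.
    """
    neighbors = {u: set() for u in users}
    for u, v in combinations(users, 2):
        if jaccard(users[u], users[v]) >= theta:
            neighbors[u].add(v)
            neighbors[v].add(u)
    return neighbors


def average_neighbors(neighbors):
    """Mean number of neighbors per user (one plausible reading of 'average neighbors')."""
    if not neighbors:
        return 0.0
    return sum(len(n) for n in neighbors.values()) / len(neighbors)


# Hypothetical keyword transaction sets, one per user.
users = {
    "u1": {"python", "clustering", "rock"},
    "u2": {"python", "clustering", "kmeans"},
    "u3": {"clustering", "rock", "kmeans"},
    "u4": {"cooking", "travel"},
}

nbrs = neighbor_sets(users, theta=0.5)  # theta is an assumed threshold
avg = average_neighbors(nbrs)
```

In this toy data, pairs below the threshold are filtered out before any further clustering step, which is the kind of pruning the abstract credits with reducing the computation.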
Source
Computer Engineering and Design (《计算机工程与设计》)
CSCD
Peking University Core Journals (北大核心)
2012, Issue 9, pp. 3553-3557, 3568 (6 pages in total)
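Funding
National Natural Science Foundation of China (51075423, 61105045)
Funding Project for Academic Human Resources Development in Institutions of Higher Learning under the Jurisdiction of Beijing Municipality (PHR20100509, PHR201108057)
Beijing Municipal Excellent Talents Training Fund (2009D005002000009)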
Keywords
keyword
similarity weight
average neighbors
similarity
user clustering