User's interest clustering based on webpage probabilistic latent semantic information

Cited by: 2
Abstract: To mine users' interest points accurately, a probabilistic latent semantic analysis (PLSA) model is first used to project the "webpage-word" matrix vectors into a probabilistic latent semantic vector space. An "auto-selected similarity threshold" method is then proposed to obtain the similarity threshold between web pages. Finally, a hierarchical agglomerative k-medoids (HAK-medoids) algorithm, which combines flat partitioning with agglomerative hierarchical clustering, is proposed to cluster users' interest points. Experimental results show that HAK-medoids achieves better clustering than traditional partition-based algorithms. The proposed interest-point clustering technique can also improve the efficiency of personalized recommendation and search in personalized services.
Source: Computer Engineering & Science (《计算机工程与科学》), indexed in CSCD and the Peking University Core Journals list, 2014, Issue 4, pp. 765-771 (7 pages)
Funding: National Natural Science Foundation of China (61103129); Jiangsu Province Science and Technology Support Program (BE2009009)
Keywords: probabilistic latent semantic analysis; auto-selected similarity threshold; user's interest points; hierarchical agglomerative k-medoids; personalized service
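
The first step named in the abstract, projecting webpage-word vectors into a probabilistic latent semantic space, can be illustrated with a minimal sketch of the standard PLSA expectation-maximization procedure. This is not the authors' implementation: the toy count matrix, the number of topics, the iteration count, the smoothing constant and the function names `plsa` and `cosine` are all assumptions made for illustration. Cosine similarity over the per-page topic mixtures P(z|d) is likewise an assumed choice of page similarity. The record does not describe the "auto-selected similarity threshold" method or the HAK-medoids algorithm in enough detail to sketch them, so they are not attempted here.

```python
import math
import random


def plsa(counts, n_topics, n_iter=50, seed=0, eps=1e-12):
    """Fit PLSA by EM on a page-by-word count matrix (list of lists).

    Returns (p_z_d, p_w_z): the topic mixture P(z|d) for each page and
    the word distribution P(w|z) for each topic.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalize(row):
        # Tiny additive smoothing (an assumption) keeps every entry positive.
        total = sum(row) + eps * len(row)
        return [(v + eps) / total for v in row]

    # Random positive initialisation; each row sums to 1.
    p_z_d = [normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_topics)]

    for _ in range(n_iter):
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                n_dw = counts[d][w]
                if n_dw == 0:
                    continue
                # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
                joint = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                total = sum(joint)
                for z in range(n_topics):
                    expected = n_dw * joint[z] / total
                    # Accumulate expected counts for the M-step.
                    new_z_d[d][z] += expected
                    new_w_z[z][w] += expected
        # M-step: renormalise the expected counts into probabilities.
        p_z_d = [normalize(row) for row in new_z_d]
        p_w_z = [normalize(row) for row in new_w_z]
    return p_z_d, p_w_z


def cosine(u, v):
    """Cosine similarity between two topic-mixture vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


# Hypothetical toy data: 4 pages, 4 words, two loose word groups.
counts = [
    [4, 3, 0, 0],
    [3, 5, 1, 0],
    [0, 0, 4, 4],
    [0, 1, 3, 5],
]
p_z_d, p_w_z = plsa(counts, n_topics=2)

# Pairwise page similarity in the latent topic space.
sim = [[cosine(a, b) for b in p_z_d] for a in p_z_d]
```

A similarity matrix such as `sim` is the kind of input a threshold-based or medoid-based clustering step would consume; how the paper selects the threshold and merges clusters is left to the full text.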