User Preference Mining Based on a Cluster Analysis Strategy (cited by: 8)

A Cluster-based Approach on Mining Text Preference
Abstract: Accurately and efficiently mining hidden user text preferences and concept vectors from a training document set is a key technique for natural language processing applications such as text information filtering and multi-document summarization. Because a training set often covers several topic categories, this paper proposes a text preference mining method based on cluster analysis. The basic idea is to cluster the training documents, analyze what the documents on the same topic have in common, and then apply feature weight adjustment and feature reduction to obtain concept vectors that represent the user's preferences on different topics. Experimental results show that the method characterizes user text preferences more precisely and is insensitive to changes in the relevance threshold. It can be combined with algorithms such as Rocchio for user interest modeling.
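The pipeline outlined in the abstract (cluster, analyze commonality within each topic, adjust weights, reduce features) can be illustrated with a minimal sketch. The abstract does not specify the clustering algorithm, weighting scheme, or reduction rule, so everything below is an assumption made for illustration: smoothed TF-IDF weights, centroid-based agglomerative clustering with a cosine threshold, a weight adjustment that scales each term by the fraction of cluster documents containing it, and a top-k cutoff for feature reduction. The names `tfidf_vectors`, `cluster`, `concept_vectors`, `threshold`, and `top_k` are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn tokenized documents into sparse TF-IDF vectors (dicts)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF (an assumed weighting, not taken from the paper).
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def centroid(vectors):
    """Mean of a list of sparse vectors."""
    total = Counter()
    for v in vectors:
        for t, w in v.items():
            total[t] += w
    return {t: w / len(vectors) for t, w in total.items()}


def cluster(vectors, threshold=0.3):
    """Merge the two most similar clusters until no pair reaches the threshold."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > 1:
        best, pair = -1.0, None
        for i in range(len(clusters)):
            ci = centroid([vectors[k] for k in clusters[i]])
            for j in range(i + 1, len(clusters)):
                cj = centroid([vectors[k] for k in clusters[j]])
                sim = cosine(ci, cj)
                if sim > best:
                    best, pair = sim, (i, j)
        if best < threshold:
            break
        i, j = pair
        clusters[i] += clusters.pop(j)  # j > i, so index i stays valid
    return clusters


def concept_vectors(docs, threshold=0.3, top_k=3):
    """Return one reduced concept vector per discovered topic cluster."""
    vectors = tfidf_vectors(docs)
    concepts = []
    for members in cluster(vectors, threshold):
        mean = centroid([vectors[k] for k in members])
        # Assumed weight adjustment: favor terms shared by more cluster members.
        support = Counter(t for k in members for t in vectors[k])
        adjusted = {t: w * support[t] / len(members) for t, w in mean.items()}
        # Assumed feature reduction: keep only the top_k heaviest terms.
        top = sorted(adjusted.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
        concepts.append(dict(top))
    return concepts


if __name__ == "__main__":
    sample = [
        ["football", "match", "goal", "team"],
        ["football", "team", "league", "goal"],
        ["stock", "market", "price", "bank"],
        ["stock", "bank", "market", "fund"],
    ]
    for vec in concept_vectors(sample):
        print(sorted(vec))
```

Each resulting concept vector could then serve as the starting profile for one topic in a Rocchio-style update, which is how the abstract suggests the method would be combined with user interest modeling.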
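Source: Application Research of Computers (《计算机应用研究》), indexed in CSCD and the Peking University Core Journals list, 2005, No. 12, pp. 21-23 (3 pages)
Funding: National Natural Science Foundation of China (60373100); National "863" Program of China (2002AA117010-09)
Keywords: preference mining; document clustering; concept vector; Rocchio algorithm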

References (6)

  • 1 Byoung-tak Zhang, Young-woo Seo. Personalized Web Document Filtering Using Reinforcement Learning[J]. Applied Artificial Intelligence, 2001, 15: 665-685.
  • 2 Bollacker K D, Lawrence S, Giles C L. Discovering Relevant Scientific Literature on the Web[J]. IEEE Intelligent Systems, 2000, 15(2): 42-47.
  • 3 Joachims T, Freitag D, Mitchell T. WebWatcher: A Tour Guide for the World Wide Web[A]. Proceedings of the International Joint Conference on Artificial Intelligence[C]. San Francisco: Morgan Kaufmann Publishers, 1997. 770-777.
  • 4 Rocchio J J. Relevance Feedback in Information Retrieval[A]. The SMART Retrieval System[M]. Prentice-Hall, 1971. 313-323.
  • 5 Strehl A, Ghosh J, Mooney R. Impact of Similarity Measures on Web-page Clustering[C]. AAAI 2000 Workshop on AI for Web Search, 2000. 58-64.
  • 6 Yunjae Jung. Design and Evaluation of Clustering Criterion for Optimal Hierarchical Agglomerative Clustering[D]. University of Minnesota.
