
Using Topic Content Ranking for Pseudo Relevance Feedback
Abstract: Traditional pseudo relevance feedback (PRF) methods use the whole document as the basic extraction unit for query expansion. This coarse granularity increases the amount of noise in the expansion source. This paper applies topic analysis techniques to alleviate the low quality of the expansion source. It captures the semantic information hidden in the content of each document in the pseudo-relevant set and extracts the abstract topic content that is relevant to the user query, which then serves as the basic extraction unit for query expansion. Experiments on the NTCIR 8 Chinese corpus, comparing against traditional PRF methods and topic-model-based PRF methods, show that the proposed method extracts expansion terms that better match the user query. The results also show that performing query expansion at the finer granularity of topic content effectively improves retrieval performance.
Authors: YAN Rong (闫蓉), GAO Guanglai (高光来) (College of Computer Science, Inner Mongolia University, Hohhot 010021, China)
Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), CSCD, Peking University Core Journal, 2017, No. 5, pp. 814-821 (8 pages)
Funding: National Natural Science Foundation of China, No. 61263037; Natural Science Foundation of Inner Mongolia, Nos. 2014BS0604 and 2014MS0603
Keywords: topic model; topic content; pseudo relevance feedback (PRF)
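
The idea in the abstract, ranking topics by their relevance to the query and drawing expansion terms from the top-ranked topic content rather than from whole documents, can be illustrated with a minimal sketch. It assumes the topic-word distributions have already been estimated from the pseudo-relevant set (for example with a topic model such as LDA). The scoring rule used here (product of the query terms' probabilities under each topic) is an illustrative assumption, not necessarily the ranking function used in the paper, and all names and the toy data are hypothetical.

```python
from collections import Counter

def rank_topics(query_terms, topic_word_probs, floor=1e-6):
    """Order topic ids by an assumed relevance score: the product of the
    query terms' probabilities under each topic (floor for unseen terms)."""
    scores = {}
    for topic_id, word_probs in topic_word_probs.items():
        score = 1.0
        for term in query_terms:
            score *= word_probs.get(term, floor)
        scores[topic_id] = score
    return sorted(scores, key=scores.get, reverse=True)

def expansion_terms(query_terms, topic_word_probs, top_topics=1, top_terms=3):
    """Pick expansion terms from the top-ranked topics only, skipping
    terms that already appear in the query."""
    ranked = rank_topics(query_terms, topic_word_probs)[:top_topics]
    weights = Counter()
    for topic_id in ranked:
        for word, prob in topic_word_probs[topic_id].items():
            if word not in query_terms:
                weights[word] += prob
    return [word for word, _ in weights.most_common(top_terms)]

# Hypothetical topic-word distributions from a pseudo-relevant set.
topics = {
    "t0": {"retrieval": 0.30, "query": 0.25, "feedback": 0.20,
           "ranking": 0.15, "index": 0.10},
    "t1": {"football": 0.40, "league": 0.30, "goal": 0.25, "query": 0.05},
}
query = ["query", "retrieval"]

print(rank_topics(query, topics))
print(expansion_terms(query, topics, top_topics=1, top_terms=2))
```

Restricting extraction to the top-ranked topic keeps terms from the off-topic distribution (`t1` in the toy data) out of the expanded query, which is the noise-reduction effect the abstract attributes to a finer extraction granularity.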