Abstract
Query expansion is an active research topic in information retrieval. Its goal is to add terms related to the original query terms to the original query, so that the query describes the user's information need in more detail. This paper treats querying as a special kind of question answering: the question part and the answer part of each pair in a large Q&A archive are viewed, respectively, as a query and the web pages that satisfy its information need. Based on this idea, we propose a new query expansion method. The main contributions are: (1) using a statistical language model to mine expansion relations between words from large-scale Q&A pair data, and ranking candidate expansion terms according to these relations; and (2) proposing a new strategy for selecting expansion terms, which overcomes the shortcoming of existing query expansion methods whose selection relies on the ranking alone. Experiments on a real data set show that the proposed method outperforms traditional query expansion methods and has some practical value.
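As a rough illustration of the first contribution, the sketch below estimates how likely an answer word is to accompany a question word and ranks candidate expansion terms by that estimate. It is a minimal sketch under stated assumptions, not the paper's model: the abstract does not specify the language model, so a plain co-occurrence estimate stands in for it, and the names `expansion_probs` and `rank_candidates` are hypothetical. The paper's selection strategy (the second contribution) is not described in the abstract and is not reproduced here; the ranking-only selection shown is the baseline behaviour the paper aims to improve on.

```python
from collections import Counter, defaultdict


def expansion_probs(qa_pairs):
    """Estimate P(answer_word | question_word) from (question, answer) pairs.

    Assumption: a simple co-occurrence estimate stands in for the
    statistical language model mentioned in the abstract.
    """
    cooc = defaultdict(Counter)
    for question, answer in qa_pairs:
        q_words = set(question.lower().split())
        a_words = set(answer.lower().split())
        for q in q_words:
            for a in a_words:
                if a != q:
                    cooc[q][a] += 1
    probs = {}
    for q, counter in cooc.items():
        total = sum(counter.values())
        probs[q] = {a: c / total for a, c in counter.items()}
    return probs


def rank_candidates(query, probs, top_k=3):
    """Rank candidate expansion terms by summed expansion probability."""
    query_words = query.lower().split()
    scores = Counter()
    for q in query_words:
        for a, p in probs.get(q, {}).items():
            if a not in query_words:
                scores[a] += p
    # Sort by score (descending), then alphabetically so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_k]]


# Toy data, invented purely for illustration.
qa_pairs = [
    ("python install", "use pip"),
    ("python error", "use traceback"),
]
probs = expansion_probs(qa_pairs)
top_terms = rank_candidates("python", probs, top_k=1)
```

Here each question word gets a probability distribution over the answer words it co-occurs with, and a query's candidates are scored by summing those probabilities over the query's words.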
Source
《情报学报》
CSSCI
PKU Core (北大核心)
2012, Issue 4, pp. 407-415 (9 pages)
Journal of the China Society for Scientific and Technical Information
Funding
National Social Science Fund of China (10BTQ046)
National Key Technology R&D Program of China (2009BAK65B05)
China Postdoctoral Science Foundation (20110491139)
Keywords
query expansion
information retrieval
Q&A data
language model