Abstract
Query expansion is an important technique for improving recall in information retrieval. This paper expands query terms by combining the respective strengths of Wikipedia and search engines, aiming to improve the precision of the top N retrieval results. Wikipedia articles contain a large number of hyperlinks, and the linked entries are closely related to the article's topic; Wikipedia-based expansion is achieved by extracting these entries. Using query expansion based on search-engine pseudo relevance feedback (PRF) as the baseline, the experiments evaluate a monolingual expansion system and an English-Chinese cross-language expansion system. The results show that, compared with the baseline, the proposed method improves MAP by 6.41% in the monolingual system and Top10-precision by 10.90% in the cross-language system.
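The hyperlink-based expansion described in the abstract might be sketched as follows. This is only an illustration under stated assumptions, not the paper's implementation: the function names `extract_link_terms` and `expand_query` are hypothetical, the input is assumed to be raw wikitext with `[[Target|label]]` links, and ranking candidate terms by link frequency is an assumption, since the abstract does not say how extracted entries are selected.

```python
import re
from collections import Counter

# Matches [[Target]], [[Target|label]] and [[Target#Section|label]];
# only the link target (the linked entry title) is captured.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")


def extract_link_terms(wikitext):
    """Return the titles of entries linked from a piece of wikitext."""
    terms = []
    for match in WIKILINK.finditer(wikitext):
        target = match.group(1).strip()
        # Skip namespace links such as File: or Category: (assumed irrelevant).
        if not target or ":" in target:
            continue
        terms.append(target)
    return terms


def expand_query(query, wikitext, max_terms=5):
    """Append the most frequently linked entries to the original query.

    Frequency ranking is an assumption made for this sketch only.
    """
    counts = Counter(extract_link_terms(wikitext))
    query_lower = query.lower()
    expansion = [t for t, _ in counts.most_common() if t.lower() != query_lower]
    return [query] + expansion[:max_terms]


sample = (
    "[[Information retrieval]] systems often use [[Search engine|engines]]. "
    "See [[Information retrieval|IR]], [[File:a.png|thumb]] and [[Query expansion]]."
)
expanded = expand_query("query expansion", sample, max_terms=2)
```

In a full system the expanded term list would be submitted to the retrieval engine and compared against the PRF baseline; that step is outside this sketch.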
Source
Computer Knowledge and Technology (《电脑知识与技术》)
2011, Issue 2X, pp. 1217-1221 (5 pages)
Funding
National Natural Science Foundation of China (Grant No. 60970057)
Keywords
query expansion
Wikipedia
search engine
pseudo relevance feedback
cross-language information retrieval