Abstract
Query expansion must solve two problems: where the expansion terms come from, and how to select expansion terms from that source. To address these, this paper first uses XML search result clustering and a ranking model to obtain a high-quality set of relevant documents, which serves as the expansion source. It then exploits the characteristics of XML documents and selects expansion terms using local co-occurrence between terms. The experimental results show two things. First, the relevant document set obtained through search result clustering and the ranking model is highly relevant to the user's query and is of higher quality than the expansion source used in traditional pseudo-relevance feedback. Second, the proposed word co-occurrence expansion scheme, which incorporates XML structural features, yields expansion information that matches the user's query intent; compared with the original query and with structure-agnostic term expansion, it improves search engine retrieval performance more effectively.
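The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea of local word co-occurrence weighted by XML structure, not the authors' method. It assumes the expansion source (the relevant documents from clustering and ranking) has already been obtained and is passed in as `xml_docs`. The tag weights in `TAG_WEIGHTS`, the window size, the distance-based score, and the function names are all hypothetical choices made for illustration.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical structural weights: co-occurrence inside a title counts more
# than co-occurrence inside body text. These values are assumptions.
TAG_WEIGHTS = {"title": 3.0, "abstract": 2.0, "p": 1.0}
DEFAULT_WEIGHT = 1.0


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def expansion_scores(xml_docs, query_terms, window=5):
    """Score candidate terms by structure-weighted local co-occurrence.

    A candidate gains weight / distance each time it appears within
    `window` tokens of a query term inside the same XML element.
    Only the text directly inside each element (elem.text) is used;
    tail text after child elements is ignored in this sketch.
    """
    query = {q.lower() for q in query_terms}
    scores = defaultdict(float)
    for doc in xml_docs:
        root = ET.fromstring(doc)
        for elem in root.iter():
            weight = TAG_WEIGHTS.get(elem.tag, DEFAULT_WEIGHT)
            tokens = tokenize(elem.text or "")
            for i, tok in enumerate(tokens):
                if tok not in query:
                    continue
                lo = max(0, i - window)
                hi = min(len(tokens), i + window + 1)
                for j in range(lo, hi):
                    cand = tokens[j]
                    if j != i and cand not in query:
                        scores[cand] += weight / abs(j - i)
    return dict(scores)


def expand_query(xml_docs, query_terms, k=3, window=5):
    """Return the original query terms followed by the top-k candidates."""
    scores = expansion_scores(xml_docs, query_terms, window)
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return list(query_terms) + [term for term, _ in ranked[:k]]


if __name__ == "__main__":
    docs = [
        "<article><title>xml retrieval</title><p>query logs</p></article>",
        "<article><p>xml index</p></article>",
    ]
    print(expand_query(docs, ["xml"], k=2))
```

In this sketch a term that co-occurs with the query inside a `title` element outranks one that co-occurs only in a `p` element, which is one simple way to let document structure influence term selection.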
Source
Computer Science (《计算机科学》)
CSCD
PKU Core Journal
2014, Issue 4, pp. 200-204, 214 (6 pages)
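Funding
Supported by the National Natural Science Foundation of China (61173146, 61262035, 61363039, 71361012), the National Social Science Foundation of China (12CTQ042), and the Science and Technology Project of the Jiangxi Provincial Department of Education (GJJ11729, GJJ12734).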
Keywords
XML query expansion
Expansion source
Word co-occurrence model
XML structural feature