Abstract
Diversification of Web search results has become an important factor in improving Web search efficiency and user satisfaction. This paper formalizes the diversification problem as the maximization of facet coverage and proposes KDM, a keyword-based method for diversifying Web search results. KDM first extracts, from the result documents related to the user's query, keywords that describe the information facets those documents contain. It then computes the facet novelty of each result document from the co-occurrence of the keywords and their ability to describe the documents. Finally, it re-ranks the documents by combining novelty and relevance, so that users receive diversified search results. Experimental results indicate that KDM outperforms other existing diversification methods in diversification performance.
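The three steps named in the abstract (keyword extraction, facet novelty, re-ranking by novelty and relevance) can be illustrated with a minimal sketch. This is not the KDM method itself: the abstract gives no formulas, so everything here is an assumption made for illustration. The frequency-based `extract_keywords`, the coverage-based novelty, the linear weight `lam`, and the function name `diversify` are all hypothetical, and the keyword co-occurrence and description-ability terms that KDM uses are left out.

```python
import re
from collections import Counter


def extract_keywords(text, top_k=5):
    """Toy keyword extractor (assumption): the top_k most frequent tokens longer than 3 letters."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if len(t) > 3)
    return {word for word, _ in counts.most_common(top_k)}


def diversify(docs, relevance, lam=0.5, top_k=5):
    """Greedy re-ranking by lam * relevance + (1 - lam) * novelty.

    Novelty (assumption) is the fraction of a document's keywords that are
    not yet covered by the documents already placed in the ranking.
    Returns the document indices in their new order.
    """
    keywords = [extract_keywords(d, top_k) for d in docs]
    covered = set()
    remaining = list(range(len(docs)))
    ranking = []
    while remaining:
        def score(i):
            kws = keywords[i]
            novelty = len(kws - covered) / len(kws) if kws else 0.0
            return lam * relevance[i] + (1 - lam) * novelty

        best = max(remaining, key=score)
        ranking.append(best)
        covered |= keywords[best]  # facets of the chosen document are now covered
        remaining.remove(best)
    return ranking


docs = [
    "jaguar sports car engine speed",
    "jaguar car engine speed sports",
    "jaguar animal jungle habitat",
]
relevance = [0.9, 0.85, 0.6]
print(diversify(docs, relevance))           # diversified order
print(diversify(docs, relevance, lam=1.0))  # pure relevance order
```

With `lam=1.0` the ranking reduces to plain relevance order. Lower values promote documents whose keywords cover facets not yet shown, which is the trade-off the abstract describes.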
Source
Journal of South China University of Technology (Natural Science Edition)
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2011, No. 5, pp. 102-107 (6 pages)
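Funding
Natural Science Foundation of Guangdong Province (07006474, 9451064101003233)
Guangdong Provincial Science and Technology Key Project (2007B010200044)
Fundamental Research Funds for the Central Universities, South China University of Technology (2009ZM0125, 2009ZM0189)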
Keywords
information retrieval
keyword
retrieval result
diversification
re-ranking