Abstract
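Social annotation, a new way of managing and sharing resources, has attracted a large number of users, and the resulting mass of social annotation data has become a new dimension for evaluating Web page quality. This paper studies how to use social annotations to improve Web page retrieval performance and proposes a page ranking algorithm that organically combines the query relevance of pages and users with the mutual reinforcement relation between them. First, a statistical topic model is used to model pages and users through their associated tags and to compute query relevance. Next, a bipartite graph model captures the mutual reinforcement relation between pages and users, with each relation weighted by how well the associated tags match the user's interests and the page's content. Finally, query relevance and mutual reinforcement are combined to compute the scores of pages and users simultaneously in an iterative fashion. Experimental results show that the proposed retrieval model and mutual reinforcement model effectively improve the performance of the ranking algorithm. Compared with current representative algorithms, the proposed algorithm achieves clearly better retrieval performance.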
With the rapid development of social tagging systems, large crowds of collaborating users have created a large number of social annotations, which form a new dimension for assessing the quality of Web pages. This paper proposes a novel page ranking algorithm for improving Web search performance. The algorithm exploits social annotations by combining the language models of pages and users with the mutual reinforcement between pages and users. The authors develop a probabilistic generative model to describe how users tag resources, and model the mutual reinforcement relation between pages and users with a bipartite graph. Each mutual reinforcement relation is assigned a weight representing the coherence between the annotating tags and the language models of the page and the user. The importance of pages and users is then computed simultaneously, in an iterative fashion, based on both query relevance and mutual reinforcement. Experiments on a dataset collected from a real-world social tagging system show that the query model and the mutual reinforcement model developed in this paper effectively improve the performance of the ranking algorithm, which outperforms other state-of-the-art algorithms in retrieval performance as measured by MAP and NDCG.
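The abstract does not give the paper's update equations, so the following is only a minimal sketch of the general idea: a HITS-style iteration on a weighted user-page bipartite graph, mixed with query-relevance priors. The function names, the `damping` and `iterations` parameters, and the linear mixing rule are all assumptions made for illustration, and the relevance scores and edge weights are taken as given rather than derived from a topic model.

```python
from typing import Dict, List, Tuple

Scores = Dict[str, float]


def normalize(scores: Scores) -> Scores:
    """Scale scores to sum to 1; fall back to uniform if all are zero."""
    if not scores:
        return {}
    total = sum(scores.values())
    if total == 0:
        return {key: 1.0 / len(scores) for key in scores}
    return {key: value / total for key, value in scores.items()}


def mutual_reinforcement_rank(
    edges: List[Tuple[str, str, float]],  # (user, page, weight) annotations
    page_relevance: Scores,               # query relevance of each page
    user_relevance: Scores,               # query relevance of each user
    damping: float = 0.5,                 # hypothetical mixing parameter
    iterations: int = 50,                 # hypothetical fixed iteration count
) -> Tuple[Scores, Scores]:
    """Score pages and users jointly (illustrative, not the paper's formula).

    Every user and page named in `edges` must also appear in the
    corresponding relevance dictionary.
    """
    page_prior = normalize(page_relevance)
    user_prior = normalize(user_relevance)
    page_score, user_score = dict(page_prior), dict(user_prior)

    for _ in range(iterations):
        # Propagate scores across the bipartite graph in both directions.
        from_users = {page: 0.0 for page in page_prior}
        from_pages = {user: 0.0 for user in user_prior}
        for user, page, weight in edges:
            from_users[page] += weight * user_score[user]
            from_pages[user] += weight * page_score[page]
        from_users = normalize(from_users)
        from_pages = normalize(from_pages)

        # Mix the propagated score with the query-relevance prior.
        page_score = {
            page: damping * page_prior[page] + (1 - damping) * from_users[page]
            for page in page_prior
        }
        user_score = {
            user: damping * user_prior[user] + (1 - damping) * from_pages[user]
            for user in user_prior
        }
    return page_score, user_score


example_edges = [("u1", "p1", 1.0), ("u2", "p1", 1.0), ("u1", "p2", 0.5)]
pages, users = mutual_reinforcement_rank(
    example_edges,
    page_relevance={"p1": 1.0, "p2": 1.0, "p3": 1.0},
    user_relevance={"u1": 1.0, "u2": 1.0},
)
```

In this toy example all pages start with equal query relevance, so the final ordering is driven by the annotations: the page tagged by both users ends up ahead of the page tagged once with a lower weight, and the untagged page keeps only its share of the prior.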
Source
《计算机学报》
EI
CSCD
PKU Core (Peking University Core Journals)
2010, Issue 6, pp. 1014-1023 (10 pages)
Chinese Journal of Computers
Funding
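National Natural Science Foundation of China (60703014, 60933005)
National Basic Research Program of China, "973" Program (G2007CB311100)
National High Technology Research and Development Program of China, "863" Program (2006AA010105-02, 2007AA01Z416, 2007AA01Z442, 2009AA01Z437)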
Keywords
social annotations
page retrieval
page quality
ranking algorithm
topic models