Abstract
Microblog search ranking has been an active research topic in recent years. For any given topic, the content producers can easily number in the thousands or more, and the microblogs they generate are more numerous still, which makes ranking the microblogs returned by a keyword search considerably harder. To address this, the paper proposes a topic-based method for computing user authority, a WordNet-based method for measuring the semantic similarity of content, and an LDA-based method that associates the input keywords and the recalled microblogs with the topics they belong to. A supervised learning-to-rank method then learns a ranking strategy from these factors. The proposed scheme is validated on a real data set with respect to user topical authority, microblog content semantic similarity, and the combined ranking factors.
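As a rough illustration of how the three factors named in the abstract could be combined by a learning-to-rank step, the sketch below trains a linear scoring function with a pairwise logistic loss. It is not the authors' implementation: the abstract does not say which learning-to-rank algorithm was used, and the feature layout (topical authority, semantic similarity, topic relevance), the function names, and all numeric values are hypothetical and chosen only for demonstration.

```python
import math
import random


def score(weights, features):
    """Linear ranking score: weighted sum of the feature values."""
    return sum(w * x for w, x in zip(weights, features))


def train_pairwise(queries, epochs=200, lr=0.1, seed=0):
    """Fit weights so that, within each query, more relevant items score higher.

    queries: list of queries; each query is a list of (features, relevance) pairs.
    """
    rng = random.Random(seed)
    n_features = len(queries[0][0][0])
    weights = [0.0] * n_features

    # Build (better, worse) feature pairs from items of the same query.
    pairs = []
    for docs in queries:
        for feats_a, rel_a in docs:
            for feats_b, rel_b in docs:
                if rel_a > rel_b:
                    pairs.append((feats_a, feats_b))

    for _ in range(epochs):
        rng.shuffle(pairs)
        for better, worse in pairs:
            diff = [a - b for a, b in zip(better, worse)]
            margin = score(weights, diff)
            # Gradient step on log(1 + exp(-margin)); the clamp avoids overflow.
            g = 1.0 / (1.0 + math.exp(min(margin, 50.0)))
            weights = [w + lr * g * d for w, d in zip(weights, diff)]
    return weights


def rank(weights, candidates):
    """Return candidate feature vectors sorted by descending score."""
    return sorted(candidates, key=lambda f: score(weights, f), reverse=True)


# Hypothetical toy data. Each feature vector is
# [topical_authority, semantic_similarity, topic_relevance];
# the integer is a made-up relevance label (higher is better).
toy_queries = [
    [([0.9, 0.8, 0.7], 2), ([0.5, 0.4, 0.6], 1), ([0.1, 0.2, 0.1], 0)],
    [([0.8, 0.9, 0.9], 2), ([0.3, 0.5, 0.2], 1), ([0.2, 0.1, 0.3], 0)],
]

weights = train_pairwise(toy_queries)
ranked = rank(weights, [[0.1, 0.1, 0.1], [0.9, 0.9, 0.9]])
print(weights)
print(ranked)
```

In a real system the three feature values would come from the authority, WordNet similarity, and LDA components respectively. A linear pairwise model is only one of several possible learning-to-rank choices.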
Source
《山东农业大学学报(自然科学版)》
CSCD
2016, Issue 3, pp. 469-472 (4 pages)
Journal of Shandong Agricultural University:Natural Science Edition
Keywords
Microblog ranking
semantic similarity
feature fitting