Abstract
When applied to blog post ranking, the PageRank algorithm suffers from topic drift and favors old posts over new ones; in addition, posts relevant to the user's query are often not ranked near the top. To address these problems, a blog post ranking algorithm based on multi-feature fusion is proposed. Building on an analysis of the structural features of blogs themselves, the method computes a score for each post from the content similarity and structural similarity between two linked posts, together with the time freshness of the post and the popularity of the blogger. Experimental results show that the algorithm outperforms traditional blog post ranking algorithms.
Source
Computer Applications and Software (《计算机应用与软件》)
CSCD
PKU Core Journal (北大核心)
2013, Issue 7, pp. 224-227 (4 pages)
Funding
General Project of the Research Capability Enhancement Program, Gansu Lianhe University (2012YBTS05)