A Credibility-Based Link Analysis Algorithm for Demoting Web Spam

Use trust based link analysis algorithm to degrade web spam
Abstract: This paper studies algorithms for demoting web spam. Building on an analysis of the strengths and weaknesses of TrustRank and related algorithms, it proposes the concept of time credibility (TimeCredit) to describe a page's credibility at different points in time, and introduces the CreditRank algorithm to compute that credibility. It also introduces the LinkRank algorithm, which computes link-based page quality by integrating a page's authority and credibility. Qualitative analysis and experimental results indicate that CreditRank extends the range over which seeds can be used and improves the coverage (recall) of anti-spam algorithms, while LinkRank resolves the "innocent page" problem and is effective at demoting spam.
Source: Computer Engineering and Design (《计算机工程与设计》), indexed in CSCD and the Peking University Core Journals list, 2009, No. 10, pp. 2350-2353 (4 pages)
Keywords: trust; TrustRank; LinkRank; CreditRank; web spam
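
The abstract does not give the CreditRank or LinkRank formulas, so they are not reproduced here. For orientation only, the TrustRank baseline that the paper analyzes can be sketched as a seed-biased PageRank: trust starts at a hand-picked set of good seed pages and is propagated along out-links with a damping factor. The function name `trust_rank`, the parameters `alpha` and `iterations`, and the toy graph are assumptions made for illustration; they are not taken from the paper.

```python
def trust_rank(out_links, seeds, alpha=0.85, iterations=20):
    """Seed-biased PageRank in the style of TrustRank (illustrative sketch)."""
    # Collect every page that appears as a link source or a link target.
    pages = sorted(set(out_links) | {q for ts in out_links.values() for q in ts})
    # Static trust distribution d: uniform over the seeds, zero elsewhere.
    d = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    trust = dict(d)
    for _ in range(iterations):
        # Each round, a (1 - alpha) share of trust is re-injected at the seeds.
        nxt = {p: (1.0 - alpha) * d[p] for p in pages}
        for p in pages:
            targets = out_links.get(p, [])
            if not targets:
                continue  # simplification: trust held by dangling pages is dropped
            share = alpha * trust[p] / len(targets)
            for q in targets:
                nxt[q] += share  # split the damped trust evenly over out-links
        trust = nxt
    return trust


# Hypothetical toy graph: "a" is a trusted seed; "s1" and "s2" form a small
# link farm that no trusted page links to.
toy_links = {"a": ["b"], "b": ["c"], "s1": ["s2"], "s2": ["s1"]}
scores = trust_rank(toy_links, seeds={"a"})
```

In this toy graph the two link-farm pages receive no trust because nothing reachable from the seed links to them, while trust decays with distance from the seed along a, b, c. Pages far from any seed are scored low in the same way whether or not they are spam, which is the kind of limited seed coverage the abstract says CreditRank is meant to address.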