
A Method for Combating Web Spam Pages
Abstract: The volume of web spam has recently increased dramatically, which greatly degrades the precision and efficiency of search engines. Combating web spam has therefore become an important problem. This paper proposes an automated two-step method for detecting spam pages that combines content-based spam detection with link-spam-based detection. The first step is a filtering step that generates a candidate list of spam pages. In the second step, an automatic classifier selects the final spam pages from the candidates produced by the filtering step.
Authors: Jiang Tao (蒋涛), Zhang Bin (张彬)
Source: Computer & Digital Engineering (《计算机与数字工程》), 2007, No. 11, pp. 76-78, 152 (4 pages in total)
Keywords: Web spam, TrustRank, link spam
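
The two-step structure described in the abstract (filter first, then classify only the candidates) can be illustrated with a minimal Python sketch. The abstract does not say which signal drives which step, so everything concrete here is an assumption made for illustration: the `Page` record, the precomputed `trust_score` (standing in for a TrustRank-style link-based score), the threshold, and the toy keyword-density classifier are all hypothetical and are not the authors' implementation.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Page:
    url: str
    trust_score: float  # hypothetical, precomputed link-based score
    text: str           # page content used by the classifier


def filter_candidates(pages: Iterable[Page], trust_threshold: float) -> List[Page]:
    """Step 1 (assumed): keep pages whose link-based score is below the threshold."""
    return [p for p in pages if p.trust_score < trust_threshold]


def keyword_density_classifier(
    page: Page,
    spam_terms: frozenset = frozenset({"cheap", "free", "casino"}),
    min_ratio: float = 0.3,
) -> bool:
    """Toy content-based classifier: flags pages dominated by a few terms.

    A stand-in for whatever trained classifier is actually used.
    """
    words = page.text.lower().split()
    if not words:
        return False
    hits = sum(1 for w in words if w in spam_terms)
    return hits / len(words) >= min_ratio


def detect_spam(
    pages: Iterable[Page],
    trust_threshold: float,
    classifier: Callable[[Page], bool],
) -> List[str]:
    """Step 2: run the classifier only on the candidates from step 1."""
    candidates = filter_candidates(pages, trust_threshold)
    return [p.url for p in candidates if classifier(p)]


pages = [
    Page("a.example", 0.90, "cheap cheap casino free"),
    Page("b.example", 0.10, "cheap casino free pills now"),
    Page("c.example", 0.05, "a research paper about databases"),
]
spam_urls = detect_spam(pages, 0.2, keyword_density_classifier)
```

In this sketch the first page is never classified because its score passes the filter, which is the practical point of a two-step design: the more expensive classifier runs only on the candidate list.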

