Abstract
Spam pages can severely degrade link-based ranking algorithms such as PageRank, so identifying and neutralizing them has become a critical problem. This paper surveys in detail the algorithms and methods for detecting and removing spam pages, grouped into three types: general methods, opposite methods, and other methods aimed at specific situations. Finally, the key technologies for identifying spam pages, and their prospects, are discussed.
Source
Journal of Hengyang Normal University (《衡阳师范学院学报》)
2007, No. 3, pp. 93-96 (4 pages)
Funding
Hengyang Normal University Science Foundation Start-up Project (2005B11)