Abstract
This paper studies algorithms for demoting web spam. After analyzing the strengths and weaknesses of TrustRank and related algorithms, it proposes the concept of TimeCredit (time-dependent credibility) to describe a page's credibility at different times, and introduces the CreditRank algorithm to compute that credibility. It also introduces LinkRank, a link-based algorithm for computing page quality that integrates authority and credibility. Qualitative analysis and experimental results show that CreditRank extends the range over which seeds can be used and improves the coverage (recall) of anti-spam algorithms, and that LinkRank resolves the "innocent page" problem and is highly effective at demoting spam.
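For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of the general TrustRank idea: trust starts at a hand-picked set of seed pages and is propagated along out-links with a damping factor, in the manner of a seed-biased PageRank. It is not the paper's CreditRank or LinkRank, whose details the abstract does not give. The function name `trustrank`, the parameter defaults, and the toy graph are assumptions made for illustration only.

```python
def trustrank(out_links, seeds, damping=0.85, iterations=50):
    """Illustrative seed-biased trust propagation (assumed parameters)."""
    # Collect every page that appears as a source or a target.
    pages = set(out_links)
    for targets in out_links.values():
        pages.update(targets)
    pages = sorted(pages)

    # Seed distribution: uniform over trusted seeds, zero elsewhere.
    d = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    trust = dict(d)

    for _ in range(iterations):
        # Each round restarts a (1 - damping) share of trust at the seeds.
        nxt = {p: (1.0 - damping) * d[p] for p in pages}
        for src in pages:
            targets = out_links.get(src, [])
            if not targets:
                continue  # dangling page: its trust is not passed on
            share = damping * trust[src] / len(targets)
            for dst in targets:
                nxt[dst] += share
        trust = nxt
    return trust


# Hypothetical toy graph: a trusted pair and an isolated spam pair.
graph = {
    "seed": ["good"],
    "good": ["seed"],
    "spam1": ["spam2"],
    "spam2": ["spam1"],
}
scores = trustrank(graph, seeds={"seed"})
```

In this toy graph the two spam pages receive no trust at all because no path leads to them from the seed. The same property means a legitimate page that is far from every seed is also scored low, which is the limited seed coverage the abstract says CreditRank is meant to improve.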
Source
《计算机工程与设计》
CSCD
Peking University Core Journals (北大核心)
2009, No. 10, pp. 2350-2353 (4 pages)
Computer Engineering and Design