A Study on Anti-Cheating in Web Search Ranking Based on Link Analysis (基于链接分析的网络搜索排名的反作弊研究)
Cited by: 2
Abstract: To counter the large volume of search-ranking cheating on the web, this paper proposes a ranking algorithm that is based on link analysis and has anti-cheating capability. Starting from an initial blacklist that contains a small set of identified cheating pages, and using the link relationships between pages, the authors introduce two concepts, cheating tendency and relatedness, to measure how likely a page is to be cheating. On this basis they construct a penalty factor and use it to correct each page's PageRank value, which yields a new ranking order. The algorithm presents users with pages of relatively high authority and low cheating likelihood, improving search efficiency. The anti-cheating performance of the algorithm was tested on a data set of 3,537,379 pages and 8,456,740 links. The results indicate that, compared with the PageRank and TrustRank algorithms, the anti-cheating performance of the proposed algorithm is considerably improved.
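Source: Journal of Systems & Management (《系统管理学报》), CSSCI, 2013, No. 1, pp. 107-113 (7 pages)
Funding: National Natural Science Foundation of China (70971099); Humanities and Social Sciences Project of the Ministry of Education (05JC870013); Shanghai Key Discipline Construction Project (B310)
Keywords: ranking algorithm; link analysis; cheating tendency; penalty factor; anti-cheating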
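The pipeline summarized in the abstract (compute PageRank, estimate how likely each page is to be cheating from its links and a blacklist, then scale PageRank by a penalty factor) can be sketched roughly as follows. This is a minimal illustration, not the paper's method: the abstract does not give the formulas for cheating tendency, relatedness, or the penalty factor, so the sketch assumes a hypothetical tendency (the fraction of a page's out-links that point to blacklisted pages) and a hypothetical penalty of `1 - strength * tendency`. The function names, the `strength` parameter, and the toy graph are all invented for illustration.

```python
def all_pages(links):
    """Return every page that appears as a source or a target, in sorted order."""
    return sorted(set(links) | {q for targets in links.values() for q in targets})


def pagerank(links, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a dict {page: [out-linked pages]}."""
    pages = all_pages(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                share = damping * rank[p] / len(targets)
                for q in targets:
                    new_rank[q] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[p] / n
                for q in pages:
                    new_rank[q] += share
        rank = new_rank
    return rank


def cheating_tendency(links, blacklist):
    """ASSUMPTION (not from the paper): tendency is the fraction of a page's
    out-links that point to blacklisted pages; blacklisted pages get 1.0."""
    tendency = {}
    for p in all_pages(links):
        if p in blacklist:
            tendency[p] = 1.0
            continue
        targets = links.get(p, [])
        bad = sum(1 for q in targets if q in blacklist)
        tendency[p] = bad / len(targets) if targets else 0.0
    return tendency


def penalized_rank(links, blacklist, strength=1.0):
    """ASSUMPTION: corrected score = PageRank * (1 - strength * tendency)."""
    rank = pagerank(links)
    tendency = cheating_tendency(links, blacklist)
    return {p: rank[p] * (1.0 - strength * tendency[p]) for p in rank}


# Toy graph: "spam" and "farm" link to each other; "farm" is blacklisted.
links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "spam": ["c", "farm"],
    "farm": ["spam"],
}
scores = penalized_rank(links, blacklist={"farm"})
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 4))
```

In this toy setup the blacklisted page is pushed to a score of zero and a page that links to it loses part of its PageRank, while pages with no links into the blacklist keep their original score. A real implementation would follow the paper's own definitions, which also take relatedness into account.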