
Blog comment filtering based on the Bayesian method and information fingerprints (cited by: 2)

Published English title: Blog's content filtering based on Bayes method and information fingerprint
Abstract: The emergence of blogs has enriched and changed the character of the Web and altered how people share information. Blog comments, a widespread form of interaction on blogs, have in turn created new problems for information supervision. After analyzing existing blog filtering systems, this paper applies the Bayesian method, which is widely used in text filtering, to blog comments, and combines it with information fingerprints to identify and filter the advertising robots that are common in blog comments. It also analyzes, discusses, and experimentally compares the fingerprint functions that affect both the filtering quality and the execution speed of blog comment filtering. The experimental results show that blog comment filtering based on the combination of the Bayesian method and information fingerprints is effective and that, compared with the Bayesian method alone, it does more to improve system running efficiency and to detect advertising robots.
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2008, No. 24, pp. 159-161, 180 (4 pages)
Keywords: blog; Bayes; comments; information fingerprint
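The combination described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the class name `CommentFilter`, the `repeat_threshold` parameter, the regex tokenizer, and the use of SHA-256 as the fingerprint function are all assumptions made for illustration, since this record does not specify the authors' fingerprint functions, word segmenter, or thresholds. The idea shown is only the general one: a cheap fingerprint lookup catches comments that robots post repeatedly, and a naive Bayes classifier handles everything else.

```python
import hashlib
import math
import re
from collections import Counter


def fingerprint(text):
    """Return a short fingerprint of a comment after light normalization.

    SHA-256 is a stand-in here; the paper compares several fingerprint
    functions that this record does not name.
    """
    normalized = re.sub(r"\s+", " ", text.strip().lower())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]


def tokenize(text):
    """Simple regex tokenizer (hypothetical stand-in for a real word segmenter)."""
    return re.findall(r"\w+", text.lower())


class CommentFilter:
    """Hypothetical filter: fingerprint check first, naive Bayes second."""

    def __init__(self, repeat_threshold=3):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}
        self.seen = Counter()  # fingerprint -> number of times submitted
        self.repeat_threshold = repeat_threshold

    def train(self, text, label):
        """Add one labeled comment ('spam' or 'ham') to the model."""
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def _log_score(self, tokens, label):
        # Assumes at least one training comment per class has been seen.
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total = sum(self.word_counts[label].values())
        prior = self.doc_counts[label] / sum(self.doc_counts.values())
        score = math.log(prior)
        for tok in tokens:
            # Laplace (add-one) smoothing avoids zero probabilities.
            count = self.word_counts[label][tok]
            score += math.log((count + 1) / (total + len(vocab)))
        return score

    def is_spam(self, text):
        """Classify a newly submitted comment."""
        fp = fingerprint(text)
        self.seen[fp] += 1
        if self.seen[fp] >= self.repeat_threshold:
            # Same comment submitted repeatedly: treat as robot output
            # without running the Bayes classifier at all.
            return True
        tokens = tokenize(text)
        return self._log_score(tokens, "spam") > self._log_score(tokens, "ham")
```

In this sketch the fingerprint lookup is a single hash and dictionary access, which is why a repeated comment can be rejected without scoring every token. That is one plausible reading of the abstract's efficiency claim, not a reproduction of the paper's experiment.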

