
Spam Filter Method Based on Discriminative Model
Abstract: The flood of spam e-mail has become a serious problem in the Internet age. As spammers keep updating their disguise techniques, the main existing spam filtering technologies face new challenges. This paper proposes a new spam filtering method based on a discriminative model. The mail classifier updates the weights of its feature terms through continual learning. When a new message arrives, the classifier computes the sum of the weights of all feature terms and converts that sum to a probability; if the probability exceeds a given threshold, the message is classified as spam. The method is also applied to a real-time mail processing environment. Experimental results show that the method clearly improves filtering accuracy and effectively reduces the false-positive rate.
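Source: Computer Technology and Development (《计算机技术与发展》), 2010, No. 1, pp. 181-184 (4 pages)
Funding: Shandong Provincial Natural Science Foundation (Q2006G03); Shandong Province Science and Technology Research Project (2009GG10001008); Shandong Province Soft Science Research Program (2009RKA285)
Keywords: mutual information; discriminative model; spam filter; gradient descent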
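
The abstract describes the classifier only in outline: sum the weights of the features in a message, convert the sum to a probability, compare it with a threshold, and keep updating the weights as new mail arrives. The sketch below illustrates that loop under several assumptions that are not stated in the abstract: the sum is mapped to a probability with the logistic (sigmoid) function, the weights are updated by one gradient-descent step on the log loss per message, features are binary whitespace tokens, and the learning rate and threshold values are arbitrary. The class name `OnlineSpamFilter` and all of its methods are hypothetical, and the mutual-information feature selection named in the keywords is not shown.

```python
import math


class OnlineSpamFilter:
    """Illustrative online logistic-regression filter (assumed model, not the paper's)."""

    def __init__(self, learning_rate=0.1, threshold=0.5):
        self.weights = {}                   # one weight per feature term
        self.bias = 0.0
        self.learning_rate = learning_rate  # assumed value
        self.threshold = threshold          # assumed value

    @staticmethod
    def features(message):
        # Assumption: binary bag-of-words features from whitespace tokens.
        return set(message.lower().split())

    def spam_probability(self, message):
        # Sum the weights of the features present in the message.
        score = self.bias + sum(self.weights.get(f, 0.0) for f in self.features(message))
        # Clamp the score so math.exp cannot overflow, then apply the sigmoid.
        score = max(-30.0, min(30.0, score))
        return 1.0 / (1.0 + math.exp(-score))

    def is_spam(self, message):
        # Flag the message when the probability exceeds the threshold.
        return self.spam_probability(message) > self.threshold

    def learn(self, message, label):
        # One gradient-descent step on the log loss; label is 1 for spam, 0 for ham.
        error = label - self.spam_probability(message)
        self.bias += self.learning_rate * error
        for f in self.features(message):
            self.weights[f] = self.weights.get(f, 0.0) + self.learning_rate * error


# Toy, made-up training data used only to exercise the sketch.
training = [
    ("win free money now", 1),
    ("cheap pills free offer", 1),
    ("meeting agenda for monday", 0),
    ("lunch on monday with the team", 0),
]

spam_filter = OnlineSpamFilter()
for _ in range(50):
    for text, label in training:
        spam_filter.learn(text, label)
```

In an online setting, `learn` would be called whenever a user confirms or corrects a classification, so the weights track new spam tactics without retraining from scratch.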