
A new effective method for spam filtering (cited by: 4)
Abstract: Inspired by the principle of information granularity, this paper presents a new, effective method for spam filtering. In the training phase, the legitimate-mail class and the spam class in the training set are split into four subclasses, and a center vector is computed for each subclass; from the granularity perspective, this describes the prior knowledge in the training samples at a finer granularity. In the filtering phase, each incoming e-mail is compared for similarity with the four subclass center vectors, and the result determines the class it belongs to. The method was tested on a public spam corpus and compared with KNN, currently one of the better-performing filtering methods. The results show that the new method offers high filtering accuracy and high filtering speed.
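Authors: LIN Chen (林琛), LI Bicheng (李弼程)
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2006, No. 8, pp. 1980-1982 (3 pages)
Funding: Supported by a grant from the Education Department of Henan Province (sp200303099)
Keywords: spam filtering; granularity; KNN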
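
The training and filtering steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the two classes are split or which similarity measure is used, so the sketch assumes each class is split into two subclasses by a simple 2-means clustering and that similarity is cosine similarity over bag-of-words term counts. All function names (`vectorize`, `split_in_two`, `train`, `classify`) are hypothetical.

```python
import math
import random
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts (assumed feature representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of term vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return {k: c / len(vectors) for k, c in total.items()}


def split_in_two(vectors, iters=10, seed=0):
    """Assumed splitting rule: 2-means under cosine similarity.

    Returns the two subclass center vectors for one class.
    """
    rng = random.Random(seed)
    centers = rng.sample(vectors, 2)
    for _ in range(iters):
        groups = ([], [])
        for v in vectors:
            idx = 0 if cosine(v, centers[0]) >= cosine(v, centers[1]) else 1
            groups[idx].append(v)
        if not groups[0] or not groups[1]:
            break  # keep the previous centers if a subclass would be empty
        centers = [centroid(g) for g in groups]
    return centers


def train(legit_texts, spam_texts):
    """Build four labelled center vectors: two per class (an assumption)."""
    model = []
    for label, texts in (("legit", legit_texts), ("spam", spam_texts)):
        for center in split_in_two([vectorize(t) for t in texts]):
            model.append((label, center))
    return model


def classify(text, model):
    """Assign the label of the most similar subclass center."""
    vec = vectorize(text)
    return max(model, key=lambda item: cosine(vec, item[1]))[0]


# Toy data, invented purely for illustration.
LEGIT = [
    "meeting agenda for monday project",
    "project report attached please review",
    "lunch tomorrow with the team",
    "team meeting notes and report",
]
SPAM = [
    "win free money now",
    "free prize click now",
    "cheap pills buy now",
    "buy cheap watches free shipping",
]

model = train(LEGIT, SPAM)
print(classify("free money prize now", model))
print(classify("project meeting report", model))
```

Filtering in this sketch needs only four similarity computations per message, whereas KNN compares the message with every training sample; this is one plausible reading of the speed advantage the abstract reports.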

