
Application of various kernel function based SVM in spam filtering (基于多种核函数的SVM在垃圾邮件过滤中的应用)

Cited by: 3
Abstract: The paper briefly summarizes spam filters based on the Support Vector Machine (SVM). Mail vectors are constructed with both the TF-IDF model and the Bernoulli model, and the effect of CHI dimensionality reduction on mail classification with a linear SVM is tested in detail. Kernel-based SVMs are then introduced into spam filtering: SVMs with linear, polynomial, and radial basis function (RBF) kernels are compared on classification accuracy and training time. The paper also analyzes how imbalanced training samples affect classification, reporting a strong effect on both classification accuracy and the false positive rate, and gives a theoretical analysis of the experimental results. The experiments show that the SVM classifier with the RBF kernel filters spam well.
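The abstract names CHI feature selection and three kernel functions without giving formulas or parameter values. The following is a minimal sketch of the standard textbook definitions, not the authors' implementation. The function names and the default values of `degree`, `gamma`, and `coef0` are assumptions chosen for illustration only.

```python
import math


def chi_square(a, b, c, d):
    """CHI score of one term for the spam class, from a 2x2 contingency table.

    a: spam mails containing the term
    b: legitimate mails containing the term
    c: spam mails without the term
    d: legitimate mails without the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # Degenerate table (an empty row or column): treat the term as uninformative.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def linear_kernel(x, y):
    """Plain dot product of two mail vectors of equal length."""
    return sum(xi * yi for xi, yi in zip(x, y))


def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    """(gamma * <x, y> + coef0) ** degree; parameter defaults are illustrative."""
    return (gamma * linear_kernel(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    """exp(-gamma * ||x - y||^2); the gamma default is illustrative."""
    squared_distance = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-gamma * squared_distance)
```

Under these definitions, a term that appears in every spam mail and in no legitimate mail receives the highest CHI score, while a term spread evenly across both classes scores zero. Keeping only the top-scoring terms shortens the mail vectors that are then passed to the kernel functions.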
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2008, No. 2, pp. 424-427 (4 pages)
Funding: National 863 Program of China (2002AA415270)
Keywords: Support Vector Machine (SVM); spam filtering; kernel function; feature selection


