A Spam Filtering Method Based on Multi-modal Feature Fusion
Abstract In recent years, spammers have begun embedding spam content in images and attaching those images to the message body in order to evade text-based spam filtering systems. Traditional text-based filters cannot handle such image-borne spam. To deal with spam that contains both text and images, this paper proposes a filtering method that fuses multi-modal features from text, images, and other media. Text features and image features are first extracted from each message to build multiple classifiers, and multiple classifier fusion is then used to combine the classifiers' outputs. Experiments on the TREC spam corpus show that the multi-modal fusion method achieves better results than a single classifier, with an accuracy above 90%.
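Source Journal of Beijing Electronic Science and Technology Institute, 2011, No. 2, pp. 46-57 (12 pages)
Funding Supported by the National Natural Science Foundation of China project "Research on Key Theories and Technologies of Multimedia Semantic Analysis Based on Multi-modal Features" (No. 60972139) and the Beijing Natural Science Foundation project "Research on Online Public Opinion Analysis Based on the Semantics of Web Multimedia Information" (No. 4092041)
Keywords spam filtering; multi-modal feature; multiple classifier fusion; degree of confidence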
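
The abstract describes building one classifier per modality and then fusing their outputs, and the keywords mention a degree of confidence. The abstract does not state the actual fusion rule, so the following is only a minimal sketch under assumptions: each classifier is assumed to output a spam probability, confidence is assumed to be the distance of that probability from 0.5, and the fused score is a confidence-weighted average. The names `confidence`, `fuse`, and `is_spam` are hypothetical and do not come from the paper.

```python
from typing import Optional, Sequence


def confidence(p_spam: float) -> float:
    """Assumed confidence measure: distance from the undecided point 0.5, scaled to [0, 1]."""
    return abs(p_spam - 0.5) * 2.0


def fuse(probabilities: Sequence[Optional[float]]) -> float:
    """Confidence-weighted average of per-modality spam probabilities.

    Each entry is one classifier's spam probability (e.g. text, image).
    None marks a modality that is absent, such as a message with no image.
    This weighting is an illustrative assumption, not the paper's rule.
    """
    present = [p for p in probabilities if p is not None]
    if not present:
        raise ValueError("at least one classifier output is required")
    weights = [confidence(p) for p in present]
    total = sum(weights)
    if total == 0.0:
        # Every classifier is undecided, so the fused score is undecided too.
        return 0.5
    return sum(w * p for w, p in zip(weights, present)) / total


def is_spam(probabilities: Sequence[Optional[float]], threshold: float = 0.5) -> bool:
    """Flag the message as spam when the fused score exceeds the threshold."""
    return fuse(probabilities) > threshold


if __name__ == "__main__":
    # The text classifier is unsure, while the image classifier is confident.
    print(is_spam([0.45, 0.95]))
    # Text-only message: the image modality is missing.
    print(is_spam([0.10, None]))
```

With this weighting, a modality whose classifier is nearly undecided contributes little, so an image classifier that is confident can override an uncertain text classifier, which matches the motivation of catching spam hidden in images.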