Abstract
Building on a study of the principles and implementation of Bayesian filtering algorithms, this paper improves the algorithm in three ways. First, the prior probability of spam is changed from a constant to the actual observed probability. Second, the scope and rules for selecting tokens are refined. Third, URLs and images are added to the content that is examined. A spam filter based on the improved Bayesian filtering algorithm is then designed. Experimental results show that the improved algorithm performs well in spam filtering.
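The three changes named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the example assumes a plain naive Bayes classifier with add-one smoothing. The names `tokenize`, `SpamFilter`, and the token prefixes `url:` and `has:image` are hypothetical. The prior is estimated from the observed share of spam in the training mail rather than fixed at a constant, and URL hosts and the presence of an image are treated as tokens alongside ordinary words.

```python
import math
import re
from collections import Counter

# Hypothetical token rules: words, URL hosts, and an image marker.
URL_RE = re.compile(r"https?://([^\s/]+)\S*")
IMG_RE = re.compile(r"<img\b", re.IGNORECASE)
TAG_RE = re.compile(r"<[^>]+>")
WORD_RE = re.compile(r"[a-z]{3,}")


def tokenize(text):
    """Return the set of tokens found in one message."""
    tokens = {"url:" + host.lower() for host in URL_RE.findall(text)}
    if IMG_RE.search(text):
        tokens.add("has:image")
    # Remove URLs and HTML tags before extracting word tokens.
    stripped = TAG_RE.sub(" ", URL_RE.sub(" ", text)).lower()
    tokens.update(WORD_RE.findall(stripped))
    return tokens


class SpamFilter:
    """Naive Bayes sketch with a prior estimated from training data."""

    def __init__(self):
        self.token_counts = {"spam": Counter(), "ham": Counter()}
        self.msg_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.msg_counts[label] += 1
        self.token_counts[label].update(tokenize(text))

    def spam_probability(self, text):
        n_spam = self.msg_counts["spam"]
        n_ham = self.msg_counts["ham"]
        # Prior from the observed spam share (smoothed), not a constant.
        prior_spam = (n_spam + 1) / (n_spam + n_ham + 2)
        log_odds = math.log(prior_spam) - math.log(1 - prior_spam)
        for tok in tokenize(text):
            p_spam = (self.token_counts["spam"][tok] + 1) / (n_spam + 2)
            p_ham = (self.token_counts["ham"][tok] + 1) / (n_ham + 2)
            log_odds += math.log(p_spam) - math.log(p_ham)
        # Clamp to keep math.exp within floating-point range.
        log_odds = max(min(log_odds, 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(-log_odds))


f = SpamFilter()
f.train("Win cash now http://spam.example/win <img src='x.png'>", "spam")
f.train("Cheap pills http://spam.example/buy", "spam")
f.train("Meeting notes attached for tomorrow", "ham")
f.train("Lunch tomorrow with the project team", "ham")
f.train("Project report draft for review", "ham")

spam_score = f.spam_probability("Win cheap pills http://spam.example/offer")
ham_score = f.spam_probability("Project meeting tomorrow")
```

With this toy training set, the message that reuses the spam URL host and spam words scores well above 0.5, and the ordinary message scores well below it. A real filter would need a much larger corpus and a tuned decision threshold.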
Source
Journal of Beijing University of Chemical Technology (Natural Science Edition)
EI
CAS
CSCD
PKU Core Journal
2008, No. 6, pp. 93-97 (5 pages)
Funding
National Key Technology R&D Program of China during the 11th Five-Year Plan (2006BAK31B04)
Keywords
spam email
improved Bayesian filtering
content detection