Abstract
This paper analyzes the Bayesian algorithm in depth. In designing the filtering algorithm, the study finds that the error rate of a Bayesian filtering simulator depends on the number of sensitive words selected: the more sensitive words and training emails are used, the higher the accuracy of the resulting mail filter. Weighing practicality against cost, the numbers of training emails and sensitive words were set to a moderate level suited to actual conditions, and a simulated spam filtering model based on the Bayesian algorithm was designed.
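The relationship described in the abstract can be illustrated with a minimal sketch of a Bernoulli naive Bayes spam filter. This is not the authors' model. The function names (`tokenize`, `train`, `spam_probability`), the whitespace tokenizer, the word-ranking rule, and the Laplace smoothing are all assumptions made for illustration. The `num_words` parameter stands in for the number of sensitive words, and the lengths of `spam_docs` and `ham_docs` stand in for the training set size. Both classes are assumed to be non-empty.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def train(spam_docs, ham_docs, num_words):
    """Select num_words 'sensitive words' and estimate their per-class probabilities."""
    # Document frequency: in how many emails of each class a word appears.
    spam_df = Counter(w for d in spam_docs for w in set(tokenize(d)))
    ham_df = Counter(w for d in ham_docs for w in set(tokenize(d)))
    n_spam, n_ham = len(spam_docs), len(ham_docs)

    def prob(df, word, n):
        # Laplace-smoothed probability that a word appears in an email of the class.
        return (df[word] + 1) / (n + 2)

    # Assumed selection rule: keep the words whose presence probability
    # differs most between spam and ham (ties broken alphabetically).
    vocab = set(spam_df) | set(ham_df)
    ranked = sorted(
        vocab,
        key=lambda w: (-abs(prob(spam_df, w, n_spam) - prob(ham_df, w, n_ham)), w),
    )
    words = ranked[:num_words]
    return {
        "prior_spam": n_spam / (n_spam + n_ham),
        "p_spam": {w: prob(spam_df, w, n_spam) for w in words},
        "p_ham": {w: prob(ham_df, w, n_ham) for w in words},
    }


def spam_probability(model, text):
    """Posterior probability that text is spam, computed via Bayes' rule in log space."""
    tokens = set(tokenize(text))
    log_spam = math.log(model["prior_spam"])
    log_ham = math.log(1 - model["prior_spam"])
    for word, p_s in model["p_spam"].items():
        p_h = model["p_ham"][word]
        if word in tokens:
            log_spam += math.log(p_s)
            log_ham += math.log(p_h)
        else:
            log_spam += math.log(1 - p_s)
            log_ham += math.log(1 - p_h)
    # Convert the log-odds back to a probability.
    return 1 / (1 + math.exp(log_ham - log_spam))


spam_docs = ["win free money now", "free prize claim now"]
ham_docs = ["meeting at noon tomorrow", "project report attached"]
model = train(spam_docs, ham_docs, num_words=3)
print(spam_probability(model, "free money"))
print(spam_probability(model, "project meeting"))
```

Raising `num_words` or supplying more training emails gives the classifier more evidence per message, at the cost of more storage and computation, which is the trade-off the abstract refers to.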
Source
Journal of Shanghai Dianji University (《上海电机学院学报》)
2013, Issue 4, pp. 224-228 (5 pages)
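Funding
Shanghai University Student Innovation Program (2012SCX15)
Shanghai Dianji University Research Start-up Fund (13DX02)
Shanghai Dianji University Key Discipline Program (13XKJ01)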