Abstract
To improve the accuracy and recall of spam filtering, this paper modifies the traditional Bayesian method. It proposes an improved spam filtering method that extracts word-based features and describes their frequencies with feature vectors. Departing from the assumption behind the traditional Bayesian formula, the system assigns a different weight to each feature value of the spam samples, which reduces spam misclassification. The model lowers the spam error rate and improves the overall filtering performance of the system. Experimental results show that the proposed method significantly improves both accuracy and recall.
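The abstract does not give the weighting formula, so the following is only a minimal sketch of one way per-feature weights could enter a word-frequency Bayesian filter. It assumes that each word's weight scales its log-likelihood ratio, that probabilities use Laplace smoothing, and that words are split on whitespace; none of these choices is confirmed by the paper. All names (`tokenize`, `train`, `spam_log_odds`, `is_spam`, `TRAIN`, `MODEL`) and the toy messages are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Word-based feature extraction (assumed): lowercase, split on whitespace."""
    return text.lower().split()


def train(messages):
    """Count word frequencies per class from (text, label) pairs."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in messages:
        doc_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab


def spam_log_odds(text, model, weights=None, default_weight=1.0):
    """Log-odds that a message is spam, with an assumed per-word weight.

    With every weight equal to 1.0 this reduces to ordinary multinomial
    naive Bayes; other weights are an illustrative assumption, not the
    paper's formula.
    """
    word_counts, doc_counts, vocab = model
    weights = weights or {}
    totals = {c: sum(word_counts[c].values()) for c in ("spam", "ham")}
    score = math.log(doc_counts["spam"] / doc_counts["ham"])  # class prior
    for word, freq in Counter(tokenize(text)).items():
        if word not in vocab:
            continue  # words never seen in training carry no evidence
        # Laplace-smoothed word likelihoods for each class
        p_spam = (word_counts["spam"][word] + 1) / (totals["spam"] + len(vocab))
        p_ham = (word_counts["ham"][word] + 1) / (totals["ham"] + len(vocab))
        w = weights.get(word, default_weight)
        score += w * freq * math.log(p_spam / p_ham)
    return score


def is_spam(text, model, weights=None):
    """Classify as spam when the weighted log-odds are positive."""
    return spam_log_odds(text, model, weights) > 0.0


# Toy training set, invented purely for illustration.
TRAIN = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]
MODEL = train(TRAIN)
```

On this toy data, `is_spam("free monday", MODEL)` is false with uniform weights, while `is_spam("free monday", MODEL, {"free": 2.0})` is true, which shows how raising the weight of a spam-indicative word can shift a borderline decision.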
Authors
YUAN Lianhai (袁连海)
LI Xiangwen (李湘文)
XU Jing (徐晶)
Engineering & Technical College, Chengdu University of Technology, Leshan 614000
Source
Computer & Digital Engineering (《计算机与数字工程》)
2020, No. 3, pp. 513-516, 712 (5 pages)
Funding
Supported by the General Program of the National Natural Science Foundation of China (Grant No. 11375055).
Keywords
Bayesian principle
spam filtering
feature vector