Abstract
This paper proposes an improved minimum-risk Bayesian e-mail filtering algorithm and applies it to client-side Chinese e-mail filtering, so that filtering can adapt to each user's individual classification needs. Experimental results show that applying the improved algorithm to Chinese e-mail filtering is feasible and that, compared with the traditional Bayesian algorithm, it markedly reduces the spam false positive rate. The experiments also measure how the loss factor and the number of features affect filtering performance, yielding a set of preferable parameter settings and offering useful guidance for Chinese e-mail filtering.
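The abstract does not spell out the improved algorithm, so the following is only a minimal sketch of the textbook minimum-risk Bayesian decision rule that such a filter builds on, not the authors' method. It assumes messages are already segmented into tokens (Chinese word segmentation is out of scope here), and the names `train`, `spam_posterior`, `is_spam`, and `loss_factor` are hypothetical. The loss factor is taken to be the cost of flagging a legitimate message as spam relative to the cost of letting a spam message through.

```python
import math
from collections import Counter

def train(messages):
    """Count tokens per class. messages: list of (tokens, label), label in {'spam', 'ham'}.
    Assumes at least one message of each class is present."""
    counts = {"spam": Counter(), "ham": Counter()}
    docs = Counter()
    for tokens, label in messages:
        counts[label].update(tokens)
        docs[label] += 1
    vocab = set(counts["spam"]) | set(counts["ham"])
    return counts, docs, vocab

def spam_posterior(tokens, model):
    """Return P(spam | tokens) under multinomial naive Bayes with add-one smoothing."""
    counts, docs, vocab = model
    total_docs = docs["spam"] + docs["ham"]
    log_score = {}
    for label in ("spam", "ham"):
        n_words = sum(counts[label].values())
        score = math.log(docs[label] / total_docs)  # class prior
        for tok in tokens:
            if tok in vocab:  # tokens never seen in training are ignored
                score += math.log((counts[label][tok] + 1) / (n_words + len(vocab)))
        log_score[label] = score
    # Normalize in log space to avoid underflow on long messages.
    m = max(log_score.values())
    e_spam = math.exp(log_score["spam"] - m)
    e_ham = math.exp(log_score["ham"] - m)
    return e_spam / (e_spam + e_ham)

def is_spam(p_spam, loss_factor=1.0):
    """Minimum-risk rule: flag as spam only when the expected loss of keeping
    the message, P(spam|x), exceeds the expected loss of flagging it,
    loss_factor * P(ham|x)."""
    return p_spam > loss_factor * (1.0 - p_spam)

# Toy usage with made-up tokens (illustrative data only).
model = train([
    (["free", "prize", "win"], "spam"),
    (["meeting", "schedule", "report"], "ham"),
])
p = spam_posterior(["free", "prize"], model)
print(is_spam(p, loss_factor=1.0), is_spam(p, loss_factor=9.0))
```

With `loss_factor=1.0` the rule reduces to the ordinary maximum-posterior decision. Raising the loss factor raises the posterior a message must reach before it is flagged, trading missed spam for fewer legitimate messages misfiled as spam, which is the false positive reduction the abstract reports.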
Source
Journal of Lanzhou Jiaotong University (《兰州交通大学学报》)
CAS
2010, No. 3, pp. 100-103 (4 pages)
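Funding
Natural Science Foundation of Gansu Province (096RJZA084)
Graduate Supervisor Research Program of the Gansu Provincial Department of Education (0814-4, 0914-02)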
Keywords
Bayesian algorithm
Chinese e-mail filtering
number of features
loss factor