Abstract
Spam filtering is an important task in Internet applications. This paper presents a spam filtering method based on an online linear discriminative learning model. Features are extracted using Bayesian theory, grouped by the position at which they occur, and each group is assigned a different weight. The model is evaluated on the TREC spam corpus and compared with the TREC evaluation results. Experimental results show that the method produces competitive results.
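As a rough illustration of the idea summarized in the abstract, the following minimal sketch shows an online linear classifier whose features are tagged by position and scaled by a per-position weight. It is not the authors' implementation: the perceptron-style update rule, the position names ("subject", "body"), the weight values, and all function and class names are assumptions made for illustration, and the Bayesian feature extraction step is omitted.

```python
# Hypothetical per-position weights; the abstract does not give actual values.
POSITION_WEIGHTS = {"subject": 2.0, "body": 1.0}


def extract_features(message):
    """Map a message (dict of position -> text) to position-tagged features.

    Each feature is a (position, token) pair whose value is the weight
    assigned to that position (binary presence scaled by position).
    """
    features = {}
    for position, text in message.items():
        weight = POSITION_WEIGHTS.get(position, 1.0)
        for token in text.lower().split():
            features[(position, token)] = weight
    return features


class OnlineLinearFilter:
    """Online linear classifier with a perceptron-style update (assumed)."""

    def __init__(self, learning_rate=0.1):
        self.weights = {}
        self.bias = 0.0
        self.learning_rate = learning_rate

    def score(self, features):
        # Linear score: bias plus weighted sum of the active features.
        return self.bias + sum(
            self.weights.get(f, 0.0) * v for f, v in features.items()
        )

    def predict(self, features):
        # +1 means spam, -1 means legitimate mail (ham).
        return 1 if self.score(features) > 0 else -1

    def update(self, features, label):
        # Mistake-driven update: adjust weights only when the prediction is wrong.
        if self.predict(features) != label:
            for f, v in features.items():
                self.weights[f] = (
                    self.weights.get(f, 0.0) + self.learning_rate * label * v
                )
            self.bias += self.learning_rate * label
```

In an online setting, each incoming message would be classified with `predict` first and then passed to `update` once its true label is known, so the model adapts one message at a time.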
Source
《哈尔滨理工大学学报》
CAS
2008, No. 3, pp. 48-50 (3 pages)
Journal of Harbin University of Science and Technology
Keywords
spam filtering
discriminative learning model
feature extraction
Bayesian theory
active learning