Abstract
In practical e-mail filtering, the characteristics of spam itself make hard classification error-prone: a conventional support vector machine (SVM) classifier assigns each e-mail sample unambiguously to one class and can easily get it wrong. Deciding class membership from a probability output is more reasonable. Following this idea, this paper proposes an optimization of the conventional SVM e-mail classifier: it computes a probability for the classification output and compares that probability against a threshold to determine the class of the e-mail. Experiments show that the method is effective and feasible.
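The decision rule described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the probability is computed, so the sigmoid mapping below (Platt-style scaling of the SVM decision value) is an assumption, not the authors' stated method. The function names, the sigmoid parameters `a` and `b`, and the threshold of 0.9 are hypothetical values chosen only for illustration.

```python
import math


def spam_probability(decision_value: float, a: float = -1.0, b: float = 0.0) -> float:
    """Map a raw SVM decision value f(x) to an estimate of P(spam | x).

    Assumption: a Platt-style sigmoid 1 / (1 + exp(a * f + b)).
    a and b are hypothetical here; in practice they would be fitted
    on held-out labelled e-mail.
    """
    z = a * decision_value + b
    if z >= 0:
        # Numerically stable form for large positive z.
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def classify(decision_value: float, threshold: float = 0.9,
             a: float = -1.0, b: float = 0.0) -> str:
    """Label an e-mail as spam only if its probability reaches the threshold."""
    p = spam_probability(decision_value, a, b)
    return "spam" if p >= threshold else "legitimate"


if __name__ == "__main__":
    for f in (-2.0, 1.0, 3.0):
        print(f, round(spam_probability(f), 3), classify(f))
```

With these illustrative parameters, a sample with decision value 1.0 would be labelled spam by the sign of the SVM output alone, but its estimated probability is below 0.9, so the thresholded rule keeps it as legitimate. Raising or lowering the threshold trades missed spam against legitimate mail filtered by mistake.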
Source
《计算机科学》
CSCD
Peking University Core Journals (北大核心)
2007, No. 9, pp. 90-92 (3 pages)
Computer Science
Funding
Supported by the Natural Science Foundation of the Chongqing Science and Technology Commission (grant no. CSTC2006BB2021)