Abstract
To address the problem that traditional spam filtering relies on a single feature selection method, which either fails to extract all the important features in the training set or yields redundant features, this paper proposes SF_FSF (Spam Filtering based on Feature Selection Fusion), a spam filtering model that fuses multiple feature selection methods. By introducing the concept of information fusion, SF_FSF treats feature selection as a decision-making problem and uses an information fusion model based on the average voting method to fuse the results of the individual feature selection methods, so as to extract the important features of the spam dataset and achieve strong filtering performance. Experimental results show that SF_FSF achieves better filtering results than spam filtering methods based on a single feature selection method.
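The abstract does not give the details of the fusion step, so the following is only a minimal sketch of what average-voting fusion of several feature selection results could look like. The function names, the min-max normalization, the zero vote for missing features, and the three example selectors with their scores are all assumptions made for illustration; they are not taken from the paper.

```python
from typing import Dict, List


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale one selector's scores to [0, 1] so selectors are comparable.

    Assumption: min-max scaling is used; the paper may normalize differently.
    """
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All features scored equally; give each the same neutral vote.
        return {f: 0.5 for f in scores}
    return {f: (s - lo) / (hi - lo) for f, s in scores.items()}


def average_vote_fusion(selector_scores: List[Dict[str, float]], top_k: int) -> List[str]:
    """Fuse several feature selection results by averaging their normalized votes."""
    normalized = [min_max_normalize(s) for s in selector_scores]
    features = set().union(*(s.keys() for s in normalized))
    n = len(normalized)
    # Assumption: a feature missing from a selector's output gets a vote of 0.
    fused = {f: sum(s.get(f, 0.0) for s in normalized) / n for f in features}
    # Sort by fused score (descending); ties are broken alphabetically.
    ranked = sorted(fused, key=lambda f: (-fused[f], f))
    return ranked[:top_k]


# Hypothetical scores from three selectors (for instance chi-square,
# information gain, and document frequency); the values are made up.
chi2 = {"free": 9.0, "winner": 7.0, "meeting": 1.0, "click": 5.0}
info_gain = {"free": 0.30, "winner": 0.10, "meeting": 0.05, "click": 0.25}
doc_freq = {"free": 120, "winner": 40, "meeting": 200, "click": 90}

selected = average_vote_fusion([chi2, info_gain, doc_freq], top_k=2)
print(selected)
```

In this sketch each selector casts a normalized vote per feature, and the features with the highest mean vote are kept. A term that only one selector rates highly (here `meeting`, favoured only by document frequency) is outvoted, which is one way fusion might reduce the bias of any single selector.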
Source
《计算机应用与软件》
CSCD
PKU Core Journal (北大核心)
2014, Issue 4, pp. 31-34 (4 pages)
Computer Applications and Software
Keywords
Spam filtering
Feature selection
Information fusion
Average voting method