Abstract
This paper analyzes the strengths and weaknesses of hard margin fuzzy rough set based support vector machines (FRSVMs). FRSVMs improve generalization ability by modifying the constraints of hard margin support vector machines (SVMs). Although FRSVMs take into account the inconsistency between the conditional attributes and the decision attributes of training samples, they still require the optimal hyperplane to separate the training set without error, which makes them sensitive to noise. To address this shortcoming, fuzzy rough set based soft margin support vector machines (C-FRSVMs) are proposed. C-FRSVMs use the Gaussian kernel function as the fuzzy similarity relation and account for the degree of inconsistency between the conditional attributes and the decision labels of the samples in a dataset. During training they allow misclassified points and penalize the degree of misclassification of training samples in the primal optimization problem. Because they consider both margin maximization and training error minimization, C-FRSVMs are less sensitive to noise than FRSVMs. Experiments show that, on some datasets, whether or not they contain outliers, C-FRSVMs achieve higher test accuracy than hard margin SVMs, soft margin support vector machines (C-SVMs), and FRSVMs, and thus further improve on the generalization ability of FRSVMs.
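The two ingredients named in the abstract, a Gaussian kernel read as a fuzzy similarity relation and a penalty on misclassification degrees, can be illustrated with a minimal sketch. It is not the paper's formulation: the abstract does not state how the FRSVM constraints are modified, so the `required` argument below is a hypothetical placeholder for a per-sample margin requirement and defaults to the standard C-SVM value of 1. The function names, `sigma`, and the toy data are likewise assumptions made for illustration.

```python
import math


def gaussian_similarity(x, y, sigma=1.0):
    """Gaussian kernel exp(-||x - y||^2 / (2 * sigma^2)).

    Its value lies in (0, 1], equals 1 when x == y, and is symmetric,
    so it can be read as a fuzzy similarity degree between two samples.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def soft_margin_objective(w, b, samples, labels, C=1.0, required=None):
    """Standard soft margin objective 0.5 * ||w||^2 + C * sum(slacks).

    `required` is a HYPOTHETICAL per-sample margin requirement; with the
    default of 1.0 for every sample this is the ordinary C-SVM objective.
    Returns the objective value and the list of slack values.
    """
    if required is None:
        required = [1.0] * len(samples)
    slacks = []
    for x, y, r in zip(samples, labels, required):
        decision = sum(wi * xi for wi, xi in zip(w, x)) + b
        # Slack is how far the sample falls short of its required margin.
        slacks.append(max(0.0, r - y * decision))
    objective = 0.5 * sum(wi * wi for wi in w) + C * sum(slacks)
    return objective, slacks


if __name__ == "__main__":
    # Toy data (assumed): two clean points and one noisy point on the wrong side.
    samples = [[2.0, 0.0], [-2.0, 0.0], [-0.5, 0.0]]
    labels = [1, -1, 1]
    print(gaussian_similarity(samples[0], samples[1]))
    print(soft_margin_objective([1.0, 0.0], 0.0, samples, labels, C=1.0))
```

A hard margin formulation has no feasible solution that keeps this hyperplane once the noisy third point is present, whereas the soft margin objective simply charges that point a slack proportional to `C`. This is the trade-off between margin size and training error that the abstract describes.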
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journal (北大核心)
2011, No. 8, pp. 217-220 (4 pages)
Funding
National Natural Science Foundation of China (60903088, 60903089)
Natural Science Foundation of Hebei Province (F2010000323, F2011201063)