Abstract
Training a support vector machine (SVM) on high-dimensional, large-sample data is slow and suffers from the curse of dimensionality. To address both problems, this paper combines the sequential minimal optimization (SMO) algorithm with the data-processing capabilities of rough sets (RS) and proposes RS-SMO, a new classification algorithm based on rough sets and support vector machines. The algorithm first reduces the attributes of the data set according to attribute significance, then uses the rough boundary set method to generate class boundary sets, which serve as the training subset for SMO. The resulting training set is smaller than the original in both dimension and size, which improves the algorithm's time and space performance. Experimental results indicate that RS-SMO achieves structural risk minimization and outperforms the SMO algorithm.
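The two preprocessing steps named in the abstract (attribute reduction by significance, then boundary-set extraction) can be sketched on a toy decision table. The abstract does not give the paper's exact procedure, so everything here is an assumption: the sketch uses the standard Pawlak definitions (indiscernibility classes, dependency degree, boundary region = upper minus lower approximation) on already-discretized attributes, and the function names and toy data are hypothetical. The paper's own boundary-set construction may well differ, and the SMO training step that would consume the boundary rows is not shown.

```python
from collections import defaultdict

def partition(rows, attrs):
    """Group row indices into indiscernibility classes over the given attributes."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())

def dependency(rows, labels, attrs):
    """Dependency degree: share of rows lying in blocks that carry a single label."""
    pos = sum(len(b) for b in partition(rows, attrs)
              if len({labels[i] for i in b}) == 1)
    return pos / len(rows)

def greedy_reduct(rows, labels):
    """Forward selection: keep adding the attribute with the largest dependency gain
    (its significance) until the dependency of the full attribute set is reached."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    chosen = []
    while dependency(rows, labels, chosen) < target:
        best = max((a for a in all_attrs if a not in chosen),
                   key=lambda a: dependency(rows, labels, chosen + [a]))
        chosen.append(best)
    return sorted(chosen)

def boundary_region(rows, labels, attrs, cls):
    """Rows in blocks that mix class `cls` with other labels (upper minus lower approximation)."""
    out = []
    for b in partition(rows, attrs):
        block_labels = {labels[i] for i in b}
        if cls in block_labels and len(block_labels) > 1:
            out.extend(b)
    return sorted(out)

# Hypothetical toy table: attribute 2 duplicates attribute 0; rows 1 and 4 conflict.
rows = [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1), (0, 1, 0), (1, 0, 1)]
labels = [-1, -1, 1, 1, 1, 1]

reduct = greedy_reduct(rows, labels)                   # attributes kept after reduction
boundary = boundary_region(rows, labels, reduct, 1)    # candidate training subset
print(reduct, boundary)
```

On this table the redundant third attribute is dropped and only the two conflicting rows land in the boundary region. A practical variant for continuous data would need a discretization or neighborhood-based notion of indiscernibility before either step, which is again an assumption beyond what the abstract states.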
Source
《华南理工大学学报(自然科学版)》
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2008, No. 5, pp. 123-127 (5 pages)
Journal of South China University of Technology (Natural Science Edition)
Funding
Supported by the National Natural Science Foundation of China (30570458)
Keywords
rough set
support vector machine
decomposition algorithm
attribute reduction
boundary set
time-space performance