Abstract
Solving large-scale classification problems with support vector machines requires a large amount of memory to store the Hessian matrix, whose size depends on the number of training samples. This makes it difficult to improve the efficiency and quality of support vector machine classification. Because only the samples that become support vectors contribute to the decision function, a kernel-function-based sample selection algorithm is proposed to reduce the space and time cost of training and to improve classification efficiency and quality. The algorithm selects the samples most likely to become support vectors, which reduces the space and time needed to store the Hessian matrix during training. Experimental results show that the selected samples not only improve training accuracy but also increase classification speed and reduce storage overhead.
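The abstract does not state the selection rule itself, so the following is only a minimal sketch of one plausible kernel-based heuristic, not the authors' algorithm. It assumes a Gaussian (RBF) kernel and ranks each sample by its feature-space distance to the mean of the opposite class, keeping the closest ones as likely support vector candidates. The function names, the `gamma` value, and the `keep_ratio` parameter are all hypothetical choices made for illustration.

```python
import math


def rbf_kernel(a, b, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an assumed, illustrative value."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)


def class_mean_norm(members, kernel):
    """Squared norm of the feature-space mean of `members`."""
    n = len(members)
    return sum(kernel(p, q) for p in members for q in members) / (n * n)


def distance_to_class_mean(x, members, mean_norm, kernel):
    """Squared feature-space distance from x to the mean of `members`.

    Expands ||phi(x) - m||^2 = k(x, x) - 2 * mean_j k(x, x_j) + ||m||^2,
    so only kernel evaluations are needed.
    """
    cross = sum(kernel(x, m) for m in members) / len(members)
    return kernel(x, x) - 2.0 * cross + mean_norm


def select_candidates(samples, labels, keep_ratio=0.5, kernel=rbf_kernel):
    """Return sorted indices of the samples nearest to the opposite class.

    Assumed heuristic: samples close to the other class lie near the
    decision boundary and are more likely to become support vectors.
    """
    selected = []
    for cls in sorted(set(labels)):
        own = [i for i, y in enumerate(labels) if y == cls]
        other = [samples[i] for i, y in enumerate(labels) if y != cls]
        mean_norm = class_mean_norm(other, kernel)
        ranked = sorted(
            own,
            key=lambda i: distance_to_class_mean(
                samples[i], other, mean_norm, kernel
            ),
        )
        keep = max(1, math.ceil(keep_ratio * len(own)))
        selected.extend(ranked[:keep])
    return sorted(selected)
```

Training on only the returned indices shrinks the Hessian from n x n to m x m, where m is the number of kept samples; whether accuracy is preserved depends on how well the heuristic matches the true support vectors.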
Source
Computer Engineering and Design (《计算机工程与设计》)
CSCD
Peking University Core Journal (北大核心)
2010, No. 10, pp. 2266-2269 (4 pages)
Funding
Guangdong Province Science and Technology Plan Fund Project (2009B010800036)
Guangdong Province Education Research Fund Project (BKYBJG20060235)
Keywords
support vector
sample selection
kernel function
structural risk
support vector machine