Abstract
Support vector machines (SVMs) have attracted wide attention in machine learning because of their complete theoretical framework and strong results in practical applications. Their main drawback is that training on large-scale data sets requires a large amount of memory and a long training time. Against this background, parallelization has been proposed for training SVMs. The basic idea is to split a large data set into small subsets, train one SVM on each subset, and then merge the individual training results effectively. Building on existing techniques, this paper proposes an improved scheme that uses parallelization to speed up SVM training while preserving correct classification. Experimental results show that the new scheme effectively reduces training time while keeping classification accuracy essentially unchanged.
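The split, train, and merge idea can be illustrated with a minimal sketch. The abstract does not specify the paper's improved merging scheme, so the code below assumes a simple cascade-style merge: each sub-model contributes the training points that lie near its margin, and a final SVM is retrained on their union. The linear subgradient trainer, the function names (`train_linear_svm`, `support_candidates`, `parallel_train`), and all parameter values are hypothetical choices for illustration, not the authors' method.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def decision(w, b, x):
    """Signed score of point x under the linear model (w, b)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(data, lam=0.01, eta=0.05, epochs=50, seed=0):
    """Linear SVM trained by subgradient descent on the regularized hinge loss.

    data is a list of (x, y) pairs with y in {-1, +1}.
    """
    rng = random.Random(seed)
    w = [0.0] * len(data[0][0])
    b = 0.0
    order = list(data)
    for _ in range(epochs):
        rng.shuffle(order)
        for x, y in order:
            violated = y * decision(w, b, x) < 1
            w = [(1 - eta * lam) * wi for wi in w]  # regularization shrinkage
            if violated:  # hinge-loss subgradient step
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
                b += eta * y
    return w, b


def support_candidates(data, model, tol=0.1, min_per_class=2):
    """Keep points on or near the margin (assumed stand-in for support vectors)."""
    w, b = model
    kept = []
    for label in (-1, 1):
        scored = sorted(
            ((y * decision(w, b, x), x, y) for x, y in data if y == label),
            key=lambda s: s[0],
        )
        near = [(x, y) for m, x, y in scored if m <= 1 + tol]
        if len(near) < min_per_class:  # always keep the closest few points
            near = [(x, y) for _, x, y in scored[:min_per_class]]
        kept.extend(near)
    return kept


def parallel_train(data, n_parts=4):
    """Split the data, train one SVM per subset concurrently, then merge."""
    parts = [data[i::n_parts] for i in range(n_parts)]
    # Threads only illustrate the structure; real speedups in Python would
    # need processes or a native SVM library.
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        sub_models = list(pool.map(train_linear_svm, parts))
    merged = []
    for part, model in zip(parts, sub_models):
        merged.extend(support_candidates(part, model))
    return train_linear_svm(merged)  # final SVM on the merged candidates


def make_data(n=200, seed=1):
    """Two synthetic Gaussian clusters, one per class."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        y = 1 if i % 2 == 0 else -1
        data.append(((rng.gauss(2.0 * y, 0.5), rng.gauss(2.0 * y, 0.5)), y))
    rng.shuffle(data)
    return data


def accuracy(model, data):
    w, b = model
    return sum(1 for x, y in data if y * decision(w, b, x) > 0) / len(data)


data = make_data()
model = parallel_train(data, n_parts=4)
print("accuracy on the full data set:", accuracy(model, data))
```

Because the final model is retrained only on the merged near-margin points, it sees far fewer samples than a single SVM trained on the full set, which is where the time saving in this kind of scheme is expected to come from.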
Source
《微处理机》
2010, Issue 5, pp. 42-45 (4 pages)
Microprocessors
Keywords
Support Vector Machines
Classification
Parallel Algorithm