Abstract
Support vector machines (SVMs) applied to large-scale data sets often suffer from slow training, high computational cost, and poor real-time performance. To address these drawbacks, a new SVM reduction method is proposed that combines density-based partitioning of the sample space into blocks with Euclidean-distance-based selection of boundary samples. The method first partitions the sample space into blocks and selects candidate sample regions according to block density. It then uses a relative distance, a refinement of the Euclidean distance, to extract the boundary samples among which the support vectors lie with high probability. The method preserves training accuracy while lowering computational cost and improving generalization. Results from an industrial application show that its accuracy is no lower than that of a standard SVM and that it computes far faster.
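The two stages described in the abstract can be illustrated with a minimal sketch. The abstract does not give the candidate-block rule or the formula for the relative distance, so everything specific here is an assumption: the hypothetical function `reduce_training_set`, its parameters `bins`, `min_count` and `keep_ratio`, the rule that a block is a candidate when it holds at least `min_count` samples, and a relative distance defined as the distance from a sample to the opposite-class centroid divided by the distance between the two class centroids. It is not the authors' algorithm, only one plausible reading for a two-class problem.

```python
import math
from collections import Counter


def block_index(x, lows, highs, bins):
    """Map a sample to the grid cell (block) that contains it."""
    idx = []
    for v, lo, hi in zip(x, lows, highs):
        width = (hi - lo) / bins or 1.0  # guard against constant features
        idx.append(min(int((v - lo) / width), bins - 1))
    return tuple(idx)


def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    return [sum(col) / len(points) for col in zip(*points)]


def reduce_training_set(X, y, bins=4, min_count=2, keep_ratio=0.5):
    """Return sorted indices of the samples kept for SVM training.

    Assumed rules (not taken from the paper):
      * a block is a candidate if it holds at least `min_count` samples;
      * relative distance = d(x, opposite centroid) / d(own, opposite centroid);
      * per class, the `keep_ratio` fraction with the smallest relative
        distance is treated as boundary samples.
    """
    # Stage 1: partition the sample space into equal-width blocks.
    lows = [min(col) for col in zip(*X)]
    highs = [max(col) for col in zip(*X)]
    cells = [block_index(x, lows, highs, bins) for x in X]
    density = Counter(cells)  # number of samples per block

    # Keep only samples that fall in sufficiently dense blocks.
    by_class = {}
    for i, cell in enumerate(cells):
        if density[cell] >= min_count:
            by_class.setdefault(y[i], []).append(i)
    if len(by_class) != 2:
        raise ValueError("sketch expects exactly two classes among candidates")

    # Stage 2: rank candidates by relative distance to the opposite class.
    cents = {lab: centroid([X[i] for i in idx]) for lab, idx in by_class.items()}
    (lab_a, c_a), (lab_b, c_b) = cents.items()
    gap = math.dist(c_a, c_b) or 1.0  # distance between class centroids
    other = {lab_a: c_b, lab_b: c_a}

    kept = []
    for lab, idx in by_class.items():
        ranked = sorted(idx, key=lambda i: math.dist(X[i], other[lab]) / gap)
        kept.extend(ranked[:math.ceil(keep_ratio * len(idx))])
    return sorted(kept)
```

The returned indices would then select the reduced training set passed to any SVM trainer. Sparse blocks are dropped here on the assumption that they mostly contain outliers; the paper may well use a different density criterion.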
Source
《系统仿真学报》
CAS
CSCD
PKU Core (Peking University Core Journals)
2012, No. 2, pp. 344-347, 364 (5 pages)
Journal of System Simulation
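Funding
National High Technology Research and Development Program of China (863 Program) (2009AA04Z124)
National Natural Science Foundation of China (60874069)
Natural Science Foundation of Hunan Province (09JJ3122)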