Abstract
For large-scale machine learning problems whose data contain noise and interference, the non-convex Ramp loss function is adopted to suppress the influence of noisy and interfering samples, and a fast learning method based on stochastic optimization is proposed for non-convex linear support vector machines. The method improves both training speed and prediction accuracy. Experimental results show that it reduces learning time: on the MNIST dataset, training time is reduced by four orders of magnitude compared with the traditional learning method. It also improves prediction speed to some extent and enhances the generalization performance of the classifier on noisy datasets.
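As a rough illustration of the idea summarized above, the following is a minimal sketch and not the authors' algorithm. It assumes the commonly used form of the Ramp loss, min(1 - s, max(0, 1 - z)) with margin z = y(w·x) and s < 1, and a plain stochastic subgradient update with a decaying step size. The function names and the parameters `lam`, `s`, `eta0`, and `epochs` are hypothetical choices made for this example.

```python
import numpy as np

def ramp_loss(margin, s=-1.0):
    """Hinge loss clipped at 1 - s, so badly misclassified points incur a bounded cost."""
    return np.minimum(1.0 - s, np.maximum(0.0, 1.0 - margin))

def sgd_ramp_svm(X, y, lam=0.01, s=-1.0, eta0=0.1, epochs=5, seed=0):
    """Stochastic subgradient descent on lam/2 * ||w||^2 + ramp loss (no bias term).

    Assumed sketch: labels y are in {-1, +1}; the step-size schedule is illustrative.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            t += 1
            eta = eta0 / (1.0 + lam * eta0 * t)  # decaying step size
            margin = y[i] * (X[i] @ w)           # computed before the update
            w *= 1.0 - eta * lam                 # gradient step on the regularizer
            # The ramp loss has a nonzero subgradient only for s < margin < 1.
            # Points with margin <= s (likely noise) no longer move w.
            if s < margin < 1.0:
                w += eta * y[i] * X[i]
    return w
```

The only difference from a hinge-loss update is the lower bound `s < margin`: samples that are misclassified by a wide margin are skipped, which is how the clipped loss limits the effect of mislabeled points in this sketch.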
Source
《模式识别与人工智能》
EI
CSCD
Peking University Core Journals (北大核心)
2013, Issue 4, pp. 366-373 (8 pages)
Pattern Recognition and Artificial Intelligence
Keywords
Large-Scale Machine Learning, Support Vector Machine, Ramp Loss, Stochastic Gradient Descent