Abstract
Stochastic gradient descent (SGD) has been applied to large-scale support vector machine (SVM) training. Because SGD selects training points uniformly at random, on imbalanced classification problems the probability of sampling a majority-class point far exceeds that of sampling a minority-class point, which biases the computation. To handle large-scale imbalanced data classification, this paper proposes a weighted stochastic gradient descent online algorithm for SVM: samples in the majority class are assigned a smaller weight, samples in the minority class are assigned a larger weight, and the weighted SGD algorithm is then used to solve the primal SVM problem. This reduces the shift of the separating hyperplane toward the minority class and effectively addresses the classification of imbalanced data in large-scale learning.
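The weighting idea described in the abstract can be sketched as a Pegasos-style SGD on the primal hinge-loss objective, where each sampled point's subgradient step is scaled by a per-class weight. This is a minimal illustrative sketch, not the paper's exact formulation: the function name, the inverse-frequency weighting, the `1/(lambda * t)` step-size schedule, and the bias handling are all assumptions.

```python
import numpy as np

def weighted_sgd_svm(X, y, lam=0.01, epochs=20, seed=0):
    """Weighted SGD for the primal linear SVM (illustrative sketch).

    Labels y must be in {-1, +1}. Each class receives a weight
    inversely proportional to its frequency, so minority-class
    points take larger (sub)gradient steps -- one plausible way
    to realize the weighting scheme described in the abstract.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Inverse-frequency class weights (assumed scheme, not the paper's):
    counts = {c: np.sum(y == c) for c in (-1, 1)}
    cw = {c: n / (2.0 * counts[c]) for c in (-1, 1)}
    w, b, t = np.zeros(d), 0.0, 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            t += 1
            eta = 1.0 / (lam * t)          # Pegasos-style step size
            margin = y[i] * (X[i] @ w + b)
            w *= (1.0 - eta * lam)         # shrink from the L2 regularizer
            if margin < 1:                 # hinge loss active: weighted step
                w += eta * cw[y[i]] * y[i] * X[i]
                b += eta * cw[y[i]] * y[i]
    return w, b
```

On a toy imbalanced problem (e.g. 90 points near (-2, 0) labeled -1 and 10 points near (2, 0) labeled +1), the learned hyperplane separates both clusters instead of drifting toward the minority class.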
Authors
鲁淑霞
周谧
金钊
LU Shuxia;ZHOU Mi;JIN Zhao(Hebei Province Key Laboratory of Machine Learning and Computational Intelligence, College of Mathematics and Information Science, Hebei University, Baoding, Hebei 071002, China)
Source
《计算机科学与探索》
CSCD
Peking University Core Journal
2017, No. 10, pp. 1662-1671 (10 pages)
Journal of Frontiers of Computer Science and Technology
Fund
Natural Science Foundation of Hebei Province (No. F2015201185)