Abstract
Since online learning was first proposed, many algorithms and refinements have been developed. Depending on whether the model is linear or nonlinear, online learning algorithms fall into two categories: online linear learning algorithms and kernel-based online learning algorithms. Kernel methods use a feature map to send samples into a high-dimensional reproducing kernel Hilbert space, so that a linearly inseparable problem becomes a linearly separable one. The parallel projection algorithm is derived from the adaptive projected subgradient method (APSM) and has the advantage of fast convergence. Its sparsification scheme has a drawback, however: once the model reaches a given capacity, some important data points at the far end may be deleted, which severely degrades performance. In addition, the presence of an offset term worsens the classification results. To address these problems, this paper proposes a new algorithm based on the parallel projection algorithm. Its sparsification scheme projects the newest data onto the data in the dictionary, so that the dictionary stays small while the misclassification rate remains low. Experiments on simulated and real data show that, compared with seven other classical online classification algorithms, the proposed algorithm achieves the lowest misclassification rate when the dictionary size is small.
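The idea of projecting the newest sample onto the dictionary can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper, whose details the abstract does not give. It is a generic projection-residual test under assumed choices: a Gaussian kernel, a fixed `threshold`, and the hypothetical helper names `projection_residual` and `update_dictionary`.

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel; the kernel and sigma are assumptions, not from the paper."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2)))

def projection_residual(dictionary, x, sigma=1.0, jitter=1e-10):
    """Squared RKHS distance from phi(x) to the span of the dictionary's feature maps."""
    k_xx = gaussian_kernel(x, x, sigma)
    if not dictionary:
        return k_xx
    # Gram matrix of the dictionary and kernel vector between dictionary and x.
    K = np.array([[gaussian_kernel(a, b, sigma) for b in dictionary] for a in dictionary])
    k = np.array([gaussian_kernel(d, x, sigma) for d in dictionary])
    # Coefficients of the orthogonal projection; jitter keeps the solve well conditioned.
    coeffs = np.linalg.solve(K + jitter * np.eye(len(dictionary)), k)
    return max(k_xx - float(k @ coeffs), 0.0)

def update_dictionary(dictionary, x, threshold=0.1, sigma=1.0):
    """Hypothetical rule: store x only if the dictionary represents it poorly."""
    if projection_residual(dictionary, x, sigma) > threshold:
        dictionary.append(np.asarray(x, dtype=float))
        return True
    return False

if __name__ == "__main__":
    dictionary = []
    stream = [[0.0, 0.0], [0.01, 0.0], [3.0, 3.0], [3.0, 3.01]]
    for x in stream:
        added = update_dictionary(dictionary, x)
        print(x, "added" if added else "skipped")
    print("dictionary size:", len(dictionary))
```

In this sketch a sample that lies almost in the span of the stored samples is skipped, which is one way a dictionary can stay small without discarding older points by position alone.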
Source
Advances in Applied Mathematics (《应用数学进展》), 2023, No. 9, pp. 4066-4075 (10 pages)