Abstract
Real-world datasets are often nonlinearly distributed. Many least squares support vector machine (LS-SVM) methods model such data with a divide-and-conquer strategy, but because the sub-models lack robustness, the resulting overall model is easily disturbed by noise and can fail. To model noisy datasets, this paper proposes a robust clustering-based LS-SVM. First, a clustering method divides the samples into several sub-datasets, and a local robust LS-SVM is built on each sub-dataset to capture its local dynamics. Second, a global regularization term is added to the loss function so that the local sub-models are coordinated with one another, which ensures that the global model is smooth and continuous while retaining good generalization and robustness. Experiments on artificial and real-world cases show that the proposed method achieves better modeling results on noisy sample sets.
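The divide-and-conquer idea described in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything here is an assumption: the code uses the standard (non-robust) LS-SVM regression linear system for each cluster, a plain k-means partition, and a softmax-style distance weighting to blend the local models smoothly. That blending is only a stand-in for the paper's global regularization term and robust loss, which are not reproduced. All function names and parameters (`fit_lssvm`, `fit_clustered`, `predict_clustered`, `gamma`, `sigma`, `blend_width`) are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fit_lssvm(X, y, gamma, sigma):
    """Standard LS-SVM regression: solve the (n+1) x (n+1) linear system
    [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]  # bias b, dual coefficients alpha

def kmeans(X, k, iters=50, seed=0):
    """Plain k-means; returns cluster centers and hard labels."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, d2.argmin(axis=1)

def fit_clustered(X, y, k=3, gamma=100.0, sigma=0.5):
    """Fit one local LS-SVM per non-empty cluster."""
    centers, labels = kmeans(X, k)
    models = []
    for j in range(k):
        mask = labels == j
        if mask.any():
            b, alpha = fit_lssvm(X[mask], y[mask], gamma, sigma)
            models.append((centers[j], X[mask], b, alpha, sigma))
    return models

def predict_clustered(models, Xq, blend_width=0.5):
    """Blend local predictions with normalized distance-based weights
    (an assumed substitute for the paper's global coordination)."""
    centers = np.stack([m[0] for m in models])
    d2 = ((Xq[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    logits = -d2 / (2.0 * blend_width ** 2)
    logits -= logits.max(axis=1, keepdims=True)  # avoid underflow to all zeros
    w = np.exp(logits)
    w /= w.sum(axis=1, keepdims=True)
    local = np.stack(
        [rbf_kernel(Xq, Xj, s) @ alpha + b for _, Xj, b, alpha, s in models],
        axis=1,
    )
    return (w * local).sum(axis=1)

# Toy usage: a noisy sine curve (synthetic data, not from the paper)
rng = np.random.default_rng(1)
X = np.linspace(0.0, 2.0 * np.pi, 120).reshape(-1, 1)
y_clean = np.sin(X[:, 0])
y = y_clean + 0.05 * rng.standard_normal(120)
models = fit_clustered(X, y)
pred = predict_clustered(models, X)
rmse = float(np.sqrt(np.mean((pred - y_clean) ** 2)))
print("RMSE against the noise-free curve:", rmse)
```

Because the blending weights vary continuously with the query point, the combined prediction is continuous across cluster boundaries, which mirrors the smoothness goal stated in the abstract. Robustness to outliers would additionally require a robust loss or reweighting step in `fit_lssvm`.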
Source
《井冈山大学学报(自然科学版)》
2017, Issue 3, pp. 58-63 (6 pages)
Journal of Jinggangshan University (Natural Science)
Funding
Central South University Independent Exploration and Innovation Project for Master's Students (2016zzts306)