Robustness Analysis of Local Learning Algorithm Based on Nearest Neighbor


Abstract: In statistical inference, robustness means that when the real data source departs from the model assumptions, the results of the algorithm are only slightly perturbed and its predictive performance is preserved. This paper introduces the research methods of statistical robustness into machine learning. The analysis shows that nearest neighbor estimation, a kind of local learning, converges to the Bayes optimal estimate in the large-sample case, and that the convergence condition also shows nearest neighbor estimation to be a robust estimator. Experiments on synthetic and real datasets show that the generalization performance of supervised prediction is maintained even when some outliers affect the model.
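The claim in the abstract can be illustrated with a small simulation. The sketch below is not the authors' experiment: the two-Gaussian data, the 10% label-flip contamination, the sample sizes, and the choice k = 31 are all assumptions made for illustration, and every function name is hypothetical. It compares a k-nearest-neighbor majority vote trained on clean and on contaminated labels against the Bayes rule for this toy problem (predict class 1 when x > 0).

```python
import random
from heapq import nsmallest


def make_data(n, rng):
    """Draw n labelled points: class 0 ~ N(-1, 1), class 1 ~ N(+1, 1)."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = rng.gauss(1.0 if y == 1 else -1.0, 1.0)
        data.append((x, y))
    return data


def flip_labels(data, rate, rng):
    """Return a copy of data in which each label is flipped with probability `rate`."""
    return [(x, 1 - y) if rng.random() < rate else (x, y) for x, y in data]


def knn_predict(train, x, k):
    """Majority vote among the k training points nearest to x."""
    neighbors = nsmallest(k, train, key=lambda p: abs(p[0] - x))
    votes = sum(y for _, y in neighbors)
    return 1 if 2 * votes > k else 0


def accuracy(train, test, k):
    """Fraction of test points whose k-NN prediction matches the true label."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


rng = random.Random(0)                              # fixed seed (assumed) for repeatability
train_clean = make_data(1000, rng)
train_noisy = flip_labels(train_clean, 0.10, rng)   # assumed 10% contamination
test = make_data(500, rng)                          # test labels are left clean
k = 31                                              # odd k avoids ties; value is an assumption

acc_clean = accuracy(train_clean, test, k)
acc_noisy = accuracy(train_noisy, test, k)
# Bayes rule for this toy model: predict class 1 exactly when x > 0.
bayes_acc = sum((x > 0) == (y == 1) for x, y in test) / len(test)

print(f"Bayes rule: {bayes_acc:.3f}")
print(f"k-NN, clean labels: {acc_clean:.3f}")
print(f"k-NN, 10% flipped labels: {acc_noisy:.3f}")
```

Under these assumptions, the vote over many neighbors averages out a minority of flipped labels, so the two k-NN accuracies should stay close to each other and to the Bayes accuracy. Setting k = 1 makes the estimate follow individual contaminated points much more closely.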

Authors: Bi Hua, Wang Jue

Affiliation: Key Laboratory of Complex Systems and Intelligence Science, Institute of Automation, Chinese Academy of Sciences

Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), 2008, No. 6, pp. 768-774 (7 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.

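Funding: Supported by the National Key Basic Research Development Program of China (No. 2004CB318103) and the National Natural Science Foundation of China (No. 60573078)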

Keywords: Local Learning, Robustness, Noisy Data

Classification: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]



