Robustness Analysis of Local Learning Algorithm Based on Nearest Neighbor (一种近邻局部学习的稳健性分析)
Abstract: In statistical inference, robustness means that when the data of a real problem depart from the model assumptions, the results of the chosen algorithm are only slightly perturbed and its predictive performance is preserved. This paper introduces the research methods of statistical robustness into machine learning. The analysis shows that nearest neighbor estimation, a form of local learning, converges to the Bayes optimal estimate in the large-sample case, and that the convergence conditions indicate that nearest neighbor estimation is a robust estimator. Experiments on synthetic data and real datasets show that the generalization performance of supervised prediction is maintained when some outliers affect the model.
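The locality argument in the abstract can be illustrated with a small sketch. This is not the paper's experiment or proof. It is a toy example under assumed settings: a linear target `y = 2x`, Gaussian noise with standard deviation 0.1, `k = 15`, and a contamination that shifts every response with `x > 5` by 30. All function names and parameters here are hypothetical. The sketch compares a k-nearest-neighbor average with a global least-squares line at a query point far from the outliers.

```python
import random


def knn_predict(xs, ys, x0, k):
    """Average the responses of the k training points closest to x0."""
    order = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x0))
    nearest = order[:k]
    return sum(ys[i] for i in nearest) / k


def least_squares_predict(xs, ys, x0):
    """Fit y = a + b*x to all points by ordinary least squares, evaluate at x0."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y + slope * (x0 - mean_x)


def make_data(seed=0):
    """Return a grid on [0, 6], clean responses, and contaminated responses."""
    rng = random.Random(seed)
    xs = [i / 50 for i in range(301)]
    clean = [2.0 * x + rng.gauss(0.0, 0.1) for x in xs]
    # Assumed contamination: shift every response with x > 5 far off the line.
    contaminated = [y + 30.0 if x > 5.0 else y for x, y in zip(xs, clean)]
    return xs, clean, contaminated


xs, clean, contaminated = make_data()
x0, k = 1.0, 15  # query point far from the contaminated region

knn_clean = knn_predict(xs, clean, x0, k)
knn_dirty = knn_predict(xs, contaminated, x0, k)
ols_clean = least_squares_predict(xs, clean, x0)
ols_dirty = least_squares_predict(xs, contaminated, x0)

print("k-NN estimate, clean vs contaminated:", knn_clean, knn_dirty)
print("least squares, clean vs contaminated:", ols_clean, ols_dirty)
```

In this setup the k-NN estimate at `x0` is unchanged by the contamination, because none of the shifted points fall among its 15 nearest neighbors, while the global line moves with the outliers. The sketch only shows the locality effect. Outliers that land inside the neighborhood would still affect a plain neighbor average.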
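Authors: Bi Hua (毕华), Wang Jue (王珏)
Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), indexed in EI, CSCD, and the Peking University Core Journals list; 2008, No. 6, pp. 768-774 (7 pages)
Funding: Supported by the National Key Basic Research Development Program of China (No. 2004CB318103) and the National Natural Science Foundation of China (No. 60573078)
Keywords: Local Learning, Robustness, Noisy Data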