
Martingale Application in Selection of Learning Samples (鞅在学习样本选择中的应用)

Cited by: 1
Abstract: The choice of training sample set strongly affects the classification accuracy and generalization ability of a neural network, and it equally affects the bias-variance dilemma in regression analysis. The classical assumption of simple random sampling is hard to satisfy in practice: noise and limited domain knowledge make the relationships among data complex, and the influence of outliers in particular cannot be ignored. When learning from a finite sample set, the quality of the result therefore depends not only on the algorithm but also on how the sample set is selected. This paper first argues, from the mathematical theory of learning, that a method for selecting the training sample set is necessary. It then proposes a martingale requirement for sample selection and a definition of outliers in the training sample set. Finally, for unsupervised learning on a finite sample set drawn from a mixture density with an unknown number of classes, it proposes a clustering and outlier-detection algorithm. Experiments on simulated data indicate that the algorithm is feasible and effective at detecting samples produced by different mechanisms, namely outliers.
Source: Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2006, No. 18, pp. 47-49 (3 pages)
Funding: Supported by the Anhui Province Research Funding Program for Young Teachers in Higher Education (Grant No. 2004jq103)
Keywords: neural networks, regression analysis, martingale, outliers, unsupervised learning

