
Related Theoretical Analysis of Diversity-Based Semi-supervised Learning
Abstract: Diversity-based semi-supervised learning combines semi-supervised learning with ensemble learning and has become a research focus in machine learning in recent years. However, related theoretical work is scarce, and none of it takes distribution noise into account. This paper first defines a hybrid of classification and distribution noise (HCAD), tailored to the characteristics of diversity-based semi-supervised learning. It then gives a probably approximately correct (PAC) analysis of diversity-based semi-supervised learning under HCAD noise, together with an application example of the theorem. Finally, based on the voting margin, an upper bound on the generalization error of multi-classifier systems under HCAD noise is derived and proved. The theoretical results can be used to design diversity-based semi-supervised learning algorithms and to evaluate their generalization ability, and they have broad application prospects.
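As background for the quantity the abstract's generalization bound is built on, the following is a minimal, generic sketch of the voting margin of a multi-classifier system on a single example: the fraction of votes for the true label minus the largest vote fraction received by any other label. The function name and example data are illustrative only, not from the paper.

```python
# Generic voting-margin sketch (illustrative; not the paper's derivation).
from collections import Counter

def voting_margin(predictions, true_label):
    """Margin of a majority-vote ensemble on one example.

    predictions: list of labels output by the individual classifiers.
    Returns (votes for true_label - max votes for any other label) / total votes,
    a value in [-1, 1]; positive means the ensemble classifies correctly.
    """
    counts = Counter(predictions)
    true_votes = counts.get(true_label, 0)
    other_votes = max(
        (c for lbl, c in counts.items() if lbl != true_label), default=0
    )
    return (true_votes - other_votes) / len(predictions)

# Example: 5 classifiers vote on one example whose true label is "A".
print(voting_margin(["A", "A", "B", "A", "C"], "A"))  # → 0.4
```

Margin-based bounds of the kind described in the abstract relate the distribution of this quantity over the training set to the ensemble's generalization error.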
Authors: Jiang Zhen (姜震), Zhan Yongzhao (詹永照)
Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), 2014, No. 10, pp. 865-872 (8 pages); indexed in EI, CSCD, and the Peking University Core list.
Funding: National Natural Science Foundation of China (No. 61170126); Jiangsu University Senior Talent Start-up Fund (No. 1291170022).
Keywords: Diversity-Based Semi-supervised Learning; Noise; Probably Approximately Correct (PAC) Analysis; Generalization Error
