
The ADP-Tri-Training: Tri-Training with Adaptive Editing and Probability Parameters (cited by: 1)
Abstract: Semi-supervised learning trains on a small amount of labeled data together with a large amount of unlabeled data. Tri-training is a disagreement-based semi-supervised classification algorithm. During pseudo-labeling, mislabeled samples introduce noise into the training set, which degrades classification performance. To reduce this effect, the paper proposes ADP-Tri-training (ADPT), a Tri-training algorithm with adaptive editing and probability parameters. For labeled data that triggers the adaptive editing strategy, the algorithm applies the nearest-neighbor-based RemoveOnly data editing technique to identify and remove noise; for labeled data that does not trigger the strategy, it uses a probability parameter method to identify and remove noise. To verify classification performance, experiments were run on 9 UCI datasets with four evaluation metrics, and the results were compared against related algorithms. The experiments show that the algorithm has a clear advantage over the other algorithms in accuracy, precision, recall, and F-measure.
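The nearest-neighbor editing idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' ADPT implementation: the abstract does not specify the adaptive trigger, the probability parameter, or the exact RemoveOnly rule, so the code below assumes a simple variant in which a pseudo-labeled sample is discarded (never relabeled) when its label disagrees with the majority label of its k nearest labeled neighbors. The function names, the value of k, and the toy data are all hypothetical.

```python
from collections import Counter
import math


def knn_majority_label(x, labeled, k=3):
    """Return the majority label among the k labeled points nearest to x."""
    # Sort labeled points by Euclidean distance to x and keep the k closest.
    nearest = sorted(labeled, key=lambda item: math.dist(x, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def remove_only_edit(pseudo_labeled, labeled, k=3):
    """Assumed remove-only rule: keep a pseudo-labeled sample only if its
    label matches the majority label of its k nearest labeled neighbors.
    Suspect samples are dropped, not relabeled."""
    return [
        (x, y)
        for x, y in pseudo_labeled
        if knn_majority_label(x, labeled, k) == y
    ]


# Toy data (hypothetical): two well-separated classes in 2-D.
labeled = [
    ((0.0, 0.0), 0), ((0.1, 0.2), 0), ((0.2, 0.1), 0),
    ((1.0, 1.0), 1), ((0.9, 1.1), 1), ((1.1, 0.9), 1),
]

# Pseudo-labels as a Tri-training round might produce them.
pseudo = [
    ((0.05, 0.1), 0),   # agrees with its neighbors
    ((0.95, 1.0), 0),   # sits among class-1 points, so likely mislabeled
    ((1.0, 0.95), 1),   # agrees with its neighbors
]

kept = remove_only_edit(pseudo, labeled)
print(kept)
```

In this sketch the second pseudo-labeled point is filtered out because all three of its nearest labeled neighbors belong to class 1. Removing rather than relabeling keeps the edit conservative: a wrong neighbor vote costs one training sample instead of injecting a new wrong label.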
Authors: LI Song; WU Runxiu; KANG Ping; ZHAO Jia (School of Information Engineering, Nanchang Institute of Technology, Nanchang, Jiangxi 330099, China)
Source: Journal of Jiangxi Normal University (Natural Science Edition), 2023, No. 5, pp. 490-496 (7 pages). Indexed in: CAS; Peking University Core Journals.
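Funding: Supported by the National Natural Science Foundation of China (52069014) and the Science and Technology Program of the Jiangxi Provincial Department of Education (GJJ180940, GJJ201915).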
Keywords: semi-supervised learning; adaptive strategy; probability parameter; Tri-training
