Semi-supervised Self-training Method Based on Density Peaks Clustering and Relative Distance
Abstract: Semi-supervised self-training is a semi-supervised self-labeling approach that trains a classifier on labeled and unlabeled samples together. Mislabeling, however, is a problem that self-training methods cannot ignore. To address it, this paper proposes a semi-supervised self-training method based on density peaks clustering and relative distance (STDPRD). During iterative self-training, STDPRD first uses density peaks clustering to select high-confidence unlabeled samples and labels them; second, it uses relative distance to filter out samples that were mislabeled during the iteration; third, it adds the samples that were labeled correctly to the labeled set; finally, it trains the given classifier on the expanded labeled set and outputs the trained classifier once training is complete. Simulation experiments show that, on real-world datasets, STDPRD outperforms four popular semi-supervised self-training methods.
Authors: Sun Jie; Jing Zhimin; Zhou Huan (School of Intelligent Equipment, Chongqing Vocational College of Public Transportation, Chongqing 402247, China; School of Automotive Engineering, Chongqing Energy College, Chongqing 402260, China)
Source: Statistics & Decision (《统计与决策》), Peking University Core Journal, 2024, No. 17, pp. 53-58 (6 pages)
Keywords: semi-supervised learning; semi-supervised classification; relative distance; mislabeling
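
The abstract describes the four steps of the method only in outline, so the following is a minimal sketch of such a loop, not the authors' implementation. Everything concrete in it is an assumption made for illustration: the Gaussian-kernel density and the cutoff `dc`, the rule that a candidate is an unlabeled sample linked to a labeled one through its nearest higher-density neighbor, the nearest-centroid stand-in for the "given classifier", and in particular the `relative_distance` ratio (distance to the nearest same-label sample over distance to the nearest other-label sample) with its threshold `ratio_max`. The paper's own definitions may differ.

```python
import numpy as np


def pairwise_dist(X):
    """Euclidean distance matrix between all rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def density_peaks_parents(D, dc):
    """Local density and nearest higher-density neighbor (-1 if none)."""
    rho = np.exp(-(D / dc) ** 2).sum(axis=1) - 1.0  # subtract the self term
    parent = np.full(len(rho), -1)
    for i in range(len(rho)):
        higher = np.flatnonzero(rho > rho[i])
        if higher.size:
            parent[i] = higher[np.argmin(D[i, higher])]
    return rho, parent


def centroid_predict(X_lab, y_lab, X_new):
    """Nearest-centroid classifier (assumed stand-in for the given classifier)."""
    classes = np.unique(y_lab)
    centroids = np.stack([X_lab[y_lab == c].mean(axis=0) for c in classes])
    d = np.linalg.norm(X_new[:, None, :] - centroids[None, :, :], axis=-1)
    return classes[np.argmin(d, axis=1)]


def relative_distance(D, i, label, labeled, y):
    """Hypothetical ratio: nearest same-label / nearest other-label distance."""
    same = labeled[y[labeled] == label]
    other = labeled[y[labeled] != label]
    if other.size == 0:
        return 0.0  # only one class is labeled, nothing to compare against
    return D[i, same].min() / (D[i, other].min() + 1e-12)


def self_train(X, y_init, dc=0.5, ratio_max=1.0, max_iter=50):
    """Expand the labeled set; y_init uses -1 for unlabeled samples."""
    y = np.asarray(y_init).copy()
    D = pairwise_dist(X)
    _, parent = density_peaks_parents(D, dc)
    has_parent = parent >= 0
    for _ in range(max_iter):
        is_lab = y >= 0
        labeled = np.flatnonzero(is_lab)
        # Step 1: candidates are unlabeled samples whose density parent is
        # labeled, or which are the density parent of a labeled sample.
        cand_mask = np.zeros(len(y), dtype=bool)
        cand_mask[has_parent] = is_lab[parent[has_parent]]
        up = parent[labeled]
        cand_mask[up[up >= 0]] = True
        cand = np.flatnonzero(cand_mask & ~is_lab)
        if cand.size == 0:
            break
        pred = centroid_predict(X[labeled], y[labeled], X[cand])
        # Steps 2 and 3: keep only predictions that pass the distance filter.
        accepted = False
        for i, lab in zip(cand, pred):
            if relative_distance(D, i, lab, labeled, y) < ratio_max:
                y[i] = lab
                accepted = True
        if not accepted:
            break
    # Step 4: the caller trains the final classifier on samples with y >= 0.
    return y
```

A typical call would pass a feature matrix `X` and a label vector in which unlabeled samples are marked `-1`; samples that still carry `-1` on return were either never reached through the density structure or were rejected by the filter. Rejecting a pseudo-label rather than forcing one is the design choice that targets the mislabeling problem the abstract highlights.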