A Safe Semi-supervised Classification Algorithm Based on Sample Selection
Abstract: To further improve the safety of semi-supervised classifiers, a safe semi-supervised classification algorithm based on sample selection (S3C-SS) is proposed. First, S3C-SS filters the labeled samples according to their distribution and removes the outliers among them. Second, S3C-SS screens the unlabeled samples and selects those that are unlikely to cause ambiguity as candidates for labeling. Next, a classifier trained on the retained labeled samples labels the candidate unlabeled samples, and the newly labeled samples are added to the labeled set. The algorithm iterates until the unlabeled sample set is empty. Finally, experiments on UCI datasets evaluate the effectiveness of the proposed algorithm. The results show that S3C-SS can improve classification performance.
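The loop described in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how outliers are detected, how ambiguity is measured, or which base classifier is used. The sketch assumes a distance-to-class-centroid threshold for outliers, the margin between the two nearest class centroids as the ambiguity score, and a nearest-centroid classifier. All function names and the parameters `batch` and `z` are hypothetical.

```python
import math
from statistics import mean, pstdev


def euclid(p, q):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def centroids(labeled):
    """Map each class label to the mean point of its samples."""
    groups = {}
    for x, y in labeled:
        groups.setdefault(y, []).append(x)
    return {y: tuple(sum(col) / len(xs) for col in zip(*xs))
            for y, xs in groups.items()}


def filter_outliers(labeled, z=1.5):
    """ASSUMED outlier rule: drop a sample whose distance to its class
    centroid exceeds mean + z * std of the distances within that class."""
    cents = centroids(labeled)
    dists = {}
    for x, y in labeled:
        dists.setdefault(y, []).append(euclid(x, cents[y]))
    limit = {y: mean(d) + z * pstdev(d) for y, d in dists.items()}
    return [(x, y) for x, y in labeled if euclid(x, cents[y]) <= limit[y]]


def s3c_ss_sketch(labeled, unlabeled, batch=1, z=1.5):
    """Iterate: filter labeled set, pick the least ambiguous unlabeled
    samples, label them, and grow the labeled set until the pool is empty."""
    if not labeled or batch < 1:
        raise ValueError("need at least one labeled sample and batch >= 1")
    labeled, pool, pseudo = list(labeled), list(unlabeled), []
    while pool:
        labeled = filter_outliers(labeled, z)      # step 1: remove outliers
        cents = centroids(labeled)                 # nearest-centroid "classifier"
        scored = []
        for i, x in enumerate(pool):
            ranked = sorted((euclid(x, c), y) for y, c in cents.items())
            # ASSUMED ambiguity score: gap between the two closest centroids
            margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else math.inf
            scored.append((margin, i, ranked[0][1]))
        scored.sort(key=lambda t: t[0], reverse=True)  # least ambiguous first
        chosen = scored[:batch]                    # step 2: candidate samples
        for _, i, y in chosen:                     # step 3: label and expand
            labeled.append((pool[i], y))
            pseudo.append((pool[i], y))
        taken = {i for _, i, _ in chosen}
        pool = [x for i, x in enumerate(pool) if i not in taken]
    return labeled, pseudo
```

Calling `s3c_ss_sketch(labeled, unlabeled)` with `labeled` as a list of `(point, label)` pairs and `unlabeled` as a list of points returns the final labeled set and the pseudo-labels in the order they were assigned. Labeling only the widest-margin samples in each round is one way to keep early mistakes from propagating, which is the safety concern the abstract raises.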
Authors: ZHAO Jianhua (赵建华); LIU Ning (刘宁) (School of Mathematics and Computer Application, Shangluo University, Shangluo 726000, China; School of Economics and Management, Shangluo University, Shangluo 726000, China)
Source: System Simulation Technology (《系统仿真技术》), 2020, No. 1, pp. 7-11 (5 pages)
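Funding: Natural Science Basic Research Plan of Shaanxi Province (2015JM6347); Shangluo University Horizontal Project (2018HXKY056); Shangluo University Science and Technology Innovation Team Building Project (18SCX002); Shangluo University Key Discipline Construction Project (discipline: Mathematics).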
Keywords: sample selection; semi-supervised learning; safety; classification
