Abstract
To further improve the safety of semi-supervised classifiers, a safe semi-supervised classification algorithm based on sample selection (S3C-SS) is proposed. First, S3C-SS filters the labeled samples according to their distribution information and removes the outliers among them. Second, S3C-SS screens the unlabeled samples and selects those that are unlikely to cause ambiguity as candidates for labeling. Third, the retained labeled samples are used to train a classifier, which labels the candidate unlabeled samples; the newly labeled samples are then added to the labeled set. The algorithm iterates until the unlabeled sample set is empty. Finally, experiments on UCI datasets evaluate the effectiveness of the proposed algorithm. The results show that S3C-SS can improve classification performance.
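The iterative procedure described in the abstract can be sketched roughly as follows. The abstract does not state the outlier criterion, the ambiguity measure, or the base classifier, so everything concrete here is an assumption made for illustration: outliers are labeled points farther from their class centroid than the mean distance plus `k` standard deviations, ambiguity is measured by the distance margin between the two nearest class centroids, and the classifier is a nearest-centroid rule. The names `filter_labeled`, `s3c_ss_sketch`, `k`, and `margin` are hypothetical and do not come from the paper.

```python
import math


def dist(a, b):
    """Euclidean distance between two points given as sequences of floats."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def filter_labeled(X, y, k=2.0):
    """Assumed outlier rule: drop a labeled point when its distance to its
    own class centroid exceeds mean + k * std of that class's distances."""
    keep_X, keep_y = [], []
    for c in sorted(set(y)):
        pts = [x for x, lab in zip(X, y) if lab == c]
        ctr = centroid(pts)
        d = [dist(p, ctr) for p in pts]
        mu = sum(d) / len(d)
        sd = math.sqrt(sum((v - mu) ** 2 for v in d) / len(d))
        for p, v in zip(pts, d):
            if v <= mu + k * sd + 1e-12:  # tolerance for float round-off
                keep_X.append(p)
                keep_y.append(c)
    return keep_X, keep_y


def s3c_ss_sketch(X_l, y_l, X_u, margin=1.0, k=2.0):
    """Label every unlabeled point, least ambiguous first.

    Assumes at least two classes. Returns a dict mapping the index of each
    point in X_u to its assigned label.
    """
    X_l, y_l = list(X_l), list(y_l)
    remaining = dict(enumerate(X_u))  # index -> point still unlabeled
    assigned = {}
    while remaining:
        # Step 1: remove outliers from the labeled set.
        X_l, y_l = filter_labeled(X_l, y_l, k)
        # Step 3a: "train" the assumed nearest-centroid classifier.
        classes = sorted(set(y_l))
        ctrs = {c: centroid([x for x, lab in zip(X_l, y_l) if lab == c])
                for c in classes}
        # Step 2: score ambiguity as the gap between the two nearest centroids.
        scored = {}
        for i, p in remaining.items():
            ranked = sorted((dist(p, ctrs[c]), c) for c in classes)
            scored[i] = (ranked[1][0] - ranked[0][0], ranked[0][1])
        chosen = [i for i, (m, _) in scored.items() if m >= margin]
        if not chosen:
            # No point clears the margin: take the least ambiguous one so
            # the loop still terminates with an empty unlabeled set.
            chosen = [max(scored, key=lambda i: scored[i][0])]
        # Step 3b: label the candidates and expand the labeled set.
        for i in chosen:
            label = scored[i][1]
            assigned[i] = label
            X_l.append(remaining.pop(i))
            y_l.append(label)
    return assigned
```

In this sketch, a point lying between two classes is deferred to a later iteration, when the class centroids have been re-estimated from the expanded labeled set. The fallback that labels the single least ambiguous point is a design choice of the sketch, added only to guarantee that the unlabeled set eventually empties.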
Authors
赵建华
刘宁
ZHAO Jianhua; LIU Ning (School of Mathematics and Computer Application, Shangluo University, Shangluo 726000, China; School of Economics and Management, Shangluo University, Shangluo 726000, China)
Source
System Simulation Technology (《系统仿真技术》)
2020, No. 1, pp. 7-11 (5 pages)
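Funding
Natural Science Basic Research Plan of Shaanxi Province (2015JM6347)
Shangluo University Contract Research Project (2018HXKY056)
Shangluo University Science and Technology Innovation Team Building Project (18SCX002)
Shangluo University Key Discipline Construction Project (discipline: Mathematics).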
Keywords
sample selection
semi-supervised learning
safety
classification