Abstract
Most partial label learning methods assume that every training sample comes with a candidate label set, yet many real-world applications also contain large numbers of unlabeled samples. How to build a learning model that exploits the information in both partially labeled and unlabeled samples is the key problem in partial label semi-supervised learning. For image classification with only a small number of labeled and partially labeled samples and a large number of unlabeled samples, this paper builds a deep learning model using consistency regularization and pseudo-labeling. For partially labeled and unlabeled samples, pseudo-labels are generated from the model outputs on their weak augmentations, and the pseudo-labels of partially labeled samples are restricted to their candidate label sets. A new loss function with three terms is designed to use the supervised, weakly supervised, and unsupervised information in the data simultaneously. To improve the reliability of the samples involved in training, only samples with high-confidence pseudo-labels are used to compute the cross-entropy loss between the outputs of the two kinds of augmentation. Experimental results show that the proposed method (CR-SSPL, also written CR-PSSL) achieves higher accuracy and stability than FlexMatch, a state-of-the-art semi-supervised learning method, and than representative partial label learning methods, and that it also converges noticeably faster.
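The training objective summarized in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the fixed confidence threshold of 0.95, the renormalization over the candidate set, the equal weighting of the three loss terms, and the reading of the "two kinds of augmentation" as a weak and a strong view are all assumptions made for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def pseudo_label(weak_probs, candidates=None):
    """Return (label, confidence) from the weak-augmentation prediction.

    Assumption: for a partially labeled sample, probability mass outside
    the candidate set is zeroed and the rest renormalized, so the
    pseudo-label always lies inside the candidate set.
    """
    if candidates is not None:
        masked = [p if k in candidates else 0.0
                  for k, p in enumerate(weak_probs)]
        total = sum(masked)
        weak_probs = [p / total for p in masked]
    label = max(range(len(weak_probs)), key=lambda k: weak_probs[k])
    return label, weak_probs[label]

def cross_entropy(probs, label):
    """Cross-entropy of a predicted distribution against a hard label."""
    return -math.log(probs[label])

def consistency_loss(weak_logits, strong_logits, candidates=None,
                     threshold=0.95):
    """Cross-entropy between the weak-view pseudo-label and the strong view.

    Samples whose pseudo-label confidence is below `threshold`
    (an assumed fixed value) contribute zero loss.
    """
    label, conf = pseudo_label(softmax(weak_logits), candidates)
    if conf < threshold:
        return 0.0
    return cross_entropy(softmax(strong_logits), label)

def mean(values):
    return sum(values) / len(values) if values else 0.0

def total_loss(labeled, partial, unlabeled, lam_p=1.0, lam_u=1.0):
    """Three-term loss: supervised + weakly supervised + unsupervised.

    labeled:   list of (logits, true_label)
    partial:   list of (weak_logits, strong_logits, candidate_set)
    unlabeled: list of (weak_logits, strong_logits)
    The weights lam_p and lam_u are hypothetical hyperparameters.
    """
    sup = mean([cross_entropy(softmax(z), y) for z, y in labeled])
    par = mean([consistency_loss(w, s, c) for w, s, c in partial])
    uns = mean([consistency_loss(w, s) for w, s in unlabeled])
    return sup + lam_p * par + lam_u * uns
```

In this sketch a partially labeled sample differs from an unlabeled one only in the candidate mask applied before the pseudo-label is chosen, which is one simple way to realize the constraint described in the abstract.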
Authors
ZHU Biao (祝彪)
LI Yan (李艳)
WANG Shuo (王硕)
College of Mathematics and Information Science, Hebei University, Baoding, Hebei 071002, China; School of Applied Mathematics, Beijing Normal University at Zhuhai, Zhuhai, Guangdong 519000, China
Source
Journal of Southwest University (Natural Science Edition) (《西南大学学报(自然科学版)》)
CAS
CSCD
Peking University Core Journals (北大核心)
2024, No. 5, pp. 27-39 (13 pages)
Funding
National Natural Science Foundation of China (61976141)
Natural Science Foundation of Hebei Province, General Program (F2021201055)
Keywords
partial label learning
semi-supervised learning
consistency regularization
pseudo-labeling method
image classification
deep learning