A Tri-training-Based Label Noise Correction Algorithm for Crowdsourcing

Cited by: 1

Abstract: In crowdsourcing learning, the integrated labels produced by label integration (ground truth inference) algorithms still contain a certain level of label noise. Inspired by the tri-training idea, this paper proposes a tri-training-based label noise correction (TTLNC) algorithm for crowdsourcing. TTLNC first applies a filter to split the data into a clean set and a noisy set, then trains three different classifiers on bagged samples of the clean set. Each instance in the noisy set is relabeled by these classifiers and assigned to the corresponding training set according to a designed instance assignment strategy. Finally, the three classifiers are retrained on the three new training sets and used to relabel all instances. Experimental results on both simulated benchmark data and real-world crowdsourced data show that TTLNC significantly outperforms four other state-of-the-art noise correction algorithms in terms of noise ratio and model quality.
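The four steps in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say which filter, base classifier, or instance assignment strategy TTLNC uses, so the sketch substitutes a cross-validated classification filter, a toy nearest-centroid learner, and the classic tri-training rule (an instance joins a classifier's training set when the other two classifiers agree on its label). The names `NearestCentroid`, `split_clean_noisy`, and `ttlnc_sketch` are hypothetical.

```python
import random
from collections import Counter


class NearestCentroid:
    """Toy base learner (assumption: the abstract does not name the base classifier)."""

    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for j, v in enumerate(x):
                acc[j] += v
        self.centroids = {c: [s / counts[c] for s in acc] for c, acc in sums.items()}
        return self

    def predict_one(self, x):
        # Return the class whose centroid is closest in squared Euclidean distance.
        return min(self.centroids,
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(x, self.centroids[c])))


def split_clean_noisy(X, y, folds=3):
    """Step 1 (assumed filter): an instance is 'noisy' when a model trained on the
    other folds disagrees with its integrated label."""
    clean, noisy = [], []
    for f in range(folds):
        train = [i for i in range(len(X)) if i % folds != f]
        model = NearestCentroid().fit([X[i] for i in train], [y[i] for i in train])
        for i in range(f, len(X), folds):
            (clean if model.predict_one(X[i]) == y[i] else noisy).append(i)
    return sorted(clean), sorted(noisy)


def ttlnc_sketch(X, y, seed=0):
    rng = random.Random(seed)
    clean, noisy = split_clean_noisy(X, y)

    # Step 2: bagging, i.e. three bootstrap samples of the clean set, one classifier each.
    train_sets = [[(X[i], y[i]) for i in (rng.choice(clean) for _ in clean)]
                  for _ in range(3)]
    models = [NearestCentroid().fit(*zip(*ts)) for ts in train_sets]

    # Step 3: relabel noisy instances and assign them to training sets.
    # Assumed strategy (classic tri-training): classifier k receives an instance
    # when the other two classifiers agree on its label.
    for i in noisy:
        votes = [m.predict_one(X[i]) for m in models]
        for k in range(3):
            a, b = (votes[j] for j in range(3) if j != k)
            if a == b:
                train_sets[k].append((X[i], a))

    # Step 4: retrain on the new training sets and relabel all instances by majority vote.
    models = [NearestCentroid().fit(*zip(*ts)) for ts in train_sets]
    return [Counter(m.predict_one(x) for m in models).most_common(1)[0][0] for x in X]


# Toy data: two well-separated clusters; the integrated labels at indices 3 and 10 are flipped.
X = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2), (0.2, 0.2), (0.1, 0.1),
     (5.0, 5.1), (5.2, 4.9), (4.9, 5.3), (5.1, 5.0), (5.3, 5.2), (5.0, 4.8)]
y = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1]
corrected = ttlnc_sketch(X, y)
print(corrected)
```

On this toy data the filter flags the two flipped labels as noisy, and the retrained classifiers should assign them back to their clusters. Real crowdsourced data would call for a stronger base learner and the filter and assignment strategy defined in the paper itself.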

Authors: YANG Yi; JIANG Liang-xiao; LI Chao-qun; LI Hong-wei (School of Computer Science, China University of Geosciences, Wuhan, Hubei 430074, China; Hubei Key Laboratory of Intelligent Geo-Information Processing, China University of Geosciences, Wuhan, Hubei 430074, China; School of Mathematics and Physics, China University of Geosciences, Wuhan, Hubei 430074, China)

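Affiliations: School of Computer Science, China University of Geosciences; Hubei Key Laboratory of Intelligent Geo-Information Processing (China University of Geosciences); School of Mathematics and Physics, China University of Geosciences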

Source: Acta Electronica Sinica (《电子学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2021, Issue 3, pp. 424-434 (11 pages)

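Funding: Joint Funds of the National Natural Science Foundation of China (No. U1711267); Fundamental Research Funds for the Central Universities (No. CUGGC03).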

Keywords: crowdsourcing learning; tri-training; integrated labels; label noise; noise correction; noise filtering

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

