Semi-supervised multi-label learning algorithm based on Tri-training (cited by: 4)


Abstract: Traditional multi-label learning is supervised and requires training samples with complete class labels. When the data set is large and the number of classes is high, however, obtaining a fully labeled training set is very difficult. Within the framework of semi-supervised co-training, this paper therefore proposes a semi-supervised multi-label learning algorithm based on Tri-training (SMLT). In the learning stage, SMLT introduces a virtual class label and then, for each pair of class labels, trains a corresponding classifier with the Tri-training co-training mechanism. In the prediction stage, a new sample is fed to the trained classifiers; the vote counts of the class labels turn the multi-label learning problem into a label ranking problem, and the vote count of the virtual label serves as the threshold that splits the ranking into relevant and irrelevant labels. Comparative experiments on four commonly used multi-label data sets from UCI show that SMLT outperforms the other compared algorithms on most of the four evaluation metrics, which verifies the effectiveness of the algorithm.
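The prediction stage described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `VIRTUAL`, `count_votes`, `predict`, and `make_toy_classifier` are hypothetical, the pairwise classifiers are toy stand-ins for the Tri-training-trained classifiers, and the strict "more votes than the virtual label" rule for ties is an assumption, since the abstract does not say how ties are handled.

```python
from itertools import combinations
from typing import Callable, Dict, List, Mapping, Sequence, Tuple

VIRTUAL = "__virtual__"  # hypothetical name for the virtual class label

Sample = Mapping[str, float]
PairClassifier = Callable[[Sample], str]  # returns the winning label of its pair


def count_votes(
    x: Sample,
    labels: Sequence[str],
    pair_classifiers: Mapping[Tuple[str, str], PairClassifier],
) -> Dict[str, int]:
    """Collect one vote per label pair, including pairs with the virtual label."""
    candidates = list(labels) + [VIRTUAL]
    votes = {label: 0 for label in candidates}
    for pair in combinations(candidates, 2):
        votes[pair_classifiers[pair](x)] += 1
    return votes


def predict(
    x: Sample,
    labels: Sequence[str],
    pair_classifiers: Mapping[Tuple[str, str], PairClassifier],
) -> Tuple[List[str], List[str]]:
    """Rank real labels by votes; keep those that beat the virtual label."""
    votes = count_votes(x, labels, pair_classifiers)
    ranking = sorted(labels, key=lambda label: votes[label], reverse=True)
    # Assumption: a label is relevant only if it has strictly more votes
    # than the virtual label.
    relevant = [label for label in ranking if votes[label] > votes[VIRTUAL]]
    return ranking, relevant


def make_toy_classifier(a: str, b: str) -> PairClassifier:
    """Toy stand-in for a trained pairwise classifier: the higher score wins."""
    def classify(scores: Sample) -> str:
        return a if scores[a] >= scores[b] else b
    return classify


LABELS = ["sky", "sea", "city"]
classifiers = {
    pair: make_toy_classifier(*pair)
    for pair in combinations(LABELS + [VIRTUAL], 2)
}
# Toy sample: per-label scores, with a fixed score for the virtual label.
sample = {"sky": 0.9, "sea": 0.6, "city": 0.2, VIRTUAL: 0.5}

ranking, relevant = predict(sample, LABELS, classifiers)
print(ranking)   # ['sky', 'sea', 'city']
print(relevant)  # ['sky', 'sea']
```

In this toy run the votes are sky 3, sea 2, virtual 1, city 0, so the virtual label splits the ranking between "sea" and "city". In SMLT each pairwise classifier would instead be obtained with Tri-training from labeled and unlabeled samples.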

Authors: LIU Yanglei (刘杨磊), LIANG Jiye (梁吉业), GAO Jiawei (高嘉伟), YANG Jing (杨静)

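Affiliations: School of Computer and Information Technology, Shanxi University; Key Laboratory of Computational Intelligence and Chinese Information Processing of the Ministry of Education, Shanxi University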

Source: CAAI Transactions on Intelligent Systems (《智能系统学报》), indexed in CSCD and the Peking University Core Journals list, 2013, No. 5, pp. 439-445 (7 pages)

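Funding: National "973" Program preliminary research special project (2011CB311805); Shanxi Province Science and Technology Key Research Program (20110321027-01); Shanxi Province Science and Technology Infrastructure Platform Construction Project (2012091002-0101)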

Keywords: multi-label learning; semi-supervised learning; Tri-training

Classification number: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]
