Active Learning for Multi-label Classification Based on SVM's Expect Margin (基于SVM期望间隔的多标签分类的主动学习)

Cited by: 7


Abstract: Classification is one of the core techniques in data mining. Training a well-performing classifier requires a large number of labeled training samples, but labeling samples is resource-intensive, and labeling multi-label samples is harder still. To keep the labeling cost as low as possible, the samples that best represent the class information must be identified. In SVM-based classification, a larger classifier margin corresponds to poorer classification accuracy. This paper proposes an active learning method based on the expected margin: given the current classifier, it selects the samples that shrink the classification margin fastest. Experiments show that the expected-margin strategy learns more effectively than strategies based on decision values or on posterior probabilities.
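The selection rule described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the record gives no formulas, so the expected margin is assumed here to be the posterior-weighted average of the margins obtained after retraining with the candidate labeled positive and negative, the per-label scores are assumed to be summed across the binary classifiers, and the posterior is assumed to come from a Platt-style sigmoid. All function names and parameters are hypothetical.

```python
import math


def platt_probability(decision_value, a=-1.0, b=0.0):
    """Map an SVM decision value to P(y = +1) with a Platt-style sigmoid.

    a and b are normally fitted on held-out data; the defaults here are
    placeholders chosen only for illustration.
    """
    return 1.0 / (1.0 + math.exp(a * decision_value + b))


def expected_margin(p_pos, margin_if_pos, margin_if_neg):
    """Assumed definition: posterior-weighted margin after adding the sample."""
    return p_pos * margin_if_pos + (1.0 - p_pos) * margin_if_neg


def select_by_expected_margin(candidates):
    """Pick the unlabeled sample with the smallest total expected margin.

    candidates maps a sample name to one tuple per label:
    (p_pos, margin_if_pos, margin_if_neg). Summing over labels is an
    assumption made for this sketch.
    """
    def score(name):
        return sum(expected_margin(*t) for t in candidates[name])
    return min(candidates, key=score)


def select_by_decision_value(decision_values):
    """Baseline: pick the sample closest to the hyperplanes (smallest sum of |f(x)|)."""
    return min(decision_values,
               key=lambda n: sum(abs(v) for v in decision_values[n]))


def select_by_posterior(posteriors):
    """Baseline: pick the sample whose posteriors are closest to 0.5 overall."""
    return min(posteriors,
               key=lambda n: sum(abs(p - 0.5) for p in posteriors[n]))


if __name__ == "__main__":
    # Made-up numbers for two candidates and a single label.
    pool = {"x1": [(0.5, 1.0, 1.0)], "x2": [(0.9, 0.4, 2.0)]}
    print(select_by_expected_margin(pool))
```

In practice the two margins per candidate would come from retraining (or incrementally updating) each binary SVM with the candidate labeled +1 and then -1, which is the expensive part that this sketch takes as given.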

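Authors: 刘端阳 (Liu Duanyang), 邱卫杰 (Qiu Weijie)

Affiliation: College of Computer Science and Technology, Zhejiang University of Technology (浙江工业大学计算机科学与技术学院)

Source: 《计算机科学》 (Computer Science), indexed in CSCD and the Peking University Core Journals list, 2011, No. 4, pp. 230-232, 266 (4 pages)

Keywords: multi-label; posterior probability; expected margin; active learning; support vector machine (SVM)

Classification code: TP3 [Automation and Computer Technology: Computer Science and Technology]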
