
Active learning algorithm for SVM based on QBC (一种基于QBC的SVM主动学习算法) Cited by: 8
Abstract: Training a support vector machine (SVM) often faces two problems: the class distribution of the samples is unbalanced, and large numbers of labeled samples are hard to obtain. To address both, an SVM active learning algorithm based on query by committee (QBC), named QBC-ASVM, is proposed. It combines an improved QBC active learning method with a weighted SVM in the SVM training process. The improved QBC step actively selects for labeling the samples that are most valuable to the current SVM classifier, and the improved weighted SVM reduces the impact of the unbalanced sample distribution on the performance of SVM active learning. Experimental results show that, without loss of classification accuracy, the proposed algorithm needs far fewer labeled samples than random sampling (the passive SVM), which lowers the labeling cost of learning. The method also improves the generalization performance of the SVM and speeds up its training.
Source: Systems Engineering and Electronics (《系统工程与电子技术》), indexed in EI, CSCD, and the Peking University Core Journals list; 2015, No. 12, pp. 2865-2871 (7 pages)
Funding: Supported by the National Natural Science Foundation of China (61273275)
Keywords: active learning; support vector machine (SVM); query by committee (QBC); classification
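
The abstract names two ingredients, committee-based sample selection and class weighting, without giving their formulas. The sketch below illustrates the general idea only and is not the authors' improved QBC or their weighted SVM. It assumes vote entropy as the committee disagreement measure and inverse class frequency as the weighting rule; both are common choices, and the function names `vote_entropy`, `select_queries`, and `inverse_frequency_weights` are hypothetical.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Disagreement of a committee on one sample.

    `votes` holds the label predicted by each committee member.
    Assumption: vote entropy is used as the disagreement measure.
    """
    total = len(votes)
    counts = Counter(votes)
    return sum(-(c / total) * math.log(c / total) for c in counts.values())


def select_queries(committee_votes, batch_size):
    """Indices of the `batch_size` unlabeled samples the committee
    disagrees on most; these would be sent to the annotator."""
    scores = [vote_entropy(v) for v in committee_votes]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:batch_size]


def inverse_frequency_weights(labels):
    """Per-class penalty weights, larger for rarer classes.

    Assumption: weight = n_samples / (n_classes * class_count), so the
    minority class is penalized more heavily when misclassified.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


if __name__ == "__main__":
    # Votes of a hypothetical four-member committee on three unlabeled samples.
    votes = [
        [1, 1, 1, 1],    # unanimous: no disagreement
        [1, 1, -1, -1],  # evenly split: maximal disagreement
        [1, 1, 1, -1],   # one dissenting member
    ]
    print(select_queries(votes, batch_size=2))

    # Unbalanced labeled pool: six positive and two negative samples.
    print(inverse_frequency_weights([1] * 6 + [-1] * 2))
```

In a full active learning loop, the selected samples would be labeled, added to the training set, and the committee retrained with the class weights applied to the SVM penalty term; those steps depend on details the abstract does not give.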