PU Active Learning Algorithm with Incremental Learning Ability

Cited by: 1
Abstract: In incremental learning from positive and unlabeled (PU) samples, the initial positive samples are few, and negative examples for the different classes of positive examples are difficult to obtain, which weakens both the classification and the generalization ability of the classifier. To address this, a PU Active Learning algorithm with Incremental learning ability (I-PUAL) is proposed. It uses three support vector machines for cooperative semi-supervised learning and, at the same time, a grid-based clustering method for unsupervised learning. When the classification and clustering results disagree, active learning is introduced to label the unlabeled sample. The algorithm is applied to the online identification and classification of Deep Web entries. Experimental results show that it can exploit online unlabeled samples to improve both the accuracy of entry identification and the correctness of classification.
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2011, Issue 4, pp. 214-215, 226 (3 pages)
Keywords: PU learning; support vector machine (SVM); grid-based clustering
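
The labeling rule described in the abstract (three classifiers vote, a grid-based clustering gives a second opinion, and a disagreement triggers an active-learning query) can be illustrated with a minimal sketch. This is not the authors' implementation: the classifiers are plain callables standing in for trained SVMs, the "clustering" is simplified to the majority label of the samples already stored in a point's grid cell, and the names `grid_cell`, `DisagreementLabeler`, and `oracle` are hypothetical. Querying the oracle when a grid cell is still empty is also an assumption, since the abstract does not cover that case.

```python
from collections import Counter, defaultdict


def grid_cell(point, cell_size=1.0):
    """Map a feature vector to the index of the grid cell that contains it."""
    return tuple(int(v // cell_size) for v in point)


def majority(labels):
    """Return the most common label in a non-empty sequence."""
    return Counter(labels).most_common(1)[0][0]


class DisagreementLabeler:
    """Label samples one at a time; ask an oracle only on disagreement.

    classifiers: three callables point -> label (stand-ins for trained SVMs).
    oracle:      callable point -> label (stand-in for a human annotator).
    """

    def __init__(self, classifiers, oracle, cell_size=1.0):
        self.classifiers = classifiers
        self.oracle = oracle
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # grid cell -> labels stored so far
        self.queries = 0                # number of oracle calls made

    def add_labeled(self, point, label):
        """Store a labeled sample in its grid cell (the incremental update)."""
        self.cells[grid_cell(point, self.cell_size)].append(label)

    def label(self, point):
        # Supervised opinion: majority vote of the three classifiers.
        vote = majority([clf(point) for clf in self.classifiers])
        # Unsupervised opinion: majority label of the point's grid cell.
        seen = self.cells.get(grid_cell(point, self.cell_size))
        cluster = majority(seen) if seen else None
        if cluster is not None and cluster == vote:
            result = vote                # both views agree: accept the label
        else:
            result = self.oracle(point)  # disagreement (or empty cell): query
            self.queries += 1
        self.add_labeled(point, result)
        return result


# Toy usage with three threshold rules in place of SVMs.
rules = [lambda p: int(p[0] > 0.0),
         lambda p: int(p[0] > 0.2),
         lambda p: int(p[1] > 0.0)]
labeler = DisagreementLabeler(rules, oracle=lambda p: 1)
labeler.add_labeled((0.5, 0.5), 1)   # one initial positive sample
labeler.label((0.6, 0.4))            # vote and cell agree, no query
labeler.label((0.0, 0.5))            # vote and cell disagree, oracle is asked
```

A fuller version would retrain the three classifiers after each batch of newly labeled samples and would merge dense neighboring cells into clusters; both steps are left out here to keep the disagreement rule itself visible.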

References (9)

  • 1. Peng Tao, Zuo Wanli, He Fengling. SVM-based Adaptive Learning Method for Text Classification from Positive and Unlabeled Documents[J]. Knowledge and Information Systems, 2008, 16(3): 281-301.
  • 2. Zhang Bangzuo, Zuo Wanli. Tri-training Based Learning from Positive and Unlabeled Data[C]//Proc. of 2008 International Symposiums on Information Processing and 2008 International Pacific Workshop on Web Mining and Web-based Application. Moscow, Russia: [s. n.], 2008: 650-654.
  • 3. Zhou Zhihua, Li Ming. Tri-training: Exploiting Unlabeled Data Using Three Classifiers[J]. IEEE Transactions on Knowledge and Data Engineering, 2005, 17(11): 1529-1541.
  • 4. Wang Wei, Yang Jiong, Muntz R. STING: A Statistical Information Grid Approach to Spatial Data Mining[C]//Proc. of the 23rd Conference on VLDB. Athens, Greece: [s. n.], 1997.
  • 5. Li Kunlun, Zhang Wei, Ma Xiaotao, et al. A Novel Semi-supervised SVM Based on Tri-training[C]//Proc. of the 2nd International Symposium on Intelligent Information Technology Application. Shanghai, China: [s. n.], 2008.
  • 6. Zhang Shiming, Qin Zheng, Xu Hexiang, Xia Deyuan. Deep Web-based Educational Resource Retrieval System[J]. Computer Engineering, 2010, 36(3): 76-78. (Cited by: 1)
  • 7. Ma Jun, Song Ling, Han Xiaohui, Yan Po. Deep Web Database Classification Based on Web Page Context[J]. Journal of Software, 2008, 19(2): 267-274. (Cited by: 31)
  • 8. Wang Hui, Liu Yanwei, Zuo Wanli. Using Classifiers to Automatically Discover Domain-specific Deep Web Entries (in English)[J]. Journal of Software, 2008, 19(2): 246-256. (Cited by: 14)
  • 9. Li Zhitao, Liu Quan, Cui Zhiming, et al. A Method to Automatically Discover and Classify Deep Web Data Source Using Multi-classifier[C]//Proc. of 2009 WRI World Congress on Computer Science and Information Engineering. Los Angeles, California, USA: [s. n.], 2009.
