
Research and Development of Large Scale Hierarchical Classification Problem (大规模层次分类问题研究及其进展)

Cited by: 14
Abstract: With the development of information technology, the volume of data on the Internet has grown rapidly. To organize and manage this mass of web page information effectively, information on the Web is usually classified according to a large-scale hierarchy of concepts or topic categories, which makes these resources easier to search and access. In this process, the large-scale hierarchical classification problem concerns how to accurately assign web documents to the categories of the class hierarchy. This paper surveys that problem. First, it gives a definition of the large-scale hierarchical classification problem and analyzes the strategies for solving it. Second, it classifies the solving methods and, on the basis of that classification, introduces and compares the typical methods. Finally, it summarizes the solving methods and points out directions for future research.
Source: Chinese Journal of Computers (《计算机学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2012, Issue 10, pp. 2101-2115 (15 pages)
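Funding: National High Technology Research and Development Program of China ("863" Program) (2010AA012505, 2011AA010702, 2012AA01A401, 2012AA01A402); National Natural Science Foundation of China (60933005); National Key Technology R&D Program of China (2012BAH38B04); National 242 Information Security Program (2011A010)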
Keywords: text categorization; large scale hierarchical classification; class hierarchy; tree-structured class hierarchy


