
Building a Text Classification Platform for Scientific Research and Teaching
Abstract: To improve the efficiency of researchers and teachers working on Chinese text classification, and in view of the current state of Chinese text classification systems developed in China, this paper builds a management platform that covers the entire text classification process: preprocessing, feature selection, weight calculation, automatic classification, and evaluation of classification performance. During development, the authors applied system integration principles and methods to combine their own code with code from related open-source software. Testing showed that the system implements all functions of the automatic text classification process.
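Source: Journal of Modern Information (《现代情报》), CSSCI, Peking University Core Journal, 2015, No. 9, pp. 56-62, 78 (8 pages)
Funding: One of the research outputs of the National Natural Science Foundation of China project "Research on Multidisciplinary Collaborative Modeling Theory and Experiments for Text Classification" (Grant No. 71373291)
Keywords: text classification; MVC; corpus; training set; testing set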
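
The five stages named in the abstract (preprocessing, feature selection, weight calculation, automatic classification, evaluation) can be illustrated with a minimal sketch. The abstract does not say which algorithms the platform uses, so every choice here is an assumption made for illustration: whitespace tokenization stands in for Chinese word segmentation, document frequency is the feature selection criterion, TF-IDF is the weighting scheme, and a cosine-similarity centroid classifier does the classification. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def preprocess(text):
    """Stage 1: lowercase and split on whitespace (stand-in for real word segmentation)."""
    return text.lower().split()


def select_features(docs, k):
    """Stage 2: keep the k terms with the highest document frequency (assumed criterion)."""
    df = Counter(term for doc in docs for term in set(doc))
    ranked = sorted(df, key=lambda t: (-df[t], t))  # ties broken alphabetically
    return ranked[:k], df


def tfidf(doc, vocab, df, n_docs):
    """Stage 3: term frequency times smoothed inverse document frequency."""
    tf = Counter(doc)
    return [tf[t] * (math.log((1 + n_docs) / (1 + df[t])) + 1) for t in vocab]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def train(train_set, k=20):
    """Build one summed TF-IDF vector per class; cosine similarity ignores its scale."""
    docs = [preprocess(text) for text, _ in train_set]
    vocab, df = select_features(docs, k)
    sums = defaultdict(lambda: [0.0] * len(vocab))
    for doc, (_, label) in zip(docs, train_set):
        vec = tfidf(doc, vocab, df, len(docs))
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
    return {"vocab": vocab, "df": df, "n": len(docs), "centroids": dict(sums)}


def classify(model, text):
    """Stage 4: assign the class whose centroid is most similar to the document."""
    vec = tfidf(preprocess(text), model["vocab"], model["df"], model["n"])
    centroids = model["centroids"]
    return max(sorted(centroids), key=lambda c: cosine(vec, centroids[c]))


def evaluate(model, test_set):
    """Stage 5: accuracy on a held-out testing set."""
    correct = sum(classify(model, text) == label for text, label in test_set)
    return correct / len(test_set)


# Toy training and testing sets (invented for this sketch).
TRAIN = [
    ("stock market shares rise", "finance"),
    ("bank interest rate market", "finance"),
    ("football match team goal", "sports"),
    ("team wins basketball match", "sports"),
]
TEST = [("market rate falls", "finance"), ("team scores goal", "sports")]

model = train(TRAIN)
print(classify(model, "market rate falls"), evaluate(model, TEST))
```

Keeping each stage as a separate function mirrors the idea of a platform in which one step can be swapped out (for example, a different feature selection criterion) without touching the others. For Chinese text, `preprocess` would need a real word segmenter in place of the whitespace split.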
