
Commentary of Corpus Software System Based on Network Retrieval (cited by: 3)
Abstract Corpus research based on network retrieval always begins with the development of a corpus software system. Such a system is the foundation for research in corpus linguistics, machine translation, language teaching, and lexicography, and its quality determines both the scale of the corpus that can be built and the quality of the research results. The key stages in building a large-scale corpus software system are: document extraction; metadata creation; part-of-speech, syntactic, and error annotation; and indexing, retrieval, and statistical analysis. For each of these stages, we collected a large number of corpus development packages from abroad and tested them programmatically, then analyzed and reviewed them in terms of their underlying theory and methods, execution efficiency, accuracy, robustness, practicality, and support for Chinese. We hope the review will serve as a reference for building large-scale corpus software systems in China.
Source Information Science (《情报科学》), CSSCI, Peking University Core Journal, 2014, No. 11, pp. 147-151 (5 pages)
Keywords corpus; network retrieval; corpus software system; corpus tagging
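
The last stage named in the abstract (indexing, retrieval, and statistical analysis) can be illustrated with a minimal Python sketch. It is not taken from the paper or from any of the packages it reviews: the toy documents, the function names, and the simple regular-expression tokenizer are all assumptions made for illustration. The tokenizer only handles unaccented Latin letters; Chinese text would need a word segmenter in its place, which is one reason the review treats Chinese support as a separate criterion.

```python
import re
from collections import Counter, defaultdict

# Toy corpus; the document names and texts are made up for illustration.
DOCS = {
    "doc1": "The corpus software system indexes every document.",
    "doc2": "A corpus is searched through the index of the system.",
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def build_index(docs):
    """Indexing: map each token to a list of (doc_id, position) postings."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        for pos, token in enumerate(tokenize(text)):
            index[token].append((doc_id, pos))
    return index

def concordance(docs, index, word, width=2):
    """Retrieval: keyword-in-context hits with `width` tokens on each side."""
    lines = []
    for doc_id, pos in index.get(word.lower(), []):
        tokens = tokenize(docs[doc_id])
        left = " ".join(tokens[max(0, pos - width):pos])
        right = " ".join(tokens[pos + 1:pos + 1 + width])
        lines.append((doc_id, left, tokens[pos], right))
    return lines

def frequencies(docs):
    """Statistics: count token frequencies over the whole corpus."""
    counts = Counter()
    for text in docs.values():
        counts.update(tokenize(text))
    return counts

INDEX = build_index(DOCS)
```

Calling `concordance(DOCS, INDEX, "system")` returns one tuple per hit, and `frequencies(DOCS).most_common(5)` gives a simple word-frequency list. A production system would store the postings on disk and attach the metadata and annotation layers produced by the earlier stages.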

