Construction of a Web-Based Corpus (cited by: 2)

A Preliminary Research on the Construction of Web Corpus 
Abstract: As the internet becomes increasingly popular in China and the volume of Chinese-language information on the WWW keeps growing, techniques for automatically collecting data for online Chinese corpora matter more than ever. Developing and improving these techniques is of great significance for raising the level of Chinese information processing. This paper discusses and analyzes the implementation principles and key technologies of collecting Chinese corpus data from the Web, and introduces the principles of Web-based network communication and of automatically obtaining information online. It also discusses word segmentation in Chinese information processing and its development, and proposes an implementation scheme for collecting an online corpus of the People's Daily (《人民日报》).
Source: Journal of Changshu College (《常熟高专学报》), 2000, No. 2, pp. 81-85 (5 pages)
Keywords: Web; corpus; word segmentation; Chinese information processing; collection technology