Abstract
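One of the main research topics in corpus linguistics is the grammatical analysis, at different levels, of the texts in a constructed corpus. Grammatical analysis generally consists of two levels: part-of-speech analysis (Part-of-Speech Analyzing) and syntactic analysis (Parsing). Part-of-speech analysis usually involves two processes: (i) introducing ambiguity (that is, the lexical analysis process); (ii) eliminating ambiguity (that is, the process of ruling out illegitimate choices).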
A very important aspect of corpus linguistics is performing part-of-speech analysis of the corpus and tagging it. This is the key to making the corpus machine-readable and increasing its practical value. In this paper, we present in detail a part-of-speech analysis method based on data-driven techniques. We hope this introduction will be helpful to other researchers in this field.
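The two processes of part-of-speech analysis named in the abstract can be illustrated with a minimal sketch. It is not the method of the paper: the toy tagged corpus, the tag names (DET, NOUN, VERB, PRON, AUX), the add-one smoothing, and all function names are assumptions chosen for illustration. Step (i) looks up every candidate tag for each word, and step (ii) uses counts gathered from the tagged data to pick the most probable tag sequence with a bigram Viterbi search.

```python
from collections import defaultdict
import math

# Hypothetical toy training data (not from the paper): tagged sentences.
TAGGED = [
    [("the", "DET"), ("can", "NOUN"), ("fell", "VERB")],
    [("they", "PRON"), ("can", "AUX"), ("fish", "VERB")],
    [("the", "DET"), ("fish", "NOUN"), ("swim", "VERB")],
    [("they", "PRON"), ("fish", "VERB")],
]

def train(tagged):
    """Collect the lexicon and the counts that drive disambiguation."""
    lexicon = defaultdict(set)    # word -> set of candidate tags
    emit = defaultdict(int)       # (tag, word) -> count
    trans = defaultdict(int)      # (previous tag, tag) -> count
    tag_count = defaultdict(int)  # tag -> count, including the start marker "<s>"
    for sent in tagged:
        prev = "<s>"
        tag_count[prev] += 1
        for word, tag in sent:
            lexicon[word].add(tag)
            emit[(tag, word)] += 1
            trans[(prev, tag)] += 1
            tag_count[tag] += 1
            prev = tag
    return lexicon, emit, trans, tag_count

def introduce_ambiguity(words, lexicon):
    """Step (i): list every candidate tag for each word (all tags if unknown)."""
    all_tags = sorted({t for tags in lexicon.values() for t in tags})
    return [sorted(lexicon[w]) if w in lexicon else all_tags for w in words]

def disambiguate(words, candidates, emit, trans, tag_count):
    """Step (ii): choose the best tag sequence with a bigram Viterbi search."""
    n_tags = len(tag_count)
    vocab = len({w for (_, w) in emit})

    def log_trans(prev, tag):  # add-one smoothed transition probability
        return math.log((trans.get((prev, tag), 0) + 1) / (tag_count.get(prev, 0) + n_tags))

    def log_emit(tag, word):   # add-one smoothed emission probability
        return math.log((emit.get((tag, word), 0) + 1) / (tag_count.get(tag, 0) + vocab + 1))

    best = {"<s>": (0.0, [])}  # last tag -> (log score, tag path so far)
    for word, tags in zip(words, candidates):
        new_best = {}
        for tag in tags:
            score, path = max(
                ((s + log_trans(p, tag) + log_emit(tag, word), pth)
                 for p, (s, pth) in best.items()),
                key=lambda item: item[0],
            )
            new_best[tag] = (score, path + [tag])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]

LEXICON, EMIT, TRANS, TAG_COUNT = train(TAGGED)

def tag(words):
    """Run both steps and return one tag per word."""
    candidates = introduce_ambiguity(words, LEXICON)
    return disambiguate(words, candidates, EMIT, TRANS, TAG_COUNT)

if __name__ == "__main__":
    for sentence in (["they", "can", "fish"], ["the", "can", "fell"]):
        print(sentence, introduce_ambiguity(sentence, LEXICON), tag(sentence))
```

In this sketch the word "can" receives two candidate tags in step (i), and the surrounding context decides between them in step (ii). A realistic system would train on a large annotated corpus and handle unknown words more carefully.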
Source
《计算机科学》
CSCD
Peking University Core Journals
1999, Issue 1, pp. 69-74 (6 pages)
Computer Science
Funding
National Natural Science Foundation of China
Doctoral Program Special Fund of the State Education Commission
Keywords
Corpus
Part-of-speech analysis
Language information processing
Data-driven
Corpus linguistics, Data-driven, Part-of-speech analysis, Part-of-speech disambiguation