自然语言处理技术的三个里程碑 (Three Milestones of Natural Language Processing Technology) Cited by: 20

Milestones of natural language processing technology
Abstract: This paper briefly discusses the major findings and developments in Natural Language Processing (NLP) over the past 50 years. Research in this period produced two important insights and three major achievements. The insights, drawn from corpus investigation, are: (1) phrase structure grammar (PSG) rules based on single labels are not sufficient for syntactic analysis; and (2) PSG rules have a severely skewed distribution in real text, i.e. a limited number of PSG rules cannot cover the grammatical phenomena found in a large corpus, which is contrary to what most linguists had expected. The development of NLP technology has been shaped to a large extent by these two facts. In this sense, the milestones of the field are: (1) complex feature sets and unification-based grammars; (2) lexicalism in linguistic research; and (3) corpus-based approaches and Statistical Language Modeling (SLM). The latest investigations show that developing and automatically acquiring large-scale linguistic knowledge is the bottleneck of NLP technology; therefore, corpus construction and statistical learning theory have become key issues in NLP research and application.
Affiliation: Microsoft Research Asia
Source: Foreign Language Teaching and Research (《外语教学与研究》), CSSCI, Peking University Core Journal, 2002, No. 3, pp. 180-187 (8 pages)