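Summary
The traditional approach to natural language processing is "grammar rules + dictionary", but many word combinations in a language cannot be described by grammar rules, or can be described only with great difficulty. Entering these combinations into the lexicon as whole units simplifies the grammar and thereby reduces the complexity of the system. Based on this idea of "large lexicon, small grammar", this paper argues for the necessity of building a phrase information database for Modern Chinese and briefly introduces the construction method, the inclusion principles, and the database itself.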
Abstract The language knowledge encoded in a natural language processing system is normally composed of two parts: a lexicon and rules. Traditionally, the lexicon is considered to be a collection of words. But many word combinations cannot be interpreted or generated according to this model. Therefore, extending the lexicon to include fixed and semi-fixed phrases will help to improve the robustness of the system and simplify the rule system. This paper discusses the necessity of building a large phrasal lexicon from several aspects of language acquisition and natural language processing. The main principles, methods and information descriptions are also outlined in this paper.
Source
《术语标准化与信息技术》
1998, No. 2, pp. 26-31 (6 pages)
Terminology Standardization & Information Technology