Abstract
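Word chains are where the difficulty of automatic segmentation of written Chinese lies. To address the complexity of word chains, this paper proposes a "generate-test" segmentation method. The method is knowledge-based: by making the dictionary dynamic, distributing the segmentation knowledge, and having the segmentation system work cooperatively with the syntactic-semantic system, it achieves effective segmentation of word chains and lets the segmentation and the understanding (generation of case structures) of Chinese sentences proceed in parallel. The "generate-test" method reflects the way people segment and understand text.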
Automatic segmentation of written Chinese is a difficult problem in Chinese language understanding. We argue that the difficulty lies in the phenomenon of word chains (WCs), which are specific to written Chinese. Many existing methods of automatic segmentation cannot handle WCs. To handle the complexity of WCs, this article describes a knowledge-based method of automatic segmentation called 'Produce-Test'. In contrast to existing methods, it has three distinguishing features: ① a dynamic dictionary; ② distributed storage of the various kinds of knowledge useful for automatic segmentation; ③ a segmentation subsystem that works cooperatively with the syntactic-semantic subsystem. The discrimination rate for Chinese homonyms is 96%.
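As a rough illustration of the generate-then-test idea summarized above, the following Python sketch enumerates every dictionary-consistent split of an ambiguous character string and then filters the candidates with a test function. It is not the authors' system: the lexicon, the `COMPATIBLE` pair table, and the function names `generate`, `toy_test`, and `segment` are all hypothetical, and the pair table is only a toy stand-in for the syntactic-semantic subsystem the abstract mentions.

```python
from typing import Callable, Iterator, List, Set, Tuple

# Hypothetical toy lexicon; a real dynamic dictionary would be far larger.
LEXICON: Set[str] = {"研究", "研究生", "生命", "命", "起源"}

# Hypothetical table of adjacent word pairs that the toy test accepts.
# It stands in for syntactic-semantic checking and is an assumption.
COMPATIBLE: Set[Tuple[str, str]] = {("研究", "生命"), ("生命", "起源")}


def generate(text: str, lexicon: Set[str], start: int = 0) -> Iterator[List[str]]:
    """Generate step: yield every split of text[start:] into lexicon words."""
    if start == len(text):
        yield []
        return
    for end in range(start + 1, len(text) + 1):
        word = text[start:end]
        if word in lexicon:
            for rest in generate(text, lexicon, end):
                yield [word] + rest


def toy_test(words: List[str]) -> bool:
    """Test step: accept a candidate only if all adjacent pairs are compatible."""
    return all(pair in COMPATIBLE for pair in zip(words, words[1:]))


def segment(text: str, lexicon: Set[str],
            test: Callable[[List[str]], bool]) -> List[List[str]]:
    """Keep only the generated candidates that pass the test."""
    return [cand for cand in generate(text, lexicon) if test(cand)]


candidates = list(generate("研究生命起源", LEXICON))
accepted = segment("研究生命起源", LEXICON, toy_test)
```

For the string 研究生命起源, the generate step produces two candidates (研究/生命/起源 and 研究生/命/起源), and the toy test keeps only the first. In a system like the one described, the test would presumably draw on syntactic and semantic knowledge rather than a fixed pair table.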
Source
《中文信息学报》
CSCD
1989, No. 4, pp. 42-49 (8 pages)
Journal of Chinese Information Processing