Abstract
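This paper discusses automatic word segmentation of speech streams and text streams in natural language understanding. It constructs a layered model of Chinese language understanding, and proposes restricting the feedback information to its simplest form so that the segmentation layer is independent of semantics. It also proposes three strategies for ordering word strings: by likelihood, by running time, and by a combination of the two. The paper introduces FWF, a segmentation method with very high segmentation accuracy, and gives an implementation algorithm and experimental results. The FWF segmentation method has already been used in sentence-level keyboard input, voice input, and handwritten Chinese character input systems.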
This paper discusses the problem of Separating Syllables and Characters into Words (SSCW) in natural language, and constructs a model of natural language understanding in order to define the role and level of SSCW within it. It then presents the idea of reducing the feedback information to its simplest form, so that SSCW is independent of the meaning of words and an awkward circle of cause and effect is avoided. SSCW consists of word matching and word string arranging. There are three strategies for arranging word strings: (1) in order of possibility, (2) in order of running time, and (3) a compromise between the two. All popular SSCW algorithms to date can be derived from the second strategy, and the best algorithm derived from strategy 3 is given in this paper.
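The split into word matching and word string arranging can be illustrated with a minimal sketch. It is not the FWF method, which the abstract does not specify. The lexicon, its frequencies, and the function names `match_words` and `arrange_by_possibility` are hypothetical, and the sum of log-frequencies is only an assumed stand-in for "possibility".

```python
from math import log

# Hypothetical toy lexicon with made-up relative frequencies (not from the paper).
LEXICON = {
    "研究": 0.4,
    "研究生": 0.2,
    "生命": 0.3,
    "命": 0.05,
    "的": 0.9,
    "起源": 0.3,
}


def match_words(text, lexicon):
    """Word matching: list every way to cover `text` with lexicon words."""
    if not text:
        return [[]]  # exactly one way to segment the empty string
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in lexicon:
            # Extend each segmentation of the remainder with the matched word.
            for tail in match_words(text[end:], lexicon):
                results.append([head] + tail)
    return results


def arrange_by_possibility(candidates, lexicon):
    """Word string arranging, in possibility order (assumed scoring).

    Each candidate is scored by the sum of log-frequencies of its words,
    and the highest-scoring candidate comes first.
    """
    def score(words):
        return sum(log(lexicon[w]) for w in words)

    return sorted(candidates, key=score, reverse=True)


if __name__ == "__main__":
    candidates = match_words("研究生命的起源", LEXICON)
    for words in arrange_by_possibility(candidates, LEXICON):
        print(" / ".join(words))
```

Ordering by running time, or by a compromise between the two, would replace only the sort key in this sketch; the matching step would stay the same. The exhaustive recursion is exponential in the worst case and is meant for illustration only.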
Source
《中文信息学报》
CSCD
1991, No. 3, pp. 48-58 (11 pages)
Journal of Chinese Information Processing
Funding
Supported by the National 863 High-Technology Program