Abstract
This paper briefly introduces the characteristics of the Chinese language and the concept of word segmentation, describes the commonly used segmentation algorithms in detail, and on that basis proposes an improved Maximum Matching (MM) algorithm for Chinese word segmentation. The algorithm combines the advantages of the forward Maximum Matching method (MM) and the Reverse Maximum Matching method (RMM) while overcoming their shortcomings, so that both segmentation accuracy and segmentation efficiency improve markedly, making it a practical segmentation algorithm. Experiments further confirm that the algorithm effectively improves both segmentation accuracy and efficiency.
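For readers unfamiliar with the two baseline methods named in the abstract, the following is a minimal sketch of textbook forward maximum matching (MM) and reverse maximum matching (RMM). It is not the improved algorithm proposed in the paper, whose details are not given here. The function names, the `max_len` parameter, the toy dictionary, and the sample sentence are all illustrative assumptions.

```python
def forward_max_match(text, dictionary, max_len=4):
    """Textbook MM: scan left to right, taking the longest dictionary word each time."""
    words, i = [], 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or piece in dictionary:
                words.append(piece)
                i += size
                break
    return words


def reverse_max_match(text, dictionary, max_len=4):
    """Textbook RMM: scan right to left, taking the longest dictionary word each time."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in dictionary:
                words.append(piece)
                j -= size
                break
    words.reverse()  # words were collected from the end of the sentence
    return words


# Hypothetical toy dictionary and sentence, chosen so the two scans disagree.
toy_dictionary = {"研究", "研究生", "生命", "的", "起源"}
sentence = "研究生命的起源"

print(forward_max_match(sentence, toy_dictionary))
print(reverse_max_match(sentence, toy_dictionary))
```

On this toy input the forward scan greedily takes "研究生" first and leaves a stranded "命", while the reverse scan recovers "研究 / 生命 / 的 / 起源". Disagreements of this kind between the two scan directions are the sort of ambiguity that a method combining MM and RMM would need to resolve.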
Source
Computer & Network (《计算机与网络》)
2009, No. 2, pp. 48-50, 54 (4 pages)
Keywords
natural language processing (自然语言处理)
Chinese word segmentation (中文分词)
improved maximum matching method (改进的最大匹配法)