
An Improved Maximum Matching Method for Chinese Word Segmentation (一种改进的MM中文分词算法)
Abstract: This paper briefly introduces the characteristics of Chinese and the concept of word segmentation, describes the commonly used segmentation algorithms in detail, and on that basis proposes an improved Maximum Matching (MM) algorithm for Chinese word segmentation. The algorithm combines the advantages of the forward Maximum Matching method (MM) and the Reverse Maximum Matching method (RMM) while overcoming their shortcomings, so that both segmentation accuracy and segmentation efficiency improve markedly, making it a fairly practical segmentation algorithm. Experiments further confirm that the algorithm effectively improves segmentation accuracy and efficiency.
Source: Computer & Network (《计算机与网络》), 2009, No. 2, pp. 48-50, 54 (4 pages in total)
Keywords: natural language processing; Chinese word segmentation; improved maximum matching method
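For readers unfamiliar with the two baseline methods named in the abstract, the following is a minimal sketch of classic forward maximum matching (MM) and reverse maximum matching (RMM). It is not the improved algorithm of the paper, whose details the abstract does not give. The function names, the `vocab` dictionary, the `max_len` window and the sample sentence are all illustrative assumptions.

```python
def forward_max_match(text, vocab, max_len=4):
    """Classic MM: scan left to right, taking the longest dictionary word."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink it until a dictionary
        # hit; a single character is always accepted as a fallback.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


def reverse_max_match(text, vocab, max_len=4):
    """Classic RMM: scan right to left, taking the longest dictionary word."""
    tokens = []
    j = len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                j -= size
                break
    tokens.reverse()  # words were collected from the end of the sentence
    return tokens


# Hypothetical toy dictionary and sentence, chosen so the two scans disagree.
vocab = {"研究", "研究生", "生命", "的", "起源"}
sentence = "研究生命的起源"

mm_result = forward_max_match(sentence, vocab)
rmm_result = reverse_max_match(sentence, vocab)
print(mm_result)   # ['研究生', '命', '的', '起源']
print(rmm_result)  # ['研究', '生命', '的', '起源']
```

On this toy input the forward scan greedily takes 研究生 and strands 命, while the reverse scan recovers 研究 / 生命. Disagreements of this kind are the sort of shortcoming a method combining MM and RMM would need to resolve.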
