
Research on Repetitive Word Segmentation Existed in Omni-Word Segmentation for Written Chinese (书面汉语全切分中的重复切分研究)
Abstract: This paper studies repetitive word segmentation, a problem that is widespread in omni-word segmentation (exhaustive segmentation) of written Chinese. It first defines repetitive segmentation, then shows that segmentation ambiguity is its inevitable cause, which makes repetitive segmentation both unavoidable and universal in omni-word segmentation. It also discusses two alternative schemes for overcoming repetitive segmentation. Finally, it reports experiments on how often repetitive segmentation occurs in omni-word segmentation and how it affects segmentation time. The results show that repetitive segmentation accounts for about 87% of omni-word segmentation, and that eliminating it reduces segmentation time by about 84%.
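The link between ambiguity and repeated work can be illustrated with a small sketch. The abstract does not reproduce the paper's formal definition, so the code below assumes that "repetitive segmentation" means the same remaining substring being segmented again along different ambiguous branches. The dictionary `TOY_DICT`, the sample sentence, and the functions `segment_naive` and `segment_memo` are hypothetical and chosen only for illustration. Memoization is used here as one generic way to avoid the recomputation; the abstract does not say whether it matches either of the two schemes the paper discusses.

```python
from functools import lru_cache

# Hypothetical toy dictionary, for illustration only (not from the paper).
TOY_DICT = frozenset({"研究", "研究生", "生命", "生", "命", "的", "起源", "起", "源"})


def segment_naive(text, words, counter):
    """Enumerate every dictionary segmentation of `text`.

    `counter` records how many times each remaining substring is segmented,
    which exposes the repeated work caused by ambiguous prefixes.
    """
    counter[text] = counter.get(text, 0) + 1
    if not text:
        return [[]]
    results = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if head in words:
            for tail in segment_naive(text[end:], words, counter):
                results.append([head] + tail)
    return results


def segment_memo(text, words):
    """Same enumeration, but each start position is segmented only once."""
    calls = {}

    @lru_cache(maxsize=None)
    def go(start):
        # The body runs once per start position; later requests hit the cache.
        calls[start] = calls.get(start, 0) + 1
        if start == len(text):
            return ((),)
        out = []
        for end in range(start + 1, len(text) + 1):
            head = text[start:end]
            if head in words:
                for tail in go(end):
                    out.append((head,) + tail)
        return tuple(out)

    return [list(seg) for seg in go(0)], calls


if __name__ == "__main__":
    sentence = "研究生命的起源"
    counter = {}
    naive = segment_naive(sentence, TOY_DICT, counter)
    memo, calls = segment_memo(sentence, TOY_DICT)
    print("segmentations:", len(naive))
    print("naive calls:", sum(counter.values()), "distinct substrings:", len(counter))
    print("memoized calls:", sum(calls.values()))
```

In this toy case the prefixes "研究" and "研究生" are both dictionary words, so the naive enumeration reaches the same suffixes by more than one path and segments them again each time. The memoized version produces the same set of segmentations while segmenting each suffix once.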
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list, 2006, No. 3, pp. 520-523 (4 pages)
Keywords: omni-word segmentation; repetitive word segmentation; natural language processing (NLP)