
Solving the Pinyin-to-Chinese-Character Conversion Problem Based on Hybrid Word Lattice (基于混合字词网格的汉语音字转换问题的求解)

Cited by: 5
Abstract: Pinyin-to-Chinese-character conversion underlies Chinese keyboard input, Chinese speech recognition, and Chinese information processing, and it remains a very challenging problem. This paper first reviews the state of the art in Pinyin-to-Chinese-character conversion and analyzes its open problems. It then proposes a conversion method based on a hybrid word lattice, a lattice that mixes character and word candidates, and presents the architecture of a system that implements it. The paper studies issues related to the hybrid 2-gram language model and the algorithms for solving the lattice, and finally discusses how the automatic prediction and system learning functions are implemented. A prototype system built on this approach is compared with the Microsoft Pinyin input system on Windows XP; the experimental results show a significant improvement in the accuracy of automatic Pinyin-to-Chinese-character conversion.
Author: Zhang Sen (章森)
Source: Chinese Journal of Computers (《计算机学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2007, No. 7, pp. 1145-1153 (9 pages)
Funding: Supported by the National Natural Science Foundation of China (60572125)
Keywords: Pinyin-to-Chinese-character conversion; n-gram language model; Markov model; word lattice; user behavior
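
The abstract names two ingredients, a hybrid 2-gram model and an algorithm for solving the lattice, without giving their details. As a rough illustration only, the sketch below shows a generic Viterbi-style dynamic program over a lattice whose edges are either single characters or multi-character words, scored with a bigram model. It is not the paper's algorithm. The lexicon, the probabilities, the `FLOOR` constant, and the function names `build_lattice` and `decode` are all made up for this example.

```python
import math
from collections import defaultdict

# Toy lexicon (hypothetical): syllable tuples -> candidate characters or words.
# Mixing single characters and words is what makes the lattice "hybrid" here.
LEXICON = {
    ("zhong",): ["中", "种"],
    ("guo",): ["国", "过"],
    ("ren",): ["人", "认"],
    ("zhong", "guo"): ["中国"],
    ("guo", "ren"): ["国人"],
}

# Toy bigram probabilities P(cur | prev), invented for illustration only.
BIGRAM = {
    ("<s>", "中国"): 0.5,
    ("中国", "人"): 0.4,
    ("<s>", "中"): 0.2,
    ("中", "国人"): 0.1,
    ("中", "国"): 0.3,
    ("国", "人"): 0.2,
}
FLOOR = 1e-6  # crude stand-in for smoothing of unseen bigrams


def build_lattice(syllables, lexicon=LEXICON):
    """Return edges[start] = list of (end, unit) covering syllables[start:end]."""
    edges = defaultdict(list)
    n = len(syllables)
    for start in range(n):
        for end in range(start + 1, n + 1):
            for unit in lexicon.get(tuple(syllables[start:end]), []):
                edges[start].append((end, unit))
    return edges


def decode(syllables, lexicon=LEXICON, bigram=BIGRAM):
    """Viterbi-style search for the highest-scoring path through the lattice."""
    n = len(syllables)
    edges = build_lattice(syllables, lexicon)
    # best[pos][unit] = (log score, backpointer) for paths ending in `unit` at `pos`
    best = [dict() for _ in range(n + 1)]
    best[0]["<s>"] = (0.0, None)
    for start in range(n):
        for prev, (score, _) in best[start].items():
            for end, unit in edges[start]:
                cand = score + math.log(bigram.get((prev, unit), FLOOR))
                if unit not in best[end] or cand > best[end][unit][0]:
                    best[end][unit] = (cand, (start, prev))
    if not best[n]:
        return None  # no path covers the whole syllable sequence
    # Pick the best final unit, then follow backpointers to the start symbol.
    unit = max(best[n], key=lambda u: best[n][u][0])
    pos, out = n, []
    while unit != "<s>":
        out.append(unit)
        pos, unit = best[pos][unit][1]
    return "".join(reversed(out))


print(decode(["zhong", "guo", "ren"]))
```

With these toy numbers the path 中国 followed by 人 outscores the character-by-character paths, so the call returns 中国人. A real system would estimate the bigram table from a corpus and use proper smoothing in place of the fixed floor.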
