
Incorporating Linguistic Rules in Statistical Chinese Language Model for Pinyin-to-character Conversion (cited by: 2)

Abstract: An N-gram Chinese language model incorporating linguistic rules is presented. By constructing an element lattice, rule information is incorporated into the statistical framework. To facilitate the hybrid modeling, novel methods such as MI-based rule evaluation, weighted rule quantification, and element-based n-gram probability approximation are presented. A dynamic Viterbi algorithm is adopted to search for the best path in the lattice. To strengthen the model, transformation-based error-driven rule learning is adopted. Applying the proposed model to Chinese Pinyin-to-character conversion achieves high performance in accuracy, flexibility, and robustness simultaneously. Tests show that the correct rate reaches 94.81%, compared with 90.53% using a bi-gram Markov model alone. Many long-distance dependencies and recursive structures in language can be processed effectively.
Source: High Technology Letters (高技术通讯, English edition), indexed in EI and CAS; 2001, No. 2, pp. 8-13 (6 pages)
Funding: the High Technology Research and Development Programme of China
Keywords: Chinese Pinyin-to-character conversion, Rule-based language model, N-gram language model, Hybrid language model, Element lattice, Transformation-based error-driven learning
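
As a rough illustration of the bi-gram baseline and lattice search that the abstract mentions, the sketch below runs a plain Viterbi search over a lattice of candidate characters for each Pinyin syllable. It is not the paper's hybrid model: no linguistic rules, element lattice, or rule weighting are included, and the function name `viterbi_convert`, the candidate table, and all probabilities are hypothetical toy values chosen only for this example.

```python
import math

# Hypothetical toy lattice: candidate characters for each Pinyin syllable.
CANDIDATES = {
    "zhong": ["中", "种"],
    "guo": ["国", "果"],
}

# Hypothetical bi-gram probabilities P(current | previous); "<s>" marks sentence start.
BIGRAM = {
    ("<s>", "中"): 0.4,
    ("<s>", "种"): 0.6,
    ("中", "国"): 0.7,
    ("中", "果"): 0.01,
    ("种", "国"): 0.01,
    ("种", "果"): 0.3,
}


def viterbi_convert(syllables, candidates, bigram, floor=1e-6):
    """Return the most probable character string for a list of Pinyin syllables."""
    # best maps the last character of a partial path to (log-probability, path).
    best = {"<s>": (0.0, [])}
    for syllable in syllables:
        step = {}
        for ch in candidates[syllable]:
            # Keep only the best-scoring predecessor for each candidate character.
            score, path = max(
                (
                    (prev_score + math.log(bigram.get((prev, ch), floor)), prev_path)
                    for prev, (prev_score, prev_path) in best.items()
                ),
                key=lambda item: item[0],
            )
            step[ch] = (score, path + [ch])
        best = step
    # Pick the best complete path at the end of the lattice.
    return "".join(max(best.values(), key=lambda item: item[0])[1])


if __name__ == "__main__":
    print(viterbi_convert(["zhong", "guo"], CANDIDATES, BIGRAM))  # 中国
```

In this toy lattice the first syllable alone favors 种 (0.6 against 0.4), but the full path 中国 scores 0.4 × 0.7 = 0.28 against 0.6 × 0.3 = 0.18 for 种果, so the search returns 中国. Unseen bi-grams fall back to the assumed `floor` probability, a crude stand-in for proper smoothing.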


