Journal Articles
2 articles found
1. Incorporating Linguistic Rules in Statistical Chinese Language Model for Pinyin-to-Character Conversion (cited by: 2)
Authors: Liu Bingquan (刘秉权), Wang Xiaolong, Wang Yuying
High Technology Letters (EI, CAS), 2001, No. 2, pp. 8-13 (6 pages)
Abstract: An N-gram Chinese language model incorporating linguistic rules is presented. By constructing an element lattice, rule information is incorporated into the statistical framework. To facilitate the hybrid modeling, novel methods such as MI-based rule evaluation, weighted rule quantification, and element-based n-gram probability approximation are presented. A dynamic Viterbi algorithm is adopted to search for the best path in the lattice. To strengthen the model, transformation-based error-driven rule learning is adopted. Applying the proposed model to Chinese Pinyin-to-character conversion, high performance is achieved in accuracy, flexibility, and robustness simultaneously. Tests show the correct rate reaches 94.81%, compared with 90.53% for a bi-gram Markov model alone. Many long-distance dependencies and recursions in language can be processed effectively.
Keywords: Chinese pinyin-to-character conversion; Rule-based language model; N-gram language model; Hybrid language model; Element lattice; Transformation-based error-driven learning
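The first abstract searches a candidate lattice with a bigram Viterbi pass. A minimal sketch of that decoding step in Python, using a toy candidate table and toy bigram probabilities (all names and numbers are illustrative assumptions, not values from the paper):

```python
import math

# Toy candidate table: pinyin syllable -> candidate characters
# (hypothetical data, not from the paper)
CANDIDATES = {
    "zhong": ["中", "种"],
    "guo": ["国", "锅"],
}

# Toy bigram probabilities P(next | prev); "<s>" marks sentence start
BIGRAM = {
    ("<s>", "中"): 0.6, ("<s>", "种"): 0.4,
    ("中", "国"): 0.9, ("中", "锅"): 0.1,
    ("种", "国"): 0.2, ("种", "锅"): 0.8,
}

def viterbi(pinyin_seq):
    """Return the most probable character sequence for the pinyin input."""
    # Each state maps a candidate character to (best log-prob, best path)
    prev = {"<s>": (0.0, [])}
    for syllable in pinyin_seq:
        cur = {}
        for ch in CANDIDATES[syllable]:
            # Extend every surviving path by ch; keep only the best one
            cur[ch] = max(
                (lp + math.log(BIGRAM.get((p, ch), 1e-8)), path + [ch])
                for p, (lp, path) in prev.items()
            )
        prev = cur
    return max(prev.values())[1]

print(viterbi(["zhong", "guo"]))  # ['中', '国']
```

The paper's element lattice additionally merges rule-derived elements into the lattice nodes; this sketch shows only the plain bigram search the hybrid model builds on.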
2. Research of Pinyin-to-Character Conversion Based on Maximum Entropy Model (cited by: 1)
Authors: Zhao Yan, Wang Xiaolong, Liu Bingquan, Guan Yi
Journal of Electronics (China), 2006, No. 6, pp. 864-869 (6 pages)
Abstract: This paper applies a Maximum Entropy (ME) model to Pinyin-To-Character (PTC) conversion instead of a Hidden Markov Model (HMM), which cannot include complicated and long-distance lexical information. Two ME models were built based on simple and complex templates respectively, and the complex one gave better conversion results. Furthermore, the conversion trigger pair y_A → y_B c_B was proposed to extract long-distance constraint features from the corpus; Average Mutual Information (AMI) was then used to select the conversion trigger pair features added to the ME model. Experiments show that the conversion error of the ME model with conversion trigger pairs is reduced by 4% on a small training corpus, compared with an HMM smoothed by absolute smoothing.
Keywords: Pinyin-to-character (PTC) conversion; Maximum Entropy (ME) model; Hidden Markov Model (HMM); Conversion trigger pair; Average Mutual Information (AMI)
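The second abstract selects long-distance trigger-pair features by Average Mutual Information (AMI). A minimal sketch of one standard AMI decomposition over the four joint outcomes of two binary occurrence events (the probabilities below are toy values, not from the paper):

```python
import math

def average_mutual_info(p_ab, p_a, p_b):
    """AMI between events A and B, summed over the four joint outcomes
    (A and B, A without B, B without A, neither)."""
    def term(p_joint, p_x, p_y):
        # Contribution p(x, y) * log(p(x, y) / (p(x) * p(y)))
        return p_joint * math.log(p_joint / (p_x * p_y)) if p_joint > 0 else 0.0
    return (
        term(p_ab, p_a, p_b)
        + term(p_a - p_ab, p_a, 1 - p_b)
        + term(p_b - p_ab, 1 - p_a, p_b)
        + term(1 - p_a - p_b + p_ab, 1 - p_a, 1 - p_b)
    )

# A pinyin/character pair that always co-occur scores higher than an
# independent pair, so only the former would be kept as a trigger feature.
print(average_mutual_info(0.5, 0.5, 0.5))   # ≈ 0.693 (perfectly correlated)
print(average_mutual_info(0.25, 0.5, 0.5))  # 0.0 (independent)
```

In the paper's setting, candidate trigger pairs would be ranked by this score and only those above a threshold added as ME features; the threshold and corpus statistics are not given in the abstract.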