Abstract
In research on Chinese speech retrieval, a new unit for building Chinese language models, the word fragment, is proposed to take full advantage of linguistic knowledge about how adjacent syllables combine in Chinese, and an algorithm for selecting the best word fragments is studied. Chinese speech recognition and speech retrieval experiments show that, with a language model based on word fragments, the recognizer's syllable accuracy improves and the speech retrieval system performs better than one using only a syllable-based model.
Source
《通信学报》
EI
CSCD
Peking University Core Journal (北大核心)
2009, No. 3, pp. 84-88 (5 pages)
Journal on Communications
Funding
National Basic Research Program of China ("973" Program) (2007CB311100)
National Natural Science Foundation of China (60575030)
Keywords
Chinese speech retrieval
language model
word fragment
mutual information
lattice