Abstract
Building on an analysis of the rule-based and statistical language models used in speech recognition systems, this paper proposes a hybrid language model that quantifies grammar rules and integrates them with a Markov language model. The hybrid model avoids the inability of rule-based grammars to cope with very large bodies of real text, and it improves on the statistical model's handling of long-distance constraints and the recursive nature of Chinese. Applied to a speaker-independent isolated-word speech recognition system with a 60,000-entry vocabulary, it raises recognition accuracy over a word trigram model alone from 81.7% to 86.6% for male voices (4.9 percentage points) and from 87.7% to 91.2% for female voices (3.5 percentage points).
Source
Acta Automatica Sinica (《自动化学报》)
EI
CSCD
PKU Core Journal (北大核心)
1999, No. 3, pp. 309-315 (7 pages)
Funding
National "863" High-Tech Program
Fok Ying Tung Foundation (霍英东基金)
Keywords
Speech recognition
Statistical language model
Markov model
Word lattice