Continuous Speech Recognition for Large Vocabulary Based on Triphone DBN Model (cited by: 3)
Abstract: To account for coarticulation in continuous speech, context-dependent triphone units are introduced into the word-phone structure dynamic Bayesian network (WP-DBN) model and the word-phone-state structure DBN (WPS-DBN) model. This yields two novel single-stream DBN models: the word-triphone structure DBN (WT-DBN) model and the word-triphone-state structure DBN (WTS-DBN) model. The WTS-DBN model is a triphone model whose recognition unit is the triphone, and it explicitly emulates a conventional hidden Markov model (HMM) with tied triphone states. Large-vocabulary speech recognition experiments show that, on clean speech, the recognition rate of the WTS-DBN model is 20.53%, 40.77%, 42.72%, and 7.52% higher than those of the HMM, WT-DBN, WP-DBN, and WPS-DBN models, respectively.
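To make the notion of a context-dependent triphone unit concrete, the following minimal sketch expands a phone sequence into triphone labels. It is not the authors' implementation: the function name `to_triphones`, the left-center+right label notation, and the `sil` boundary symbol are all assumptions chosen for illustration.

```python
def to_triphones(phones, boundary="sil"):
    """Expand a phone sequence into context-dependent triphone labels.

    Each label has the form left-center+right (an assumed notation),
    and the assumed boundary symbol pads both ends of the sequence.
    """
    padded = [boundary] + list(phones) + [boundary]
    return [
        f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
        for i in range(1, len(padded) - 1)
    ]


# Hypothetical three-phone word used only as an example.
print(to_triphones(["k", "ae", "t"]))
# prints ['sil-k+ae', 'k-ae+t', 'ae-t+sil']
```

Each phone becomes a distinct unit depending on its neighbors, which is why triphone systems typically tie states across similar contexts to keep the number of parameters manageable.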
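Source: Journal of Data Acquisition and Processing (《数据采集与处理》), CSCD, Peking University Core Journal, 2009, No. 1, pp. 1-6 (6 pages)
Funding: Supported by the China Postdoctoral Science Foundation (20080431251) and the National High Technology Research and Development Program of China ("863" Program) (2007AA01Z324)
Keywords: speech recognition; dynamic Bayesian network; triphone; phone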
References (8)

  • 1. Murphy K. Dynamic Bayesian networks: representation, inference and learning [D]. Berkeley: University of California, 2002.
  • 2. Bilmes J, Zweig G. The graphical models toolkit: an open source software system for speech and time-series processing [C]//Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP). Orlando, Florida, USA: [s. n.], 2002(4): 3916-3919.
  • 3. Bilmes J, Bartels C. Graphical model architectures for speech recognition [J]. IEEE Signal Processing Magazine, 2005, 22(5): 89-100.
  • 4. Zweig G. Speech recognition with dynamic Bayesian networks [D]. Berkeley: University of California, 1998.
  • 5. Bilmes J, Zweig G, Richardson T, et al. Discriminatively structured graphical models for speech recognition: JHU-WS-2001 final workshop report [EB/OL]. http://www.clsp.jhu.edu/ws2001/groups/gmsr/GMRO-final-rpt.pdf, Johns Hopkins Univ, Baltimore, MD, Tech Rep CLSP, 2001.
  • 6. Lv Guoyun, Jiang Dongmei, Sahli H, et al. A novel DBN model for large vocabulary continuous speech recognition and phone segmentation [C]//International Conference on Artificial Intelligence and Pattern Recognition (AIPR-07). Orlando, USA: [s. n.], 2007, 1: 397-402.
  • 7. Young S J, Odell J, Woodland P C. Tree-based state tying for high accuracy acoustic modeling [C]//Proceedings ARPA Workshop on Human Language Technology. Plainsboro, NJ, USA: [s. n.], 1994: 307-312.
  • 8. Bilmes J. GMTK: the graphical models toolkit [EB/OL]. http://ssli.ee.washington.edu/~bilmes/gmtk/, 2002.
