Research on Speech Recognition and Phoneme Segmentation Based on Dynamic Bayesian Networks (cited by: 2)

Research on DBN-based continuous speech recognition and phoneme segment
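Abstract  This paper studies a speech recognition modeling method based on dynamic Bayesian networks (DBN). Using GMTK (the Graphical Models Toolkit), it builds phoneme-level, audio-stream DBN models for speech training and recognition, compares the results with those of conventional hidden Markov model (HMM) based speech recognition, and reports word and phoneme segmentation results. Experiments show that, across the tested signal-to-noise ratios (SNRs), DBN-based recognition performs comparably to HMM-based recognition and shows some robustness to noise, and that the phoneme segmentation results are fairly accurate. This paper describes a dynamic Bayesian network (DBN) based technique for continuous speech recognition. The word recognition accuracies and phoneme segmentation accuracies of the DBN-based system (implemented using the Graphical Models Toolkit) were compared with those of a classical HMM system. Results show that under various SNRs, the DBN-based and HMM-based systems perform similarly in speech recognition and phoneme segmentation, and that at much lower SNRs the DBN-based system performs even better than the HMM-based system.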
Source  Application Research of Computers (《计算机应用研究》), CSCD, PKU Core Journal, 2007, No. 10, pp. 104-106, 127 (4 pages)
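Funding  Northwestern Polytechnical University Fund (04XD0102); Science and Technology Cooperation Project between the Ministry of Science and Technology of China and the Flemish Region of Belgium (国科外函[2004]487)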
Keywords  dynamic Bayesian network; graphical model; graphical models toolkit; DBN; GM (graphical models); GMTK