
Robust multi-stream speech recognition based on weighting the output probabilities of feature components (cited by: 4)

Abstract: In traditional multi-stream fusion methods for speech recognition, all the feature components in a data stream share the same stream weight, even though their distortion levels usually differ when the recognizer works in noisy environments. To overcome this limitation of the traditional multi-stream frameworks, the current study proposes a new stream fusion method that weights not only the stream outputs but also the output probabilities of the feature components. The study analyzes how the stream weights and feature component weights in the new fusion method affect the decision, and proposes two stream fusion schemes based on the marginalisation and soft decision models used in missing data techniques. Experimental results on a hybrid sub-band multi-stream speech recognizer show that the proposed schemes adaptively adjust the influence of each stream on the decision and outperform the traditional multi-stream methods in various noisy environments.
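As a rough illustration of the idea in the abstract, the sketch below combines per-component log output probabilities using both a stream weight and a per-component weight. It is not the paper's formulation: the one-dimensional Gaussian component model, the log-domain weighted sum, the function names, and all numbers are assumptions made for this example only.

```python
import math


def component_log_prob(x, mean, var):
    """Log density of a one-dimensional Gaussian (assumed component model)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def fused_log_score(streams, stream_weights, component_weights):
    """Weighted sum of per-component log probabilities over all streams.

    streams: list of streams, each a list of (x, mean, var) tuples
    stream_weights: one weight per stream (hypothetical values)
    component_weights: one list of weights per stream, one per component
    """
    total = 0.0
    for comps, lam, gammas in zip(streams, stream_weights, component_weights):
        # Each component's log probability is scaled by its own weight,
        # then the whole stream is scaled by the stream weight.
        stream_score = sum(
            g * component_log_prob(x, m, v)
            for (x, m, v), g in zip(comps, gammas)
        )
        total += lam * stream_score
    return total


# Two hypothetical streams; the single component of the second one is
# treated as heavily distorted and given weight 0, so it does not
# contribute to the fused score.
streams = [[(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)], [(5.0, 0.0, 1.0)]]
score = fused_log_score(streams, [1.0, 1.0], [[1.0, 1.0], [0.0]])
```

In this sketch, a traditional multi-stream scheme corresponds to setting every component weight in a stream to 1, so that only the stream weights vary.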
Source: Chinese Journal of Acoustics (声学学报(英文版)), 2009, No. 3, pp. 269-279 (11 pages)
Funding: Supported by the National Natural Science Foundation of China (60502041, 60625101) and the Guangdong National Science Foundation (05300146).
