Tone model integration based on discriminative weight training for Putonghua speech recognition

Abstract: A discriminative framework for tone model integration in continuous speech recognition is proposed. The method uses model-dependent weights to scale the probabilities of the hidden Markov models, which are based on spectral features, and of the tone models, which are based on tonal features. The weights are discriminatively trained under the minimum phone error criterion. An update equation for the model weights is derived from the extended Baum-Welch algorithm. Various schemes of model weight combination are evaluated, and a smoothing technique is introduced to make training robust to overfitting. The proposed method is evaluated on two speech recognition tasks: tonal syllable output and character output. The experimental results show that the proposed method achieves relative error reductions of 9.5% and 4.7% over a global weight on the two tasks, owing to a better interpolation of the given models. This demonstrates the effectiveness of discriminatively trained model weights for tone model integration.
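The abstract does not give the scoring formula, so the following is only a minimal sketch of one common way to read "model-dependent weights that scale probabilities": a log-linear combination of a spectral (HMM) log-probability and a tone-model log-probability, with a weight pair looked up per model and a global pair as fallback. All names (`MODEL_WEIGHTS`, `GLOBAL_WEIGHTS`, `score_segment`) and all numeric values are hypothetical illustrations, not taken from the paper. The helper `relative_error_reduction` shows how a figure such as 9.5% would be computed from two error rates.

```python
# Hypothetical weights: (spectral weight, tone weight) per model identifier.
# In the paper these would be trained discriminatively; here they are made up.
MODEL_WEIGHTS = {"a1": (1.0, 0.6), "a4": (1.0, 0.9)}

# Hypothetical global weights, used when a model has no dedicated entry.
GLOBAL_WEIGHTS = (1.0, 0.7)


def combined_log_score(log_p_spectral, log_p_tone, w_spectral, w_tone):
    """Log-linear interpolation: w_s * log p_s + w_t * log p_t."""
    return w_spectral * log_p_spectral + w_tone * log_p_tone


def score_segment(model_id, log_p_spectral, log_p_tone, weights=MODEL_WEIGHTS):
    """Score one segment using model-dependent weights, else the global pair."""
    w_s, w_t = weights.get(model_id, GLOBAL_WEIGHTS)
    return combined_log_score(log_p_spectral, log_p_tone, w_s, w_t)


def relative_error_reduction(baseline_error, new_error):
    """Fraction by which the error rate drops relative to the baseline."""
    return (baseline_error - new_error) / baseline_error


if __name__ == "__main__":
    # Model-dependent weighting versus the global fallback for an unseen model.
    print(score_segment("a1", -10.0, -2.0))
    print(score_segment("unseen", -10.0, -2.0))
    # Illustrative error rates only; the abstract reports just the relative figures.
    print(relative_error_reduction(20.0, 18.1))
```

Working in the log domain turns the probability scaling into a weighted sum, which is why a per-model weight pair is enough to change how strongly the tone model influences each hypothesis.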
Source: Chinese Journal of Acoustics (声学学报(英文版)), 2008, No. 3, pp. 193-202 (10 pages)