
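Study on a method for integrating phoneme-model and tone-model recognition probabilities based on a 3-dimensional Viterbi algorithm (cited by: 3)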

Study on the integration of phonetic and prosodic probability based on 3-dimension Viterbi search
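Abstract: This paper proposes a method for integrating the recognition probabilities of phoneme models and tone models in Chinese continuous speech recognition, based on a 3-dimensional Viterbi algorithm. The method uses HMMs for 60 phoneme units and HMMs for 8 tone units as the base recognition models. The recognition results of the phoneme and tone models are integrated by a frame-synchronous Viterbi algorithm over a 3-dimensional space spanned by the phoneme HMM states, the tone HMM states, and time. The paper also examines how the matching and integration perform under different path constraints, and demonstrates the effectiveness of the proposed method by comparison with the conventional matching and integration approach.

English abstract: This paper presents a new method of continuous speech recognition for Chinese in which phonetic and prosodic features are integrated by means of a 3-dimensional Viterbi search. The phonetic information is modeled with 60 phoneme HMMs and the prosodic information with 11 tone HMMs. Both models are synchronized by the 3-dimensional Viterbi search. We investigate methods of integrating the phonetic and prosodic likelihoods under different search-path constraints and compare them with the traditional method in experiments on continuous Chinese speech recognition. The effectiveness of the proposed approach is verified.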
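The frame-synchronous search over phoneme HMM states, tone HMM states, and time described in the abstract can be sketched roughly as below. This is a minimal illustration under several assumptions, not the paper's implementation: both state chains are taken to be left-to-right, transition probabilities are omitted (treated as uniform), the per-frame log-likelihood tables `phone_ll` and `tone_ll` are assumed to be precomputed, and the `moves` and `tone_weight` parameters are hypothetical stand-ins for the path constraints and for the relative weighting of the two likelihoods.

```python
NEG_INF = float("-inf")

# Hypothetical default: each chain may stay or advance by one state per frame.
FREE_MOVES = ((0, 0), (1, 0), (0, 1), (1, 1))


def viterbi_3d(phone_ll, tone_ll, moves=FREE_MOVES, tone_weight=1.0):
    """Best joint log-score over (phoneme state, tone state, time).

    phone_ll[t][i]: log-likelihood of frame t under phoneme state i (assumed given).
    tone_ll[t][j]:  log-likelihood of frame t under tone state j (assumed given).
    moves: allowed (di, dj) state advances per frame; restricting this set is
           one simple way to express a path constraint (an assumption here).
    """
    n_frames = len(phone_ll)
    n_phone = len(phone_ll[0])
    n_tone = len(tone_ll[0])

    # score[i][j]: best log-score of any path that is in (i, j) at the current frame.
    score = [[NEG_INF] * n_tone for _ in range(n_phone)]
    score[0][0] = phone_ll[0][0] + tone_weight * tone_ll[0][0]

    for t in range(1, n_frames):  # frame-synchronous: one time step per sweep
        new = [[NEG_INF] * n_tone for _ in range(n_phone)]
        for i in range(n_phone):
            for j in range(n_tone):
                best = NEG_INF
                for di, dj in moves:
                    pi, pj = i - di, j - dj
                    if pi >= 0 and pj >= 0 and score[pi][pj] > best:
                        best = score[pi][pj]
                if best > NEG_INF:
                    new[i][j] = best + phone_ll[t][i] + tone_weight * tone_ll[t][j]
        score = new

    # Both chains must end in their final state.
    return score[n_phone - 1][n_tone - 1]


# Toy data: 3 frames, 2 phoneme states, 2 tone states (made-up numbers).
phone_ll = [[-1.0, -9.0], [-1.0, -2.0], [-9.0, -1.0]]
tone_ll = [[-1.0, -9.0], [-9.0, -1.0], [-9.0, -1.0]]

free_score = viterbi_3d(phone_ll, tone_ll)
sync_score = viterbi_3d(phone_ll, tone_ll, moves=((0, 0), (1, 1)))
```

In this toy case the unconstrained search lets the tone chain advance one frame before the phoneme chain, while the synchronous-only move set forces both to advance together, so the constrained score cannot exceed the unconstrained one.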
Source: Acta Acustica (《声学学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2001, No. 3, pp. 259-263 (5 pages)
Funding: National Natural Science Foundation of China (Grant No. 69871009)

