
Noisy Chinese digit string speech recognition based on tone modeling

Cited by: 2
Abstract: This paper uses tone information to improve the performance of Chinese digit string speech recognition in noise. To handle the discontinuity of tone features, tones are modeled with a hidden Markov model based on multi-space probability distributions (MSD-HMM). The effect of noise on tone feature extraction is briefly analyzed, and the feasibility of using tone information in noisy digit string speech recognition is argued. Experimental results show that, compared with a method that uses no tone information, the proposed method achieves an average relative reduction of 17.2% in digit error rate on test data with SNRs from 5 dB to 20 dB. This indicates that tone information and the proposed MSD-HMM based tone model are effective for improving noisy Chinese digit string speech recognition.
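The abstract does not give the model equations. As a rough illustration of how a multi-space distribution can score a discontinuous pitch feature, the sketch below follows the general MSD formulation: an unvoiced frame falls in a zero-dimensional space and is scored by that space's weight alone, while a voiced frame is scored by a weighted sum of one-dimensional Gaussians. The function names, the use of `None` to mark unvoiced frames, and all parameter values are assumptions made for illustration; they are not taken from the paper.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def msd_observation_prob(f0, unvoiced_weight, voiced_components):
    """Score one frame under a single MSD state (illustrative sketch).

    f0: a float (e.g. log-F0) for a voiced frame, or None for an unvoiced frame.
    unvoiced_weight: weight of the zero-dimensional (unvoiced) space.
    voiced_components: list of (weight, mean, var) tuples for the 1-D spaces.
    The unvoiced weight and the voiced weights are assumed to sum to 1.
    """
    if f0 is None:
        # Zero-dimensional space: no density, only the space weight.
        return unvoiced_weight
    # One-dimensional spaces: weighted sum of Gaussian densities.
    return sum(w * gaussian_pdf(f0, m, v) for w, m, v in voiced_components)


# Hypothetical state: 30% unvoiced mass, one voiced Gaussian over log-F0.
state = {"unvoiced_weight": 0.3, "voiced_components": [(0.7, 5.0, 0.04)]}
p_unvoiced = msd_observation_prob(None, **state)
p_voiced = msd_observation_prob(5.0, **state)
```

Because voiced and unvoiced frames are scored by one state distribution, no interpolation of F0 across unvoiced regions is needed before training or decoding.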
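Source: Acta Acustica (《声学学报》), 2007, No. 5, pp. 454-460 (7 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (60575030)
Keywords: Chinese digit string; tone features; speech recognition; modeling method; hidden Markov models; recognition performance; discontinuity problem; probability distributions; feature extraction; noise abatement; parameter estimation; signal-to-noise ratio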

