A Study of Vocal Tract Length Normalization Based on a Single-Gaussian Model in Noisy Environments
Abstract: Vocal tract length normalization (VTLN) is a speaker adaptation method for speech recognition. This paper studies VTLN in noisy environments through a series of experiments. For the first time in a noisy setting, the warping factor is selected using single-Gaussian-mixture models, with good results. The experiments use the AURORA speech database, and the models are evaluated on its car-noise test set. The results show that VTLN improves recognition under every noise condition, to varying degrees. Iterative training improves on single-pass VTLN, and the best result is reached in the third iteration. In noise, warping factors selected with single-Gaussian-mixture models raise the average sentence recognition rate by nearly 1.68% compared with factors selected using other numbers of mixture components. After VTLN, the gender-independent model performs close to the gender-dependent model without VTLN, and with sufficient training data the gender-independent model with VTLN can perform better.
Source: Microprocessors (《微处理机》), 2006, Issue 5, pp. 102-105 (4 pages)
Keywords: vocal tract length normalization; speech recognition; speaker adaptation
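
The warping-factor selection summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the piecewise-linear warping function, the use of log spectra instead of MFCCs, the grid of candidate factors, and all function names (`warp_frequencies`, `warped_log_spectra`, `gaussian_log_likelihood`, `select_warping_factor`) are assumptions made for illustration. The sketch only shows the general idea of choosing, per speaker, the factor whose warped features score highest under a single diagonal Gaussian.

```python
import numpy as np


def warp_frequencies(freqs, alpha, f_max, cut_ratio=0.85):
    """Piecewise-linear warp (assumed form): scale by alpha up to a cutoff,
    then continue linearly so that f_max still maps to f_max."""
    f_cut = cut_ratio * f_max
    upper_slope = (f_max - alpha * f_cut) / (f_max - f_cut)
    return np.where(
        freqs <= f_cut,
        alpha * freqs,
        alpha * f_cut + upper_slope * (freqs - f_cut),
    )


def warped_log_spectra(spectra, freqs, alpha):
    """Resample each magnitude spectrum at the warped frequencies, then take logs."""
    warped = warp_frequencies(freqs, alpha, freqs[-1])
    return np.log(np.stack([np.interp(warped, freqs, frame) for frame in spectra]))


def gaussian_log_likelihood(features, mean, var):
    """Total log-likelihood of all frames under one diagonal Gaussian.
    The Jacobian of the warp is ignored here, a common simplification."""
    diff = features - mean
    return float(-0.5 * np.sum(np.log(2.0 * np.pi * var) + diff ** 2 / var))


def select_warping_factor(spectra, freqs, mean, var, alphas):
    """Grid search: return the candidate factor with the highest likelihood."""
    scores = [
        gaussian_log_likelihood(warped_log_spectra(spectra, freqs, a), mean, var)
        for a in alphas
    ]
    return float(alphas[int(np.argmax(scores))])
```

In an iterative scheme of the kind the abstract mentions, the acoustic model would be retrained on features warped with the selected factors, and the factors would then be re-selected against the new model. The abstract reports the best result in the third such round.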