Research of whispered speech vocal tract system conversion based on universal background model and effective Gaussian components (cited by: 1)

Abstract: To address the weaknesses of existing fixed-value mapping methods (method_F), a vocal tract system conversion method based on the universal background model (UBM) is proposed to improve the performance of a system that converts Chinese whispered speech to normal speech. Because the UBM has a large number of components, the errors introduced by the statistical model of the acoustic probability density cannot be ignored. A method for choosing effective Gaussian mixture components, based on the posterior probability summation of the minimum spectral distortion, is therefore developed to optimize system performance. The proposed method (method_U) is analyzed and compared with method_F using a performance index (PI) based on the Itakura-Saito spectral distortion measure. Experiments show that the performance of method_U is more stable than that of method_F across different speakers and different phonemes, and that the average PI of method_U is better than that of method_F. Selecting effective Gaussian mixture components improves the PI of method_U by a further 5.11%. Subjective listening tests also show that the proposed method improves the clarity and intelligibility of the converted speech.
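The abstract names the Itakura-Saito spectral distortion as the basis of the performance index but does not give the PI formula itself. As a rough illustration only, the sketch below computes the standard discrete Itakura-Saito distortion between two power spectra. The function name, the `eps` floor, and the averaging over frequency bins are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def itakura_saito_distortion(p_ref, p_test, eps=1e-12):
    """Mean Itakura-Saito distortion between two power spectra.

    p_ref, p_test: non-negative power spectra sampled on the same
    frequency bins. eps is an illustrative floor (an assumption of this
    sketch) that avoids division by zero and log(0).
    """
    p_ref = np.asarray(p_ref, dtype=float) + eps
    p_test = np.asarray(p_test, dtype=float) + eps
    ratio = p_ref / p_test
    # Each term ratio - log(ratio) - 1 is >= 0 and equals 0 only when
    # the two spectra agree at that bin.
    return float(np.mean(ratio - np.log(ratio) - 1.0))


if __name__ == "__main__":
    target = np.array([1.0, 2.0, 4.0])     # hypothetical target spectrum
    converted = np.array([2.0, 2.0, 2.0])  # hypothetical converted spectrum
    print(itakura_saito_distortion(target, converted))
```

The measure is zero for identical spectra and grows as the converted spectrum departs from the target, which is why a lower distortion would correspond to a better conversion under any index derived from it.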
Source: Chinese Journal of Acoustics (声学学报(英文版)), 2013, No. 4, pp. 400-410 (11 pages)
Funding: Supported by the National Natural Science Foundation of China (61071215), the Science and Technology Foundation of Suzhou (SYG201033), and the Pre-research Foundation of Soochow University (Q311901111, 14317399).