AN IMPROVED ALGORITHM OF GMM VOICE CONVERSION SYSTEM BASED ON CHANGING THE TIME-SCALE

Abstract: This paper presents an improved voice conversion method based on Gaussian Mixture Models (GMM) that changes the time-scale of speech. The Speech Transformation and Representation using Adaptive Interpolation of weiGHTed spectrum (STRAIGHT) model is adopted to extract the spectrum features, and the GMMs are trained to generate the conversion function. The spectrum features of the source speech are converted by the conversion function. The time-scale of speech is changed by extracting the converted features and adding them to the spectrum. The converted voice was evaluated by subjective and objective measurements. The results confirm that the transformed speech not only approximates the characteristics of the target speaker but is also more natural and more intelligible.
Source: Journal of Electronics (China) (电子科学学刊, English edition), 2011, No. 4, pp. 518-523 (6 pages)
Funding: Supported by the National Natural Science Foundation of China (No. 60872105), the Program for Science & Technology Innovative Research Team of Qing Lan Project in Higher Educational Institutions of Jiangsu, and the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD).
Keywords: Gaussian Mixture Models (GMM); Speech Transformation and Representation using Adaptive Interpolation of weiGHTed spectrum (STRAIGHT); Time-scale; Voice conversion
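
The abstract refers to a GMM-based conversion function but this record does not give its form. As an illustration only, the sketch below uses the joint-density GMM mapping that is common in the voice conversion literature, F(x) = Σ_i p(i|x) [μ_y,i + Σ_yx,i Σ_xx,i⁻¹ (x − μ_x,i)]. This is an assumption, not necessarily the function used in the paper. The names `gaussian_pdf` and `convert_frame`, and all parameter values in the toy example, are hypothetical. The GMM parameters are taken as already trained; STRAIGHT feature extraction and the time-scale step are not shown.

```python
import numpy as np


def gaussian_pdf(x, mean, cov):
    """Multivariate normal density evaluated at a single vector x."""
    d = x.shape[0]
    diff = x - mean
    solved = np.linalg.solve(cov, diff)      # cov^{-1} (x - mean)
    log_det = np.linalg.slogdet(cov)[1]      # log |cov|
    log_p = -0.5 * (d * np.log(2.0 * np.pi) + log_det + diff @ solved)
    return np.exp(log_p)


def convert_frame(x, weights, mu_x, mu_y, cov_xx, cov_yx):
    """Map one source spectral vector x toward the target speaker.

    Assumed form (joint-density GMM regression):
    F(x) = sum_i p(i|x) * (mu_y[i] + cov_yx[i] @ inv(cov_xx[i]) @ (x - mu_x[i]))
    """
    # Posterior probability of each mixture component given x.
    likelihoods = np.array([
        w * gaussian_pdf(x, m, c) for w, m, c in zip(weights, mu_x, cov_xx)
    ])
    posteriors = likelihoods / likelihoods.sum()

    # Posterior-weighted sum of the per-component linear regressions.
    y = np.zeros_like(mu_y[0], dtype=float)
    for p, mx, my, cxx, cyx in zip(posteriors, mu_x, mu_y, cov_xx, cov_yx):
        y += p * (my + cyx @ np.linalg.solve(cxx, x - mx))
    return y


# Toy parameters (hypothetical): 2 mixture components, 2-dimensional features.
weights = np.array([0.5, 0.5])
mu_x = np.array([[0.0, 0.0], [10.0, 10.0]])
mu_y = np.array([[1.0, 1.0], [20.0, 20.0]])
cov_xx = np.array([np.eye(2), np.eye(2)])
cov_yx = np.array([0.5 * np.eye(2), 0.5 * np.eye(2)])

converted = convert_frame(np.array([0.2, -0.1]), weights, mu_x, mu_y, cov_xx, cov_yx)
```

In a full system this function would be applied frame by frame to the source spectral features before resynthesis. Solving the linear system with `np.linalg.solve` avoids forming an explicit matrix inverse.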