AN IMPROVED ALGORITHM OF GMM VOICE CONVERSION SYSTEM BASED ON CHANGING THE TIME-SCALE

Abstract: This paper presents an improved voice conversion method based on Gaussian Mixture Models (GMM) that also changes the time-scale of speech. The Speech Transformation and Representation using Adaptive Interpolation of weiGHTed spectrum (STRAIGHT) model is adopted to extract the spectrum features, and GMMs are trained to generate the conversion function. The spectrum features of the source speech are converted by this function. The time-scale of speech is changed by extracting the converted features and adding them to the spectrum. The converted voice was evaluated by subjective and objective measurements. The results confirm that the transformed speech not only approximates the characteristics of the target speaker but is also more natural and more intelligible.
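The abstract does not give the form of the conversion function. As a minimal sketch, assuming the widely used joint-density GMM regression form (this record does not confirm it as the authors' exact formulation), the mapping for a single scalar feature could look like the code below. All names (`gaussian_pdf`, `convert`, and the dictionary keys) are hypothetical, and a real system would operate on multi-dimensional STRAIGHT spectral features with covariance matrices rather than scalars.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def convert(x, components):
    """Map one source feature value x toward the target speaker's space.

    Each component is a dict (hypothetical layout) with keys:
      w       mixture weight
      mu_x    mean of the source feature
      mu_y    mean of the target feature
      var_x   variance of the source feature
      cov_yx  cross-covariance between target and source features
    """
    # Weighted likelihood of x under the source marginal of each component.
    likelihoods = [
        c["w"] * gaussian_pdf(x, c["mu_x"], c["var_x"]) for c in components
    ]
    total = sum(likelihoods)
    # Posterior probability that x was generated by each component.
    posteriors = [lik / total for lik in likelihoods]
    # Posterior-weighted sum of per-component linear regressions.
    return sum(
        p * (c["mu_y"] + c["cov_yx"] / c["var_x"] * (x - c["mu_x"]))
        for p, c in zip(posteriors, components)
    )
```

With a single component the function reduces to one linear regression from source to target; with several components, the posteriors blend the per-component regressions smoothly. The sketch does not guard against the case where every likelihood underflows to zero for inputs far from all component means.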

Authors: Zhou Ying, Zhang Linghua

Affiliation: College of Telecommunications & Information Engineering

Source: Journal of Electronics (China) [电子科学学刊（英文版）], 2011, No. 4, pp. 518-523 (6 pages)

Funding: Supported by the National Natural Science Foundation of China (No. 60872105); the Program for Science & Technology Innovative Research Team of Qing Lan Project in Higher Educational Institutions of Jiangsu; and the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD).

Keywords: Gaussian Mixture Models (GMM); Speech Transformation and Representation using Adaptive Interpolation of weiGHTed spectrum (STRAIGHT); Time-scale; Voice conversion

Classification: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]
