ON USING NON-LINEAR CANONICAL CORRELATION ANALYSIS FOR VOICE CONVERSION BASED ON GAUSSIAN MIXTURE MODEL

Abstract: A voice conversion algorithm aims to provide a high level of similarity to the target voice with an acceptable level of quality. The main objective of this paper was to build a nonlinear relationship between the acoustic feature parameters of the source and target speakers using Non-Linear Canonical Correlation Analysis (NLCCA) based on a joint Gaussian mixture model. Speaker individuality transformation was achieved mainly by altering the vocal tract characteristics represented by Line Spectral Frequencies (LSF). To obtain transformed speech that sounded more like the target voice, prosody modification was performed through residual prediction. Both objective and subjective evaluations were conducted. The experimental results demonstrated that the proposed algorithm was effective and outperformed the conventional conversion method based on Minimum Mean Square Error (MMSE) estimation.
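For orientation, the conventional baseline named in the abstract (a joint GMM with MMSE estimation) can be sketched as below. This is a minimal illustration of the standard GMM conversion function, E[y|x] = Σ_i p_i(x) [μ_y,i + Σ_yx,i Σ_xx,i⁻¹ (x − μ_x,i)], and not the paper's NLCCA method. The function and parameter names (`mmse_convert`, `weights`, `mu_x`, `mu_y`, `cov_xx`, `cov_yx`) are hypothetical, and the GMM parameters are assumed to have been trained beforehand on time-aligned source and target feature vectors such as LSFs.

```python
import numpy as np


def gaussian_pdf(x, mean, cov):
    """Density of a multivariate normal N(mean, cov) evaluated at x."""
    d = mean.shape[0]
    diff = x - mean
    solved = np.linalg.solve(cov, diff)      # cov^{-1} (x - mean)
    log_det = np.linalg.slogdet(cov)[1]      # log |cov|
    log_p = -0.5 * (d * np.log(2.0 * np.pi) + log_det + diff @ solved)
    return np.exp(log_p)


def mmse_convert(x, weights, mu_x, mu_y, cov_xx, cov_yx):
    """Convert one source vector x with the conventional GMM/MMSE mapping.

    All parameter names are illustrative assumptions. Each argument after x
    is a sequence with one entry per mixture component.
    """
    # Posterior probability of each component given the source vector.
    lik = np.array([w * gaussian_pdf(x, m, c)
                    for w, m, c in zip(weights, mu_x, cov_xx)])
    post = lik / lik.sum()

    # Posterior-weighted sum of the per-component linear regressions.
    y = np.zeros_like(mu_y[0], dtype=float)
    for p, mx, my, cxx, cyx in zip(post, mu_x, mu_y, cov_xx, cov_yx):
        y += p * (my + cyx @ np.linalg.solve(cxx, x - mx))
    return y
```

With a single mixture component the mapping reduces to one linear regression from source to target features; with several components it becomes a soft, piecewise-linear mapping weighted by the component posteriors.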

Authors: Jian Zhihua, Yang Zhen

Affiliations: School of Communication Engineering; School of Communication and Information Engineering

Source: Journal of Electronics (China) (电子科学学刊（英文版）), 2010, No. 1, pp. 1-7 (7 pages)

Funding: Supported by the National High Technology Research and Development Program of China (863 Program, No. 2006AA010102)

Keywords: Speech processing; Voice conversion; Non-Linear Canonical Correlation Analysis (NLCCA); Gaussian Mixture Model (GMM)

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]


