A voice conversion system based on GMM and residual prediction (基于高斯混合模型和残差预测的说话人转换系统)

Cited by: 4

Abstract: Voice conversion transforms the characteristics of speech uttered by a source speaker so that a listener perceives it as having been uttered by a target speaker. The proposed system has two main parts. The first part converts the spectral envelope with a Gaussian mixture model (GMM) trained on time-aligned speech from the source and target speakers. The second part predicts the residual signal (the spectral detail) from the transformed LPC parameters, using a classifier and residual codebooks. The system builds on existing voice conversion systems with several modifications: it no longer requires speakers to mimic another speaker's prosody or to speak in a monotone, and it outperforms existing systems on some performance metrics.
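The abstract does not state the conversion function, so the following is only a minimal sketch of the general technique it names: a GMM-based regression of the kind described in references such as Kain and Macon (2001) and Stylianou et al. (1998), where the converted feature is a posterior-weighted sum of per-component linear regressions. It assumes scalar features for readability (real systems use vectors of spectral parameters), and the dictionary keys and parameter values are hypothetical, not taken from the paper.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def convert(x, components):
    """Map a source feature x to the target space with a GMM regression.

    Each component is a dict with hypothetical keys:
      w       mixture weight
      mu_x    source mean,   var_x   source variance
      mu_y    target mean,   cov_yx  target/source cross-covariance
    """
    # Posterior probability that x was generated by each mixture component.
    likelihoods = [c["w"] * gaussian_pdf(x, c["mu_x"], c["var_x"]) for c in components]
    total = sum(likelihoods)
    posteriors = [lik / total for lik in likelihoods]
    # Posterior-weighted sum of per-component linear regressions.
    return sum(
        p * (c["mu_y"] + c["cov_yx"] / c["var_x"] * (x - c["mu_x"]))
        for p, c in zip(posteriors, components)
    )

# Illustrative (made-up) parameters for a two-component model.
components = [
    {"w": 0.5, "mu_x": -1.0, "var_x": 1.0, "mu_y": -2.0, "cov_yx": 0.5},
    {"w": 0.5, "mu_x": 1.0, "var_x": 1.0, "mu_y": 3.0, "cov_yx": 0.5},
]
converted = convert(0.0, components)
```

In practice the component parameters would be estimated from time-aligned source and target frames, as the abstract describes. The residual prediction stage (classifier plus residual codebooks) is not sketched here because the abstract gives no detail on it.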

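Authors: 吕声 (Lü Sheng), 尹俊勋 (Yin Junxun), 黄建成 (Huang Jiancheng)

Affiliations: School of Electronic and Information Engineering, South China University of Technology (华南理工大学电子与信息学院); Motorola China Research Center (摩托罗拉中国研究中心)

Source: 《电声技术》 (Audio Engineering), Peking University core journal, 2004, No. 6, pp. 33-36 (4 pages)

Keywords: voice conversion; Gaussian mixture model; residual prediction; spectral envelope

Classification number: TN912.33 [Electronics and Telecommunications: Communication and Information Systems]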

Citation network
Related literature

References (4)

1. Kain A, Macon M W. Design and evaluation of a voice conversion algorithm based on spectral envelope mapping and residual prediction. In: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, 2001, 2: 813-816.
2. Arslan L. Speaker transformation algorithm using segmental codebooks. Speech Communication, 1999, 28: 211-226.
3. Stylianou Y, Cappé O, Moulines E. Statistical methods for voice quality transformation. In: Proc. EUROSPEECH, 1995.
4. Stylianou Y, Cappé O, Moulines E. Continuous probabilistic transform for voice conversion. IEEE Transactions on Speech and Audio Processing, 1998, 6(2): 131-142.

Co-cited literature (17)

1. Kang Yongguo, Shuang Zhiwei, Tao Jianhua, Zhang Wei. Research on a voice conversion algorithm based on a hybrid mapping model [J]. 声学学报 (Acta Acustica), 2006, 31(6): 555-562. Cited by: 13
2. Zhang Kai, Zhu Lixin, Zhao Yizheng. Research on an improved voice conversion method based on Gaussian mixture models. 声学技术 (Technical Acoustics), 2008, 27(3): 392-397.
3. Stylianou Y, Cappé O, Moulines E. Continuous probabilistic transform for voice conversion [J]. IEEE Transactions on Speech and Audio Processing, 1998, 6(2): 131-142.
4. Kain. High resolution voice transformation [D]. Computer Science and Mathematics, Rockford College, 1995, 47-52.
5. Qin Long, Chen Gaopeng, Ling Zhenghua. An improved spectral and prosodic transformation method in STRAIGHT-based voice conversion [A]. ICASSP [C]. 2005, 21-24.
6. Toda T, Black A W, Tokuda K. Spectral conversion based on maximum likelihood estimation considering global variance of converted parameter. Proceedings of ICASSP 2005, 2005, 1: 9-12.
7. Chen Yining, Chu Min, Chang E, et al. Voice conversion with smoothed GMM and MAP adaptation. Proc. Eurospeech. Geneva, Switzerland: ISCA, Sept. 2003: 2413-2416.
8. Toda T. High-quality and flexible speech synthesis with segment selection and voice conversion. Graduate School of Information Science, Nara Institute of Science and Technology, 2003.
9. Stylianou Y, Cappé O, Moulines E. Continuous probabilistic transform for voice conversion. IEEE Transactions on Speech and Audio Processing, 1998, 6: 131-142.
10. Abe M. A segment-based approach to voice conversion. Proc. IEEE ICASSP, 1991, 2: 765-768.

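Citing literature (4)

1. Xia Jing, Yin Junxun, Huang Jiancheng, Huang Feng. A voice conversion method based on the sinusoids-plus-noise model [J]. 电声技术 (Audio Engineering), 2005, 29(2): 49-52. Cited by: 1
2. Zhang Kai, Zhu Lixin, Zhao Yizheng. A voice conversion method based on retrained Gaussian mixture models [J]. 声学技术 (Technical Acoustics), 2010, 29(1): 52-55. Cited by: 4
3. Zhao Yizheng. Research on voice conversion algorithms that improve GMM spectral envelope conversion performance [J]. 科学技术与工程 (Science Technology and Engineering), 2010, 10(17): 4172-4174. Cited by: 3
4. Zhao Yizheng. A voice conversion method that improves the mean term of the Gaussian mixture model [J]. 微型机与应用 (Microcomputer & Its Applications), 2012, 31(19): 68-70.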

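Second-level citing literature (6)

1. Guo Xin, Yu Fengqin. Speech enhancement based on combined matching pursuit and subspace methods [J]. 电声技术 (Audio Engineering), 2008, 32(9): 52-55. Cited by: 1
2. Xu Xin, Li Meiting. Research on voice conversion based on a spectral envelope algorithm [J]. 数字技术与应用 (Digital Technology and Application), 2011, 29(9): 123-125. Cited by: 1
3. Zhai Jiyou, Zhang Peng. Optimization of the parameter estimation algorithm for Gaussian mixture models [J]. 计算机技术与发展 (Computer Technology and Development), 2011, 21(11): 145-148. Cited by: 7
4. Zhou Keliang, Wang Yaguang, Ye Cen. Research on feature analysis and recognition methods for heart sound signals [J]. 广西师范大学学报（自然科学版） (Journal of Guangxi Normal University, Natural Science Edition), 2015, 33(3): 34-44. Cited by: 4
5. Shen Huiling, Wan Yongjing. Application of an adaptive Gaussian mixture model based on predicted spectral shift to voice conversion [J]. 华东理工大学学报（自然科学版） (Journal of East China University of Science and Technology, Natural Science Edition), 2017, 43(4): 546-552. Cited by: 2
6. Wang Lin, Huang Hao. Voice conversion with pre-trained representations, hybrid vector quantization, and CTC [J]. 计算机工程 (Computer Engineering), 2024, 50(4): 313-320.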

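1. Lü Sheng, Yin Junxun. Implementation of voice conversion between speakers of the same language [J]. 移动通信 (Mobile Communications), 2004, 0(S3): 24-27.
2. Zhao Haifeng, Yang Man, Que Dashun, Feng Chengcheng. Design of a speech synthesis and conversion system based on individual characteristics [J]. 电脑与信息技术 (Computer and Information Technology), 2012, 20(3): 17-20. Cited by: 1
3. Zhang Chuanfu, Wu Weiling. A handover algorithm based on signal prediction for mixed services [J]. 北京邮电大学学报 (Journal of Beijing University of Posts and Telecommunications), 2002, 25(4): 51-55.
4. Zhang Bing, Yu Yibiao. Voice conversion based on an improved GMM and joint prosody and short-time spectrum [J]. 信号处理 (Signal Processing), 2009, 25(4): 548-552. Cited by: 2
5. Li Ya, Bu Zhiyong. An adaptive handover scheme based on signal prediction [J]. 微计算机信息 (Microcomputer Information), 2009, 25(3): 93-94. Cited by: 2
6. Li Yanqiang. Application of multi-camera recording technology in variety shows [J]. 现代电视技术 (Advanced Television Engineering), 2017, 0(3): 116-121.
7. Du Jia, Chen Yanpu, Yang Junqiang. Research on acoustic feature parameters between specific speakers [J]. 计算机应用 (Journal of Computer Applications), 2009, 29(B12): 275-278. Cited by: 2
8. Chen Guoming, Zhong Lili. Improvement of disparity techniques in 3D video communication [J]. 广州航海学院学报 (Journal of Guangzhou Maritime University), 2016, 24(1): 32-36.
9. Xia Jing, Yin Junxun, Huang Jiancheng, Huang Feng. A voice conversion method based on the sinusoids-plus-noise model [J]. 电声技术 (Audio Engineering), 2005, 29(2): 49-52. Cited by: 1
10. Wang Kenan, Hao Shiqi, Xue Lei, Wang Chunying. Prediction performance analysis of frequency-hopping network signals based on chaos theory [J]. 航天电子对抗 (Aerospace Electronic Warfare), 2004, 33(6): 48-50.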
