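Research on a source-target speaker voice conversion algorithm based on classified linear weighting (基于分类线性加权的源-目标话者声音转换算法的研究). Cited by: 1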

Voice conversion from source speaker to target speaker based on classified linearly weighted transformation

Abstract: Source-target speaker voice conversion is a technique for changing speaker characteristics: it converts the speech of a source speaker so that it sounds like the speech of a designated target speaker. Modifying the source speaker's vocal tract spectral characteristics (the spectral envelope) is one of the keys to voice conversion. To reduce the error that inaccurate classification introduces into conventional classified linear transformation algorithms, this paper adopts a classified linearly weighted transformation strategy. The transformation function of each sub-class is assigned a weighting coefficient according to its contribution to the spectral characteristics, which yields a linear transformation algorithm weighted by GMM posterior probabilities. Four groups of comparative experiments on the Microsoft Mandarin Chinese speech database show that the algorithm improves spectral conversion performance, to varying degrees, in every case.
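The abstract gives no formulas, so the following is only a minimal sketch of the general idea it describes, under stated assumptions: a diagonal-covariance GMM is fitted to source spectral feature vectors, each mixture component (sub-class) i has its own linear transform (A_i, b_i), and the converted vector is the posterior-weighted sum y = Σ_i P(i | x)(A_i x + b_i). The function names, the diagonal-covariance choice, and the toy parameters are illustrative assumptions, not the authors' implementation. A hard-classification variant is included for contrast, since that is the kind of scheme whose classification errors the weighting is meant to soften.

```python
import numpy as np

def gmm_posteriors(x, weights, means, variances):
    """Posterior P(i | x) of each component of a diagonal-covariance GMM.

    x: (D,), weights: (M,), means: (M, D), variances: (M, D).
    """
    diff = x - means                                   # (M, D)
    log_lik = -0.5 * (np.sum(diff ** 2 / variances, axis=1)
                      + np.sum(np.log(2.0 * np.pi * variances), axis=1))
    log_joint = np.log(weights) + log_lik              # log of w_i * N(x; mu_i, var_i)
    log_joint -= log_joint.max()                       # shift for numerical stability
    post = np.exp(log_joint)
    return post / post.sum()

def convert_hard(x, post, A, b):
    """Classified linear conversion: use only the most probable sub-class."""
    i = int(np.argmax(post))
    return A[i] @ x + b[i]

def convert_weighted(x, post, A, b):
    """Classified linearly weighted conversion: blend all sub-class transforms."""
    per_class = np.einsum("mij,j->mi", A, x) + b       # (M, D): A_i x + b_i for every i
    return post @ per_class                            # posterior-weighted sum

# Toy parameters (hypothetical): two sub-classes in a 2-D feature space.
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [2.0, 0.0]])
variances = np.ones((2, 2))
A = np.stack([np.eye(2), 2.0 * np.eye(2)])
b = np.array([[0.0, 0.0], [1.0, 1.0]])

x = np.array([1.0, 0.0])                               # lies midway between the two means
post = gmm_posteriors(x, weights, means, variances)
print(post, convert_weighted(x, post, A, b))
```

For a frame near a class boundary, as in the toy input above, the hard scheme has to commit to a single transform, whereas the weighted scheme blends the two sub-class outputs in proportion to their posteriors.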

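Authors: Zhang Jian (张剑), Dai Beiqian (戴蓓蒨), Sun Jun (孙俊), Lu Wei (陆伟), Li Hui (李辉)

Affiliation: Department of Electronic Science and Technology, University of Science and Technology of China

Source: Journal of Circuits and Systems (《电路与系统学报》), indexed in CSCD and the Peking University Core Journals list, 2008, No. 3, pp. 106-110, 105 (6 pages in total)

Keywords: voice conversion; source-target speaker; vocal tract spectral (spectral envelope) transformation; Gaussian mixture model; classified linear transformation; classified linearly weighted transformation

Classification number: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]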

References (8)

1. E Moulines, et al. Voice conversion: state of the art and perspectives [J]. Elsevier, 1995-02, 16(2): 125-126.
2. M Abe, et al. Voice conversion through vector quantization [A]. Proceedings of ICASSP [C]. 1988, 1: 655-658.
3. 左国玉,刘文举,阮晓钢.声音转换技术的研究与进展[J].电子学报,2004,32(7):1165-1172. Cited by: 32
4. H Valbret, et al. Voice transformation using PSOLA technique [J]. Speech Communication, 1992, 11: 175-187.
5. Ye Hui, Young Steve. Perceptually Weighted Linear Transformation for Voice Conversion [A]. Proceedings of Eurospeech [C]. 2003. 2409-2412.
6. Eric Chang, Y Shi, J Zhou, C Huang. Speech lab in a box: a Mandarin speech toolbox to jumpstart speech related research [A]. Proceedings of Eurospeech [C]. 2001. 2799-2802.
7. Athanasios Mouchtaris, et al. Non-parallel training for voice conversion by maximum likelihood constrained adaptation [A]. Proceedings of ICASSP [C]. 2004-05, 1: 1-4.
8. A Kain, M Macon. Spectral voice conversion for text-to-speech synthesis [A]. Proceedings of ICASSP [C]. 1998-05, 1: 285-288.

Secondary references (56)

1. H Kuwabara and Y Sagisaka. Acoustic characteristics of speaker individuality: control and conversion [J]. Speech Communication, 1995, 16(2): 165-173.
2. D Klatt and L C Klatt. Analysis, synthesis, and perception of voice quality variations among female and male talkers [J]. J Acoust Soc Am, 1990, 87(2): 820-857.
3. P H Milenkovic. Voice source model for continuous control of pitch period [J]. J Acoust Soc Am, 1993, 93(2): 1087-1096.
4. H Matsumoto, et al. Multidimensional representation of personal quality of vowels and its acoustical correlates [J]. IEEE Trans Audio and Electroacoustics, 1973, 21(5): 428-436.
5. S Furui. Research on individuality features in speech waves and automatic speaker recognition techniques [J]. Speech Communication, 1986, 5(2): 183-197.
6. K S Lee, et al. A new voice transformation based on both linear and nonlinear prediction [A]. Proc ICSLP [C]. Philadelphia, USA: ESCA, 1996. 1401-1404.
7. L M Arslan. Speaker transformation algorithm using segmental codebooks (STASC) [J]. Speech Communication, 1999, 28(3): 211-226.
8. H Mizuno and M Abe. Voice conversion algorithm based on piecewise linear conversion rules of formant frequency and spectrum tilt [J]. Speech Communication, 1995, 16(2): 165-173.
9. T Yoshimura, et al. Speaker interpolation in HMM-based speech synthesis system [A]. Proc Eurospeech [C]. Rhodes, Greece: ESCA, 1997. 2523-2526.
10. D G Childers. Glottal source modeling for voice conversion [J]. Speech Communication, 1995, 16(2): 127-138.

Co-citing documents (31)

1. 吴梅,冯瑞杰.试论一种语音转换系统的设计与实现[J].中亚信息,2010(S1):61-63.
2. 左国玉,刘文举,阮晓钢.语音转换技术在电话语音识别中的应用研究(英文)[J].系统仿真学报,2005,17(2):448-452.
3. 左国玉,刘文举,阮晓钢.一种使用声调映射码本的汉语声音转换方法[J].数据采集与处理,2005,20(2):144-149. Cited by: 4
4. 符敏,程德福.支持向量回归在声音转换中的应用[J].电声技术,2006,30(3):45-48. Cited by: 1
5. 张晓洲,黄德智,蔡莲红.考虑帧间动态特征的音色变换算法[J].清华大学学报（自然科学版）,2006,46(10):1767-1770. Cited by: 1
6. 康永国,双志伟,陶建华,张维.基于混合映射模型的语音转换算法研究[J].声学学报,2006,31(6):555-562. Cited by: 13
7. 王海祥,戴蓓蒨,陆伟,张剑.基于共振峰参数和分类线性加权的源-目标声音转换[J].中国科学技术大学学报,2006,36(11):1153-1159.
8. 王海祥.基于RBF神经网络的源——目标话音转换[J].电子测量技术,2006,29(6):60-63.
9. 孙俊,戴蓓蒨,张剑.基于基元段特征和GMM的源-目标说话人F_0～t转换[J].信号处理,2007,23(2):283-287.
10. 王卉,王小军,马骏.基于CMOS工艺的音频前置放大器的设计与实现[J].电子器件,2007,30(3):870-873.

Co-cited documents (5)

1. 王薇,杨震.基于GMM的语音转换系统性能研究[C]//第十四届全国信号处理学术年会(CCSP-2009)论文集.湖南长沙:中国电子学会信号处理分会.2009:175-178.
2. 赵恒,李冬梅,张玉宏.MATLAB环境下的基于GMM模型的说话人识别系统[J].微计算机信息,2007,23(31):261-263. Cited by: 6
3. 张凯,朱立新,赵义正.基于重训练高斯混合模型的语音转换方法[J].声学技术,2010,29(1):52-55. Cited by: 4
4. 赵义正.改进GMM谱包络转换性能的语音转换算法研究[J].科学技术与工程,2010,10(17):4172-4174. Cited by: 3
5. 潘渊.声音转换及相关技术的研究[J].今日科苑,2010(22):113-113. Cited by: 1

Citing documents (1)

1. 徐欣,李枚亭.基于频谱包络算法的语音转换研究[J].数字技术与应用,2011,29(9):123-125. Cited by: 1

Secondary citing documents (1)

1. 王瑶,龙华,邵玉斌,杜庆治.可变时长的短时广播语音多语种识别[J].云南大学学报（自然科学版）,2022,44(3):490-496. Cited by: 2
