Voice Conversion Based on Linear Prediction Model with Sinusoidal Excitation (cited by: 2)

Abstract: Building on linear prediction (LP) residual conversion with a sinusoidal excitation model, this paper proposes a voice conversion method that improves the conversion of speaker characteristics. Within an LP analysis/synthesis framework, the method extracts the source speaker's LP coding (LPC) cepstral envelope with the spectral envelope estimation vocoder (SEEVOC) and converts that envelope with a bilinear transform function. In parallel, the LP residual signal is modeled and decomposed by a harmonic sinusoidal model, and a pitch (fundamental frequency) transformation converts the source speaker's residual into an approximation of the target speaker's residual. Finally, the modified residual excites a time-varying filter to produce the converted speech; the filter parameters are updated in real time from the converted LPC cepstral envelope. Experimental results indicate that the method performs well in both subjective and objective tests, converts speaker characteristics effectively, and yields converted speech of high similarity.
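The LP analysis/synthesis loop that the abstract describes can be illustrated with a minimal sketch. It is not the authors' implementation: it omits SEEVOC and the bilinear transform of the cepstral envelope, and it replaces the harmonic sinusoidal residual model with a crude equal-amplitude harmonic excitation. All function names, the test filter `true_a`, the frame length, and the pitch ratio of 1.5 are assumptions chosen for illustration only.

```python
import math

def autocorr(x, max_lag):
    """Unnormalized autocorrelation r[0..max_lag] of one frame."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the LP normal equations; returns a = [1, a1, ..., ap] for A(z)."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)                # prediction error energy
    return a

def inverse_filter(x, a):
    """Analysis filter A(z): residual e[n] = sum_k a[k] * x[n-k]."""
    return [sum(a[k] * x[n - k] for k in range(len(a)) if n - k >= 0)
            for n in range(len(x))]

def synthesis_filter(e, a):
    """All-pole filter 1/A(z): y[n] = e[n] - sum_{k>=1} a[k] * y[n-k]."""
    y = []
    for n in range(len(e)):
        acc = e[n]
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc)
    return y

def harmonic_excitation(f0, fs, n_samples, n_harmonics):
    """Equal-amplitude harmonics of f0 (hypothetical stand-in for the residual model)."""
    return [sum(math.cos(2 * math.pi * h * f0 * n / fs)
                for h in range(1, n_harmonics + 1)) / n_harmonics
            for n in range(n_samples)]

def energy(x):
    return sum(v * v for v in x)

# Synthetic "source" frame: a harmonic excitation shaped by an assumed all-pole filter.
fs, f0_src, n_samples, order = 8000, 120.0, 400, 4
true_a = [1.0, -1.3, 0.8]                   # assumed stable test filter
frame = synthesis_filter(harmonic_excitation(f0_src, fs, n_samples, 10), true_a)

# Analysis: estimate the LP envelope and extract the residual.
a = levinson_durbin(autocorr(frame, order), order)
residual = inverse_filter(frame, a)

# Synthesis with the unmodified residual recovers the frame.
rebuilt = synthesis_filter(residual, a)

# Pitch change: excite the same envelope with harmonics of a scaled f0.
pitch_ratio = 1.5                           # hypothetical source-to-target ratio
new_exc = harmonic_excitation(f0_src * pitch_ratio, fs, n_samples, 10)
gain = math.sqrt(energy(residual) / energy(new_exc))
converted = synthesis_filter([gain * v for v in new_exc], a)
```

In this sketch the envelope `a` is left unchanged; in the method described above, the filter coefficients would instead come from the converted LPC cepstral envelope and would be updated frame by frame.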

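Authors: Yin Wei, Yi Benshun

Affiliation: School of Electronic Information, Wuhan University

Source: Journal of Data Acquisition and Processing (《数据采集与处理》), indexed in CSCD and the Peking University Core Journals list, 2010, No. 2, pp. 218-222 (5 pages)

Keywords: voice conversion; sinusoidal model; linear prediction; residual analysis

Classification: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]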
