An Algorithm for Voice Conversion Based on Mixtures of Linear Transformation

Cited by: 2

Abstract: This paper proposes a voice conversion algorithm based on mixtures of linear transformations that does not require a parallel training corpus, a requirement inherent in conventional approaches. Within a maximum likelihood framework, the EM algorithm is used to estimate the parameters of the conversion function. To reduce the smoothing of the spectral envelope caused by linear weighted averaging, the chirp Z-transform is used to adjust the LPC coefficients of the speech signal. Objective and subjective evaluations both show that the proposed approach effectively transforms speaker identity and achieves results comparable to those of conventional methods that need a parallel corpus.
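The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no formulas, so the code assumes the common posterior-weighted form F(x) = sum_i p(i|x) (A_i x + b_i) over a diagonal-covariance Gaussian mixture, and it assumes the chirp Z-transform step amounts to evaluating the LPC polynomial A(z) = 1 + sum_k a_k z^(-k) on a circle of radius r < 1 rather than on the unit circle. All function and parameter names are hypothetical, and the EM training of the transform parameters is not shown.

```python
import cmath
import math


def posteriors(x, weights, means, variances):
    """Return p(i | x) for a diagonal-covariance Gaussian mixture (assumed model)."""
    log_p = []
    for w, mu, var in zip(weights, means, variances):
        log_lik = sum(
            -0.5 * ((xi - m) ** 2 / v + math.log(2 * math.pi * v))
            for xi, m, v in zip(x, mu, var)
        )
        log_p.append(math.log(w) + log_lik)
    top = max(log_p)  # subtract the maximum for numerical stability
    p = [math.exp(lp - top) for lp in log_p]
    total = sum(p)
    return [pi / total for pi in p]


def convert_frame(x, weights, means, variances, A, b):
    """Posterior-weighted mixture of affine maps: sum_i p(i|x) (A_i x + b_i)."""
    post = posteriors(x, weights, means, variances)
    y = [0.0] * len(x)
    for p_i, A_i, b_i in zip(post, A, b):
        for row, (A_row, b_row) in enumerate(zip(A_i, b_i)):
            y[row] += p_i * (sum(a * xj for a, xj in zip(A_row, x)) + b_row)
    return y


def lpc_envelope(a, r=1.0, n_points=64):
    """Sample 1 / |A(z)| at z = r * exp(j*w) for w in [0, pi].

    a = [1, a_1, ..., a_p]. With r = 1 this is the ordinary LPC envelope.
    With r < 1 the contour moves closer to the poles, so formant peaks get
    sharper. r must stay larger than the largest pole radius.
    """
    envelope = []
    for n in range(n_points):
        w = math.pi * n / (n_points - 1)
        z = r * cmath.exp(1j * w)
        A_z = sum(a_k * z ** (-k) for k, a_k in enumerate(a))
        envelope.append(1.0 / abs(A_z))
    return envelope
```

For a single resonance with pole radius 0.9, calling `lpc_envelope(a, r=0.95)` gives a higher and narrower peak than `lpc_envelope(a, r=1.0)`, which is the kind of envelope sharpening the abstract attributes to the chirp Z-transform. Evaluating on the inner circle is equivalent to replacing each coefficient a_k with a_k * r^(-k).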

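Authors: Jian Zhihua (简志华), Yang Zhen (杨震)

Affiliation: Institute of Signal and Information Processing, Nanjing University of Posts and Telecommunications

Source: Journal of Electronics & Information Technology (《电子与信息学报》), 2007, No. 7, pp. 1700-1702 (3 pages). Indexed in EI, CSCD, and the Peking University Core Journals list.

Funding: Supported by the Jiangsu Province Qing Lan Project (QL003YZ)

Keywords: voice conversion; mixtures of linear transformation (Ms-LT); EM algorithm; chirp Z-transform

Classification: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]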


