Sub-Band Voice Morphing Algorithm Based on State-Space Model (cited by: 6)

Abstract: Voice morphing is a technique that modifies a source speaker's speech so that it sounds as if it were spoken by a designated target speaker. Transformations based on the Gaussian mixture model (GMM), combined with feature parameters extracted from the full band, are the mainstream approach. However, these methods often introduce artifacts and frame-to-frame discontinuities in the converted spectrum. To address this problem, a state-space model (SSM) is first used to describe the relationship between the source speech and the target speech in the spectral domain and to capture the dynamics of speech. The discrete wavelet transform (DWT) is then applied to decompose the speech signal into sub-bands, so that the low-frequency and high-frequency parameters are processed separately, in order to improve the quality of the converted speech. Finally, objective and subjective experiments are conducted to validate the effectiveness of the proposed method.
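The sub-band split mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which wavelet or how many decomposition levels the authors use, so the one-level Haar transform below is an assumption chosen only because it is the simplest DWT. The function names `haar_dwt` and `haar_idwt` and the test frame are hypothetical, and the per-band conversion step (the state-space mapping) is not shown.

```python
import math


def haar_dwt(signal):
    """One level of the Haar DWT: returns (low_band, high_band) coefficients."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    # Scaled pairwise sums form the low-frequency (approximation) band.
    low = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    # Scaled pairwise differences form the high-frequency (detail) band.
    high = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return low, high


def haar_idwt(low, high):
    """Inverse of haar_dwt: interleaves the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(low, high):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


# Hypothetical 16-sample frame: a slow component plus a fast component.
frame = [math.sin(2 * math.pi * 0.05 * n) + 0.3 * math.sin(2 * math.pi * 0.45 * n)
         for n in range(16)]

low, high = haar_dwt(frame)      # each band holds half as many coefficients
rebuilt = haar_idwt(low, high)   # recombining the bands recovers the frame
```

In a sub-band scheme of this kind, each band would be converted by its own mapping before the inverse transform recombines them. The sketch only shows that the split itself loses no information.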

Authors: Xu Ning (徐宁), Yang Zhen (杨震), Zhang Linghua (张玲华)

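Affiliations: Institute of Signal Processing and Transmission, Nanjing University of Posts and Telecommunications; College of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications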

Source: Acta Electronica Sinica (《电子学报》), indexed in EI, CAS, CSCD, and the Peking University Core Journals list; 2010, No. 3, pp. 646-653 (8 pages)

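Funding: National 863 Key Project (No.2006AA010102); National Natural Science Foundation of China (No.608721135 60971129); Jiangsu Province Graduate Research and Innovation Program for Regular Universities (No.CX08B_079Z)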

Keywords: voice morphing; Gaussian mixture model; state-space model; full-band conversion; sub-band conversion

Classification code: TN925 [Electronics and Telecommunications: Communication and Information Systems]





