
基于状态空间模型的子频带语音转换算法
Sub-Band Voice Morphing Algorithm Based on State-Space Model
Cited by: 6
Abstract  Voice morphing (voice conversion) is a technique that modifies a source speaker's speech so that it sounds as if it had been spoken by a designated target speaker. The mainstream approach, Gaussian mixture model (GMM) based mapping of full-band feature parameters, often introduces artifacts and inter-frame discontinuities in the converted spectrum. To address these problems, this paper first introduces a state-space model (SSM) to describe the dynamic relationship between the source and the target speech in the spectral domain, and then applies the discrete wavelet transform (DWT) to split the low-frequency and high-frequency parts of the spectral parameters into sub-bands that are processed separately, improving the quality of the converted speech. Finally, both objective and subjective experiments are conducted to validate the effectiveness of the proposed method.
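The record reproduces only the abstracts, not the paper's equations. As a point of reference for the state-space idea the abstract describes, a generic linear-Gaussian state-space model over per-frame spectral features can be written as below; this form is an illustrative assumption, not the paper's exact parameterization:

x_t = A\,x_{t-1} + B\,u_t + w_t, \qquad w_t \sim \mathcal{N}(0, Q)
y_t = C\,x_t + v_t, \qquad v_t \sim \mathcal{N}(0, R)

Here u_t stands for the source-speaker spectral feature vector of frame t, y_t for the corresponding target-speaker features, and the hidden state x_t couples consecutive frames through the transition matrix A, which is what allows such a model to smooth the inter-frame discontinuities that a frame-by-frame GMM mapping leaves behind.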
Source: Acta Electronica Sinica (《电子学报》; indexed in EI, CAS, CSCD; Peking University Core Journal), 2010, No. 3: 646-653 (8 pages)
Funding: National 863 Key Program of China (No. 2006AA010102); National Natural Science Foundation of China (Nos. 608721135, 60971129); Jiangsu Province Postgraduate Research and Innovation Program (No. CX08B_079Z)
Keywords: voice morphing; Gaussian mixture model; state-space model; full-band conversion; sub-band conversion
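To complement the abstract's sub-band step, the sketch below shows how per-frame spectral parameters could be split into low- and high-frequency bands with a one-level DWT, converted band by band, and recombined. It is a minimal illustration under stated assumptions: the PyWavelets library, the 'db4' wavelet, and the placeholder mapping functions are not taken from the paper, whose trained SSM/GMM converters are not reproduced in this record.

# Minimal sub-band processing sketch (assumptions: PyWavelets, 'db4' wavelet,
# placeholder per-band converters standing in for the paper's trained mappings).
import numpy as np
import pywt

def convert_frame(envelope, convert_low, convert_high, wavelet="db4"):
    """Split one spectral-parameter frame into sub-bands, map each band
    separately, and reconstruct the converted frame."""
    # One-level DWT: cA = low-frequency (approximation) coefficients,
    # cD = high-frequency (detail) coefficients.
    cA, cD = pywt.dwt(envelope, wavelet)
    # Convert each sub-band with its own mapping function.
    cA_conv = convert_low(cA)
    cD_conv = convert_high(cD)
    # Inverse DWT reassembles the full-band converted frame.
    return pywt.idwt(cA_conv, cD_conv, wavelet)

if __name__ == "__main__":
    frame = np.random.randn(64)          # stand-in for one spectral frame
    identity = lambda band: band         # placeholder per-band converters
    converted = convert_frame(frame, identity, identity)
    print(converted.shape)               # (64,): same length as the input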

References (29)

  • 1 ABE M, NAKAMURA S, SHIKANO K, KUWABARA H. Voice conversion through vector quantization[A]. Proceedings of International Conference on Acoustics, Speech, and Signal Processing[C]. New York: IEEE Press, 1988. 655-658.
  • 2 SHIKANO K, NAKAMURA S, ABE M. Speaker adaptation and voice conversion by codebook mapping[A]. Proceedings of IEEE International Symposium on Circuits and Systems[C]. New York: IEEE Press, 1991. 594-597.
  • 3 GUOYU Zuo, WENJU Liu, XIAOGANG Ruan. Genetic algorithm based RBF neural network for voice conversion[A]. Proceedings of IEEE World Congress on Intelligent Control and Automation[C]. New York: IEEE Press, 2004. 4215-4218.
  • 4 IWAHASHI N, SAGISAKA Y. Speech spectrum conversion based on speaker interpolation and multi-functional representation with weighting by radial basis function networks[J]. Speech Communication, 1995, 16(2): 139-151.
  • 5 STYLIANOU Y, CAPPE O. Continuous probabilistic transform for voice conversion[J]. IEEE Transactions on Speech and Audio Processing, 1998, 6(2): 131-142.
  • 6 KAIN A. High Resolution Voice Transformation[D]. Portland: Oregon Health and Sci Univ, 2001.
  • 7 HUI Ye, STEVE Young. Perceptually weighted linear transformations for voice conversion[J]. Eurospeech, 2003, 8(2): 2409-2412.
  • 8 KUN Liu. High quality voice conversion through combining modified GMM and formant mapping for Mandarin[A]. Proceedings of International Conference on Digital Telecommunications[C]. New York: IEEE Press, 2007. 1038-1042.
  • 9 HIDEYUKI M, MASANOBU A. Voice conversion algorithm based on piecewise linear conversion rules of formant frequency and spectrum tilt[J]. Speech Communication, 1995, 16(2): 153-164.
  • 10 KNAGENHJELM H, KLEIJN W. Spectral dynamics is more important than spectral distortion[A]. Proceedings of International Conference on Acoustics, Speech, and Signal Processing[C]. New York: IEEE Press, 1995. 732-735.

Co-cited Literature (50)

  • 1 双志伟, 张世磊, 秦勇. Voice conversion analysis and similarity improvement[J]. Journal of Tsinghua University (Science and Technology), 2009(S1): 1408-1412. Cited by: 3
  • 2 康永国, 双志伟, 陶建华, 张维. Research on a voice conversion algorithm based on a mixed mapping model[J]. Acta Acustica, 2006, 31(6): 555-562. Cited by: 13
  • 3 Stylianou Y. Voice transformation: a survey[C]// IEEE International Conference on Acoustics, Speech and Signal Processing. China: IEEE, 2009: 3585-3588.
  • 4 Abe M, Nakamura S, Shikano K, et al. Voice conversion through vector quantization[C]// IEEE International Conference on Acoustics, Speech and Signal Processing. Seattle, Washington: IEEE, 1988: 655-658.
  • 5 Stylianou Y, Cappe O, Moulines E. Continuous probabilistic transform for voice conversion[J]. IEEE Transactions on Speech and Audio Processing, 1998, 6(2): 131-142.
  • 6 Yamagishi J, Kobayashi T, Nakano Y, et al. Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm[J]. IEEE Transactions on Audio, Speech and Language Processing, 2009, 17(1): 66-83.
  • 7 Erro D, Moreno A, Bonafonte A. Voice conversion based on weighted frequency warping[J]. IEEE Transactions on Audio, Speech and Language Processing, 2010, 18(5): 922-931.
  • 8 Desai S, Black A W, Yegnanarayana B, et al. Spectral mapping using artificial neural networks for voice conversion[J]. IEEE Transactions on Audio, Speech and Language Processing, 2010, 18(5): 954-964.
  • 9 Duxans H, Bonafonte A, Kain A, et al. Including dynamic and phonetic information in voice conversion systems[C]// 8th International Conference on Spoken Language Processing. Jeju Island, Korea: [s.n.], 2004: 5-8.
  • 10 Toda T, Black A W, Tokuda K. Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory[J]. IEEE Transactions on Audio, Speech and Language Processing, 2007, 15(8): 2222-2235.

Citing Literature (6)

Secondary Citing Literature (24)
