
A speech conversion method based on the separation of speaker-specific characteristics (基于语音个人特征信息分离的语音转换方法研究)
Cited by: 4
Abstract: Building on a study of how speaker-specific information in speech can be effectively represented, this paper proposes, from the perspective of information separation, a new voice conversion method that works by separating and replacing speaker-specific information. The method relies on the sparsity of speech and on K-means singular value decomposition (K-SVD). Because K-SVD dictionary training preserves the speaker-specific information in the speech signal well, the trained dictionaries can be used to separate out the speaker-specific information, replace it with that of the target speaker, and recombine it with the linguistic content to reconstruct the target speech. Compared with conventional approaches, the method makes better use of the sparsity of speech to preserve speaker-specific information, and thereby avoids the low speaker similarity and degraded speech quality caused by parameter mapping. Simulation experiments and subjective evaluations show that, compared with voice conversion methods based on Gaussian mixture models (GMM) and artificial neural networks (ANN), the proposed method achieves better converted-speech quality, higher conversion similarity, and stronger noise robustness.
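To make the abstract's pipeline concrete (per-speaker dictionaries, sparse codes carrying the linguistic content, reconstruction with the target speaker's dictionary), here is a minimal sketch. It is an illustration under stated assumptions, not the paper's implementation: the frame representation, the coupled-dictionary construction (one joint dictionary learned over stacked, time-aligned source/target frames and then split), and the use of scikit-learn's DictionaryLearning with OMP sparse coding as a stand-in for K-SVD are all choices made for this sketch, and the function names train_coupled_dictionaries and convert are hypothetical.

```python
# Illustrative sketch only; NOT the paper's implementation. Assumptions:
# spectral frames are precomputed and time-aligned across parallel source/target
# training data (e.g. by DTW), and scikit-learn's DictionaryLearning with OMP
# stands in for K-SVD. Frame matrices are shaped (dim, n_frames).
import numpy as np
from sklearn.decomposition import DictionaryLearning, SparseCoder


def train_coupled_dictionaries(src_frames, tgt_frames, n_atoms=64, sparsity=4):
    """Learn one joint dictionary over stacked (source; target) aligned frames,
    then split it into a source half and a target half that share atom indices."""
    stacked = np.vstack([src_frames, tgt_frames])             # (2*dim, n_frames)
    learner = DictionaryLearning(n_components=n_atoms,
                                 fit_algorithm="lars",
                                 transform_algorithm="omp",
                                 transform_n_nonzero_coefs=sparsity,
                                 max_iter=50)
    learner.fit(stacked.T)                                    # sklearn expects (n_samples, n_features)
    D = learner.components_.T                                 # (2*dim, n_atoms)
    dim = src_frames.shape[0]
    return D[:dim], D[dim:]                                   # D_src, D_tgt


def convert(frames, D_src, D_tgt, sparsity=4):
    """Sparse-code source frames over D_src (the codes play the role of the
    linguistic content), then rebuild with D_tgt so that only the dictionary,
    i.e. the speaker-specific part, is swapped."""
    coder = SparseCoder(dictionary=D_src.T,                   # (n_atoms, dim)
                        transform_algorithm="omp",
                        transform_n_nonzero_coefs=sparsity)
    codes = coder.transform(frames.T)                         # (n_frames, n_atoms)
    return D_tgt @ codes.T                                    # converted frames, (dim, n_frames)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim, n_frames = 24, 400                                   # e.g. 24-dim spectral envelopes
    src = rng.standard_normal((dim, n_frames))                # placeholder for aligned source frames
    tgt = rng.standard_normal((dim, n_frames))                # placeholder for aligned target frames
    D_src, D_tgt = train_coupled_dictionaries(src, tgt)
    print(convert(src[:, :10], D_src, D_tgt).shape)           # (24, 10)
```

Under these assumptions, conversion reduces to sparse-coding each source frame and multiplying the unchanged codes by the target half of the dictionary; waveform resynthesis from the converted spectral frames is outside the scope of the sketch.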
Source: Journal of Signal Processing (《信号处理》; CSCD; Peking University core journal), 2013, No. 4: 513-519 (7 pages).
Funding: Natural Science Foundation of Jiangsu Province (BK2012510); pre-research project of the PLA University of Science and Technology (20110211).
Keywords: voice conversion; speaker-specific characteristics; information separation; K-SVD
Related literature

References (3)

Secondary references (29)

1. Shuang Zhiwei, Zhang Shilei, Qin Yong. Analysis of voice conversion and similarity improvement [J]. Journal of Tsinghua University (Science and Technology), 2009(S1): 1408-1412. Cited by: 3.
2. Stylianou Y. Voice transformation: a survey [C]// IEEE International Conference on Acoustics, Speech and Signal Processing. China: IEEE, 2009: 3585-3588.
3. Abe M, Nakamura S, Shikano K, et al. Voice conversion through vector quantization [C]// IEEE International Conference on Acoustics, Speech and Signal Processing. Seattle, Washington: IEEE, 1988: 655-658.
4. Stylianou Y, Cappe O, Moulines E. Continuous probabilistic transform for voice conversion [J]. IEEE Transactions on Speech and Audio Processing, 1998, 6(2): 131-142.
5. Yamagishi J, Kobayashi T, Nakano Y, et al. Analysis of speaker adaptation algorithms for HMM-based speech synthesis and a constrained SMAPLR adaptation algorithm [J]. IEEE Transactions on Audio, Speech and Language Processing, 2009, 17(1): 66-83.
6. Erro D, Moreno A, Bonafonte A. Voice conversion based on weighted frequency warping [J]. IEEE Transactions on Audio, Speech and Language Processing, 2010, 18(5): 922-931.
7. Desai S, Black A W, Yegnanarayana B, et al. Spectral mapping using artificial neural networks for voice conversion [J]. IEEE Transactions on Audio, Speech and Language Processing, 2010, 18(5): 954-964.
8. Duxans H, Bonafonte A, Kain A, et al. Including dynamic and phonetic information in voice conversion systems [C]// 8th International Conference on Spoken Language Processing. Jeju Island, Korea: [s.n.], 2004: 5-8.
9. Toda T, Black A W, Tokuda K. Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory [J]. IEEE Transactions on Audio, Speech and Language Processing, 2007, 15(8): 2222-2235.
10. Zen H, Nankaku Y, Tokuda K. Continuous stochastic feature mapping based on trajectory HMMs [J]. IEEE Transactions on Audio, Speech and Language Processing, 2011, 19(2): 417-430.

Co-citing literature (20)

Co-cited literature (22)

Citing literature (4)

Secondary citing literature (12)
