
A voice conversion algorithm based on multi-spectral feature generative adversarial network

Cited by: 4
Abstract: Voice conversion is widely used in education, entertainment, healthcare, and other fields. To obtain high-quality converted speech, this paper proposes a voice conversion algorithm based on a multi-spectral feature generative adversarial network. A generative adversarial network converts the voiceprint images generated from spectral feature parameters, and a feature-level multimodal fusion technique lets the network learn multiple kinds of information from different feature domains. This improves the network's perception of speech signals and yields high-quality converted speech with good clarity and intelligibility. Experimental results show that the proposed algorithm clearly outperforms traditional algorithms on both subjective and objective evaluation metrics.
Authors: ZHANG Xiao (张筱); ZHANG Wei (张巍); WANG Wen-hao (王文浩); WAN Yong-jing (万永菁) (School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China)
Source: Computer Engineering & Science (《计算机工程与科学》), indexed in CSCD and the Peking University Core Journals list (北大核心), 2020, No. 5, pp. 893-901 (9 pages)
Keywords: voice conversion; voiceprint; generative adversarial network; multi-spectral feature; cross-domain reconstruction error
