
A Highly Efficient Voice Conversion Method Based on Codebook Mapping

Cited by: 2
Abstract: In human-robot voice interaction, it is desirable for synthetic voices to sound natural and to be personalized for each user. One feasible solution is voice conversion, in which the individual characteristics of a source (mechanical) voice are changed to produce speech that matches a natural target speaker. However, the mainstream voice conversion methods are computationally intensive and are not suitable for embedded robots, which have strict real-time requirements and small processing cores. This paper introduces a highly efficient, improved segmental codebook conversion method. First, the method computes weighting coefficients by matching the relative distances of Line Spectral Frequency (LSF) parameters, and uses them to predict and reconstruct the codeword. Second, it applies a bandwidth correction to the predicted codeword to overcome the spectral shift problem. Experimental results show that, at comparable conversion performance, the method reduces running time by about 75% relative to the traditional Gaussian Mixture Model (GMM) method.
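The abstract gives no formulas, so the following is only a minimal sketch of generic weighted codebook mapping, not the paper's actual algorithm. It assumes paired source and target codebooks of LSF vectors, Euclidean distance, and exponentially decaying weights controlled by a hypothetical parameter `gamma`. The function names are illustrative, and the bandwidth correction step is omitted because the abstract does not describe it.

```python
import math


def codebook_weights(x, source_codebook, gamma=1.0):
    """Return normalized weights that decay with the distance from x
    to each source codeword (assumed weighting form, not from the paper)."""
    dists = [math.dist(x, c) for c in source_codebook]
    d_min = min(dists)  # subtract the minimum for numerical stability
    scores = [math.exp(-gamma * (d - d_min)) for d in dists]
    total = sum(scores)
    return [s / total for s in scores]


def convert_frame(x, source_codebook, target_codebook, gamma=1.0):
    """Predict a target LSF vector as the weighted sum of target codewords,
    using weights computed against the paired source codewords."""
    weights = codebook_weights(x, source_codebook, gamma)
    dim = len(target_codebook[0])
    return [
        sum(w * c[k] for w, c in zip(weights, target_codebook))
        for k in range(dim)
    ]


# Hypothetical two-entry codebooks with two-dimensional LSF vectors.
source_cb = [[0.1, 0.5], [0.3, 0.9]]
target_cb = [[0.2, 0.6], [0.4, 1.0]]
converted = convert_frame([0.2, 0.7], source_cb, target_cb)
```

In this sketch, a larger `gamma` pushes the result toward the single nearest codeword, while a smaller value blends more codewords together.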
Source: Microprocessors (《微处理机》), 2014, No. 1, pp. 65-69 (5 pages)
Funding: National Natural Science Foundation of China (60905060); Central Universities Basic Research Project (2011B11114, 2012B07314, 2012B04014); Open Fund of a Key Laboratory of the Ministry of Education (NYKL201305)
Keywords: voice conversion; embedded systems; harmonic stochastic model; segmental codebook; human-machine interaction