A Voice Conversion Algorithm Considering Inter-frame Information
Abstract The traditional weighted frequency warping (WFW) algorithm converts the speech feature parameters of each frame independently and ignores the contextual information carried by neighboring frames. To address this, the paper proposes a modified weighted frequency warping (MWFW) algorithm that uses compressed sensing (CS) theory to extract the inter-frame correlation of the speech signal, so that conversion is in effect carried out segment by segment rather than frame by frame. Objective tests and subjective listening evaluations show that the performance of MWFW depends on the length of the speech segment, and that with an appropriately chosen segment length it outperforms traditional WFW.
Source Journal of Hangzhou Dianzi University (Natural Sciences), 2012, No. 4, pp. 33-36 (4 pages)
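Funding Natural Science Foundation of Zhejiang Province (Y1101040); Scientific Research Project of the Zhejiang Provincial Department of Education (Y201016542)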
Keywords voice conversion; compressed sensing; frequency warping; Gaussian mixture model (GMM)
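
The abstract contrasts frame-by-frame conversion with segment-by-segment conversion but does not say how the segments are built or how compressed sensing is applied. The following is only a minimal sketch of the general idea, not the paper's method: it groups consecutive feature frames into fixed-length segments and takes random Gaussian measurements along the time axis of one segment, which is a standard CS measurement step. The function names `split_into_segments` and `compress_segment`, the zero-padding of the last segment, and all sizes are assumptions made for illustration.

```python
import numpy as np


def split_into_segments(frames, seg_len):
    """Group consecutive frames (n_frames, dim) into segments.

    Returns an array of shape (n_segments, seg_len, dim). Trailing frames
    that do not fill a whole segment are zero-padded (an assumption).
    """
    n_frames, dim = frames.shape
    n_segments = -(-n_frames // seg_len)  # ceiling division
    padded = np.zeros((n_segments * seg_len, dim), dtype=frames.dtype)
    padded[:n_frames] = frames
    return padded.reshape(n_segments, seg_len, dim)


def compress_segment(segment, n_measurements, rng):
    """Take random Gaussian measurements across the frames of one segment.

    Each feature dimension is treated as a short trajectory over time and
    projected from seg_len samples down to n_measurements values.
    """
    seg_len = segment.shape[0]
    phi = rng.standard_normal((n_measurements, seg_len)) / np.sqrt(n_measurements)
    return phi @ segment  # shape (n_measurements, dim)


rng = np.random.default_rng(0)
frames = rng.standard_normal((10, 4))        # 10 frames of 4-dim features (toy data)
segments = split_into_segments(frames, seg_len=4)
measurements = compress_segment(segments[0], n_measurements=2, rng=rng)
```

With a segment length of 1 this reduces to frame-by-frame processing, so `seg_len` is the parameter the abstract reports the performance to depend on.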