Voice Conversion Based on Speaker Independent Model
Abstract: A voice conversion method based on a speaker-independent (SI) model is proposed. Because phoneme information is common to the speech of all speakers, an SI space that can be described by a Gaussian mixture model (GMM) is assumed to exist, and the mapping between the SI space and each speaker-dependent (SD) space is described by piecewise linear transformations. The SI model is trained with the speaker adaptive training (SAT) algorithm on a multi-speaker database. In the conversion phase, the conversion function from the source space to the target space is built quickly and flexibly by joining the transformation from the source space to the SI space with the transformation from the SI space to the target space. Subjective listening tests comparing the proposed method with two representative conventional methods based on SD models demonstrate its advantages.
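The abstract does not give the transform equations, so the following is only a minimal sketch of how "joining" the two transformations could look. It assumes each speaker has one affine transform per GMM component that maps SD features into the SI space (x_si = A x + b), and that the piecewise transforms are combined by posterior weighting. The function names, the transform direction, and the posterior weighting are all illustrative assumptions, not the authors' formulation.

```python
import numpy as np

def compose_affine(A_src, b_src, A_tgt, b_tgt):
    """Join source->SI with the inverse of target->SI (assumed forms).

    Assumes x_si = A_src @ x + b_src and y_si = A_tgt @ y + b_tgt,
    so the converted frame is y = A_tgt^{-1} (A_src @ x + b_src - b_tgt).
    """
    A = np.linalg.solve(A_tgt, A_src)          # A_tgt^{-1} A_src
    b = np.linalg.solve(A_tgt, b_src - b_tgt)  # A_tgt^{-1} (b_src - b_tgt)
    return A, b

def convert_frame(x, posteriors, src_transforms, tgt_transforms):
    """Posterior-weighted sum of the per-component joined transforms.

    `posteriors` are hypothetical component posteriors for frame x,
    supplied by the caller (for example from the SI GMM).
    """
    y = np.zeros_like(x, dtype=float)
    for w, (A_s, b_s), (A_t, b_t) in zip(posteriors, src_transforms, tgt_transforms):
        A, b = compose_affine(A_s, b_s, A_t, b_t)
        y += w * (A @ x + b)
    return y
```

Under these assumptions, adding a new source or target speaker only requires estimating that speaker's transforms to the SI space, which is what would make building a conversion pair quick and flexible.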
Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》; indexed in EI, CSCD, and the Peking University Core Journals list), 2013, No. 3, pp. 254-259 (6 pages)
Funding: National Natural Science Foundation of China (No. 60905010)
Keywords: Voice Conversion, Speaker Independent Model, Gaussian Mixture Model, Speaker Adaptive Training

