Voice Conversion Based on Speaker Independent Model (基于话者无关模型的说话人转换方法)

Abstract: A voice conversion method based on a speaker-independent (SI) model is proposed. Because phoneme information is shared across the speech of all speakers, an SI space determined only by phoneme information is assumed to exist. A Gaussian mixture model (GMM) models the distribution of this SI space, and piecewise linear transformations describe the mapping between the SI space and each speaker-dependent (SD) space. The SI model is trained with the speaker adaptive training (SAT) algorithm on a multi-speaker database. In the conversion phase, the conversion function from the source space to the target space is built quickly and flexibly by joining the transformation from the source space to the SI space with the transformation from the SI space to the target space. Subjective listening tests comparing the proposed method with two representative conventional methods based on speaker-dependent models show its advantages.
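The joining step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each speaker has one affine transform per GMM component mapping speaker features into the SI space (y = A x + b), that covariances are diagonal, and that the per-component results are blended by component posteriors. The function names (`compose_affine`, `convert_frame`) and the posterior weighting are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def compose_affine(A_src, b_src, A_tgt, b_tgt):
    """Join source->SI (y = A_src x + b_src) with SI->target.

    SI->target is taken as the inverse of the target's speaker->SI
    transform, so x_tgt = A_tgt^{-1} (y - b_tgt). Returns (W, c) such
    that x_tgt = W x_src + c.
    """
    A_tgt_inv = np.linalg.inv(A_tgt)
    W = A_tgt_inv @ A_src
    c = A_tgt_inv @ (b_src - b_tgt)
    return W, c


def log_gauss_diag(y, mean, var):
    """Log density of a diagonal-covariance Gaussian at y."""
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (y - mean) ** 2 / var)


def convert_frame(x, weights, means, variances, src, tgt):
    """Convert one source frame x to the target speaker.

    weights, means, variances: parameters of the SI GMM (assumed diagonal).
    src, tgt: lists of (A, b) pairs, one per GMM component (hypothetical
    layout chosen for this sketch).
    """
    n_comp = len(weights)
    log_post = np.empty(n_comp)
    outputs = []
    for m in range(n_comp):
        A_s, b_s = src[m]
        A_t, b_t = tgt[m]
        y = A_s @ x + b_s  # source frame mapped into the SI space
        # Likelihood of x under component m, including the Jacobian
        # term log|det A_s| of the feature-space transform.
        _, logdet = np.linalg.slogdet(A_s)
        log_post[m] = (np.log(weights[m])
                       + log_gauss_diag(y, means[m], variances[m])
                       + logdet)
        W, c = compose_affine(A_s, b_s, A_t, b_t)
        outputs.append(W @ x + c)
    # Normalize posteriors in a numerically stable way.
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    return post @ np.array(outputs)
```

Under these assumptions, adding a new source or target speaker only requires estimating that speaker's transforms to the SI space; no parallel source-target training pair is needed, which is one way to read the "quickly and flexibly" claim in the abstract.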

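Authors: Chen Linghui (陈凌辉), Ling Zhenhua (凌震华), Dai Lirong (戴礼荣)

Affiliation: National Engineering Laboratory for Speech and Language Information Processing, University of Science and Technology of China

Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), indexed in EI, CSCD, and the Peking University Core Journals list; 2013, No. 3, pp. 254-259 (6 pages)

Funding: National Natural Science Foundation of China (No. 60905010)

Keywords: Voice Conversion, Speaker Independent Model, Gaussian Mixture Model, Speaker Adaptive Training

Classification: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]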

References (9)

[1] Abe M, Nakamura S, Shikano K, et al. Voice Conversion through Vector Quantization // Proc of the International Conference on Acoustics, Speech and Signal Processing. New York, USA, 1988: 655-658.
[2] Stylianou Y, Cappé O, Moulines E. Continuous Probabilistic Transform for Voice Conversion. IEEE Trans on Speech and Audio Processing, 1998, 6(2): 131-142.
[3] Kain A, Macon M W. Spectral Voice Conversion for Text-to-Speech Synthesis // Proc of the International Conference on Acoustics, Speech and Signal Processing. Seattle, USA, 1998: 285-288.
[4] Ye H, Young S. Quality-Enhanced Voice Morphing Using Maximum Likelihood Transformations. IEEE Trans on Audio, Speech and Language Processing, 2006, 14(4): 1301-1312.
[5] Toda T, Black A W, Tokuda K. Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory. IEEE Trans on Audio, Speech and Language Processing, 2007, 15(8): 2222-2235.
[6] Ohtani Y, Toda T, Saruwatari H, et al. Many-to-Many Eigenvoice Conversion with Reference Voice // Proc of Interspeech. Brighton, UK, 2009: 1623-1626.
[7] Hershey J R, Olsen P A. Approximating the Kullback-Leibler Divergence between Gaussian Mixture Models // Proc of the International Conference on Acoustics, Speech and Signal Processing. Hawaii, USA, 2007, IV: 317-320.
[8] Reynolds D A, Quatieri T F, Dunn R B. Speaker Verification Using Adapted Gaussian Mixture Models. Digital Signal Processing, 2000, 10(1/2/3): 19-41.
[9] Gales M J F. Maximum Likelihood Linear Transformations for HMM-Based Speech Recognition. Computer Speech and Language, 1998, 12(2): 75-98.
