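Abstract
Voice conversion is a technique for altering the characteristics of a speaker's voice. The mainstream approach in this field, full-band parameter mapping based on the Gaussian mixture model, introduces frame-to-frame discontinuities in the spectrum of the converted speech. This paper proposes two improvements to address this problem: first, a state-space model is introduced to capture the dynamic characteristics of speech; second, the discrete wavelet transform is used to split the parameters of the low-frequency and high-frequency parts of the speech into sub-bands for separate processing. Finally, subjective and objective experiments are carried out to simulate and validate the proposed algorithm.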
Voice morphing is a technique to modify a source speaker's speech to sound as if it were spoken by some designated target speaker. Gaussian mixture model (GMM) based transformations combined with full-band extracted feature parameters have been commonly studied. However, these methods often introduce problems such as artifacts and discontinuities. To resolve these problems, a state-space model (SSM) is first used to describe the relationship between the source speech and the target speech in the spectral domain. Then the discrete wavelet transform (DWT) is applied to decompose speech signals into sub-bands in order to improve the quality of the converted speech. Finally, experiments using both objective and subjective measurements are conducted to validate the effectiveness of the proposed method.
Source
《电子学报》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2010, No. 3, pp. 646-653 (8 pages)
Acta Electronica Sinica
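Funding
National 863 Key Program of China (No. 2006AA010102)
National Natural Science Foundation of China (No. 608721135, 60971129)
Jiangsu Province Graduate Research and Innovation Program for Regular Institutions of Higher Education (No. CX08B_079Z)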
Keywords
voice morphing
Gaussian mixture model
state-space model
full-band conversion
sub-band conversion