Abstract
The naturalness of synthesized speech still needs improvement. Based on the current state of research, this paper proposes an objective method for evaluating the naturalness of concatenative speech synthesis. Starting from the main parameters of speech prosody, the method computes the distances in fundamental frequency (pitch), duration, and intensity between natural speech and synthesized speech from the same speaker. Because the fundamental frequency contours of the two utterances are not aligned in time, the Dynamic Time Warping (DTW) algorithm is used to time-align them. The computed results are then compared with subjective Mean Opinion Score (MOS) ratings. The experimental data show that the proposed fundamental frequency contour distortion measure correlates strongly with MOS, and that evaluation from the perspective of prosodic features can measure the naturalness of synthesized speech.
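The DTW alignment and the comparison with MOS described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's local distance, path constraints, or normalization, so everything below is an assumption: the function names `dtw_f0_distortion` and `pearson` are hypothetical, the local cost is the absolute F0 difference in Hz, the path uses the three standard steps, and the accumulated cost is averaged over the path length.

```python
import math


def dtw_f0_distortion(natural_f0, synthetic_f0):
    """Mean absolute F0 difference (Hz) along a minimum-cost DTW path.

    Assumed measure for illustration; not necessarily the paper's definition.
    """
    n, m = len(natural_f0), len(synthetic_f0)
    if n == 0 or m == 0:
        raise ValueError("both F0 contours must be non-empty")

    # cost[i][j]: minimal accumulated cost aligning the first i natural
    # frames with the first j synthetic frames; steps[i][j]: path length.
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    steps = [[0] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(natural_f0[i - 1] - synthetic_f0[j - 1])
            # Cheapest predecessor among vertical, horizontal, diagonal steps.
            prev_cost, prev_steps = min(
                (cost[i - 1][j], steps[i - 1][j]),
                (cost[i][j - 1], steps[i][j - 1]),
                (cost[i - 1][j - 1], steps[i - 1][j - 1]),
            )
            cost[i][j] = prev_cost + local
            steps[i][j] = prev_steps + 1

    return cost[n][m] / steps[n][m]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

In use, one would compute `dtw_f0_distortion` for each natural/synthesized utterance pair and then pass the list of distortions together with the corresponding MOS ratings to `pearson`. Averaging over the path length is one way to keep long and short utterances comparable; other normalizations are equally plausible.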
Source
《计算机工程与应用》
CSCD
Peking University Core Journal (北大核心)
2005, No. 7, pp. 32-33, 152 (3 pages in total)
Computer Engineering and Applications
Funding
National Natural Science Foundation of China (Grant No. 60275014)
Keywords
speech synthesis, evaluation, naturalness