Abstract
Speech synthesis technology is increasingly mature, yet there is still no satisfactory objective system for evaluating the quality of synthesized speech. Building on the traditional subjective scoring system, this paper analyzes the key factors that affect synthesized speech quality and uses deep learning to build a quality evaluation system that objectively assesses the naturalness of synthesized Chinese speech. Compared with subjective scores assigned by human raters, the system's scores have a root-mean-square error (RMSE) of 0.4 points on a five-point scale and a correlation coefficient of 0.68.
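The abstract reports two agreement figures between the system's scores and human ratings: RMSE and a correlation coefficient. The sketch below shows how such figures are commonly computed. It assumes the correlation is Pearson's, which the abstract does not specify. The function names and the sample scores are hypothetical illustrations, not data or code from the paper.

```python
import math


def rmse(predicted, reference):
    """Root-mean-square error between two equal-length score lists."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("score lists must be non-empty and of equal length")
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(predicted))


def pearson(predicted, reference):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(predicted) != len(reference) or len(predicted) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_r = sum(reference) / n
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(predicted, reference))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_p == 0 or var_r == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_p * var_r)


# Hypothetical five-point naturalness scores (not from the paper).
subjective = [4.5, 3.0, 2.5, 4.0, 3.5]  # human ratings
objective = [4.1, 3.4, 2.9, 3.8, 3.0]   # system output

if __name__ == "__main__":
    print(f"RMSE: {rmse(objective, subjective):.2f}")
    print(f"Pearson r: {pearson(objective, subjective):.2f}")
```

A lower RMSE means the system's scores sit closer to the human ratings in absolute terms, while the correlation coefficient measures whether the two sets of scores rise and fall together regardless of any constant offset.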
Authors
汤梦
朱杰
TANG Meng; ZHU Jie (School of Electronic Information and Electrical Engineering, Shanghai Jiaotong University, Shanghai 200240, China)
Source
《信息技术》
2019, No. 5, pp. 41-44 (4 pages)
Information Technology
Funding
Key Special Project of the Ministry of Science and Technology (2017YFF0210903)
Keywords
synthetic speech
naturalness
objective evaluation
LSTM