Abstract
Speech synthesis plays an important role in human-machine interaction, and advances in deep learning have driven its rapid development. Deep-learning-based speech synthesis surpasses traditional speech synthesis in both the quality of the synthesized speech and the speed of synthesis. This paper reviews speech synthesis from the perspective of deep-learning-based vocoders and acoustic models, discussing the working principles, advantages, and disadvantages of each type. On this basis, it surveys the classic deep-learning-based speech synthesis systems and concludes with an outlook on deep-learning-based speech synthesis technology.
Authors
张小峰
谢钧
罗健欣
杨涛
ZHANG Xiaofeng; XIE Jun; LUO Jianxin; YANG Tao (Command & Control Engineering College, Army Engineering University of PLA, Nanjing 210007, China; Unit 31121 of PLA, China)
Source
《计算机工程与应用》
CSCD
PKU Core Journal (北大核心)
2021, Issue 9, pp. 50-59 (10 pages)
Computer Engineering and Applications
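Funding
National Ministries and Commissions Science and Technology Fund
Natural Science Foundation of Jiangsu Province, Youth Fund Project (BK20150722).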