Abstract
Traditional parametric resynthesis speech enhancement algorithms perform poorly because they predict from a single acoustic feature and synthesize speech with a non-neural vocoder. To address this problem, a parametric resynthesis speech enhancement algorithm based on multi-feature fusion is proposed. The algorithm uses an attention mechanism to fuse several acoustic features, and the fused feature replaces the single feature when predicting the acoustic features of clean speech. On this basis, the WaveNet neural vocoder is used to synthesize high-quality clean speech. Experiments on the TIMIT and NOISEX-92 corpora show that the proposed algorithm achieves better enhancement than the comparison methods, with improvements in both speech quality and speech intelligibility.
Authors
郑晨颖 (ZHENG Chen-ying)
马建芬 (MA Jian-fen)
张朝霞 (ZHANG Chao-xia)
Affiliations: College of Information and Computer, Taiyuan University of Technology, Jinzhong 030600, China; College of Physics and Optoelectronics, Taiyuan University of Technology, Jinzhong 030600, China
Source
Computer Engineering and Design (《计算机工程与设计》)
Peking University Core Journal (北大核心)
2023, No. 8, pp. 2367-2373 (7 pages)
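Funding
Shanxi Province Key Research and Development Program (High-Tech Field) project (201803D121057)
Shanxi Province Scientific Research Foundation for Returned Overseas Scholars (2017-031)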