Abstract
Traditional speech enhancement algorithms usually process the magnitude spectrum of noisy speech and reconstruct the enhanced speech using the phase of the original noisy speech. To improve the speech quality and real-time performance of existing algorithms in real environments, this paper proposes a dilated convolutional network based on complex-valued masking for real-time enhancement of noisy speech. Experimental results show that the proposed method significantly improves speech intelligibility and quality while maintaining real-time performance. Compared with the baseline model RNNoise, PESQ and STOI increase by 0.32 and 0.09, respectively.
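The difference between magnitude-only processing with the noisy phase and complex-valued masking can be illustrated on a single time-frequency bin. The following is a minimal sketch, not the authors' network: it uses ideal (oracle) masks computed from a known clean value, whereas the paper's model would have to estimate the mask. The function names and the toy values are assumptions made for illustration.

```python
import cmath


def magnitude_mask_enhance(noisy_bin, real_mask):
    """Scale the noisy magnitude by a real-valued mask and reuse the noisy phase."""
    magnitude = abs(noisy_bin) * real_mask
    phase = cmath.phase(noisy_bin)  # phase is left untouched
    return cmath.rect(magnitude, phase)


def complex_mask_enhance(noisy_bin, complex_mask):
    """Multiply by a complex-valued mask, which adjusts magnitude and phase together."""
    return noisy_bin * complex_mask


# Toy single-bin example (hypothetical values): the noise is 90 degrees out of phase.
clean = complex(1.0, 0.0)
noise = complex(0.0, 1.0)
noisy = clean + noise

# Oracle masks, available here only because the clean value is known.
ideal_real_mask = abs(clean) / abs(noisy)
ideal_complex_mask = clean / noisy

mag_estimate = magnitude_mask_enhance(noisy, ideal_real_mask)
cplx_estimate = complex_mask_enhance(noisy, ideal_complex_mask)

# Residual error of each estimate relative to the clean bin.
mag_error = abs(mag_estimate - clean)
cplx_error = abs(cplx_estimate - clean)

print("magnitude mask + noisy phase error:", mag_error)
print("complex mask error:", cplx_error)
```

In this toy case the magnitude mask restores the correct magnitude but keeps the phase error, while the complex mask removes both, which is the motivation for masking in the complex domain.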
Authors
朱明
孙世若
ZHU Ming; SUN Shiruo (School of Information Engineering, Yancheng Institute of Technology, Yancheng, Jiangsu 224051, China; School of Information Science and Engineering, Southeast University, Nanjing, Jiangsu 210096, China)
Source
《电子器件》
CAS
Peking University Core Journals (北大核心)
2021, Issue 3, pp. 612-615 (4 pages)
Chinese Journal of Electron Devices
Funding
National Natural Science Foundation of China (61673108).
Keywords
speech enhancement
dilated convolutions
gated linear units
phase