Abstract
Convolutional encoder-decoder (CED) networks have difficulty capturing the temporal context of speech. To address this problem, this paper proposes a speech enhancement method based on a gated residual convolutional encoder-decoder network. Building on the CED network, the method introduces a gating mechanism, dilated convolution, and residual connections: the gating mechanism handles the contextual dependencies within a sequence; dilated convolution enlarges the receptive field of the convolution so that richer global information is extracted; and residual connections help prevent vanishing and exploding gradients and improve network accuracy. In addition, the network is trained with a strategy that jointly optimizes a frequency-domain loss function and a time-domain evaluation metric, which further improves enhancement performance. Experimental results show that, under both matched and mismatched noise conditions, the proposed method achieves higher PESQ, STOI, and SI-SDR than the baseline CED and the other comparison methods, recovers both unvoiced and voiced speech well, and generalizes well.
Authors
张天骐
柏浩钧
叶绍鹏
刘鉴兴
ZHANG Tianqi; BAI Haojun; YE Shaopeng; LIU Jianxing (School of Communication and Information Engineering, Chongqing Key Laboratory of Signal and Information Processing (CQKLS&IP), Chongqing University of Posts and Telecommunications (CQUPT), Chongqing 400065, China)
Source
《信号处理》
CSCD
PKU Core Journals (北大核心)
2021, No. 10, pp. 1986-1995 (10 pages)
Journal of Signal Processing
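Funding
National Natural Science Foundation of China (61671095, 61702065, 61701067, 61771085)
Construction Project of the Chongqing Municipal Key Laboratory of Signal and Information Processing (CSTC2009CA2003)
Chongqing Graduate Research and Innovation Project (CYS19248)
Scientific Research Project of the Chongqing Municipal Education Commission (KJ1600427, KJ1600429)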
Keywords
speech enhancement
gating mechanism
convolutional encoder-decoder network
residual connection