Abstract
The fully convolutional time-domain audio separation network (Conv-TasNet) is a mainstream end-to-end speech separation model proposed in recent years. Conv-TasNet uses dilated convolutions to enlarge the receptive field so that it can fuse more speech features spatially, which greatly improves the network's speech separation performance, but it ignores the differing importance of information across convolution channels. To address this, an end-to-end speech enhancement method based on ultra-lightweight channel attention is proposed. The method combines Conv-TasNet with channel attention and adds a group of filters to the Conv-TasNet encoder and decoder to strengthen the network's speech feature extraction, so that the convolutional neural network can combine spatial and channel information more effectively and improve speech enhancement. Experiments show that the proposed method effectively improves speech enhancement performance while increasing model capacity by only about 0.02%.
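The abstract does not say how the ultra-lightweight channel attention module is built. As an illustration only, the sketch below assumes an ECA-style gate: average each channel over time, mix neighbouring channel descriptors with a small 1-D kernel, pass the result through a sigmoid, and rescale the channels. The function name `channel_attention`, the kernel values, and the feature-map shape are hypothetical and are not taken from the paper.

```python
import numpy as np


def channel_attention(features, kernel):
    """Reweight the channels of a (channels, time) feature map.

    Hypothetical ECA-style gate used for illustration; the paper's
    actual module may differ.
    """
    features = np.asarray(features, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    if kernel.size % 2 == 0:
        raise ValueError("kernel length must be odd")
    # Squeeze: one descriptor per channel, averaged over the time axis.
    descriptor = features.mean(axis=1)
    # Local cross-channel interaction: slide the kernel along the channel
    # axis, zero-padding so the output keeps one value per channel.
    pad = kernel.size // 2
    padded = np.pad(descriptor, pad)
    mixed = np.correlate(padded, kernel, mode="valid")
    # Sigmoid gate in (0, 1), broadcast over time to rescale each channel.
    gate = 1.0 / (1.0 + np.exp(-mixed))
    return features * gate[:, None], gate


# Usage with made-up data: 8 channels, 100 time frames.
rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 100))
scaled, gate = channel_attention(feats, kernel=[0.25, 0.5, 0.25])
print(scaled.shape, gate.shape)
```

In this sketch the only learnable quantity would be the short kernel, which is why a gate of this kind can be added with very few extra parameters. The 0.02% figure in the abstract refers to the authors' model, not to this example.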
Authors
HONG Yi
SUN Chengli
LENG Yan
Affiliations: School of Information Engineering, Nanchang Hangkong University, Nanchang 330063, China; School of Physics and Electronic, Shandong Normal University, Jinan 250014, China
Source
Chinese Journal of Intelligent Science and Technology (《智能科学与技术学报》)
2021, No. 3, pp. 351-358 (8 pages)
Funding
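National Natural Science Foundation of China (No. 61861033)
Key Project of the Natural Science Foundation of Jiangxi Province (No. 20202ACBL202007)
Natural Science Foundation of Shandong Province (No. ZR2020MF020)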
Keywords
speech enhancement
end-to-end speech separation network
channel attention