
Beamforming post-filter network based on a cross-attention mechanism
Abstract: Classical post-filters suppress non-stationary noise poorly and distort the target speech. To address these problems, this paper proposes a post-filter network based on a cross-attention mechanism. The network uses gated recurrent unit (GRU) based encoder-decoder groups as its framework, with residual (skip) connections added between the groups. Its dual input features are the subband gains, computed in the Gammatone domain, of the power spectra of the beamformer output signal and the noise reference signal. A feature-crossed multi-head scaled dot-product attention captures the long-range dependencies of the input sequences and fuses the two features. Experimental results show that the proposed algorithm outperforms the baseline systems on speech quality and intelligibility metrics under different signal-to-interference ratios and noise conditions, indicating strong robustness. It suppresses non-stationary noise well while minimizing distortion of the target speech. Compared with end-to-end deep learning methods, it is lightweight and has low latency, which meets the needs of practical engineering applications.
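The abstract does not give the network's equations, so the following is only a minimal sketch of the general technique it names: multi-head scaled dot-product attention applied across two feature streams. It assumes queries come from one stream (for example the beam-output features) and keys and values from the other (for example the noise-reference features). The function names, the channel-splitting scheme, and the absence of learned projection matrices are illustrative assumptions, not the authors' implementation.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.

    queries: list of frames (each a list of d floats) from stream A.
    keys, values: lists of frames from stream B (keys also d floats each).
    """
    d = len(queries[0])
    out = []
    for q in queries:
        # Similarity of this query frame to every key frame, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value frames.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


def multi_head_cross_attention(x, y, num_heads):
    """Hypothetical multi-head variant without learned projections.

    Channels are split evenly into heads; each head attends from x (queries)
    to y (keys and values), and the head outputs are concatenated per frame.
    """
    d = len(x[0])
    assert d % num_heads == 0, "feature size must be divisible by num_heads"
    h = d // num_heads
    heads = []
    for i in range(num_heads):
        sl = slice(i * h, (i + 1) * h)
        xq = [frame[sl] for frame in x]
        yk = [frame[sl] for frame in y]
        heads.append(cross_attention(xq, yk, yk))
    return [sum((head[t] for head in heads), []) for t in range(len(x))]
```

A "crossed" arrangement could call `multi_head_cross_attention(x, y, n)` and `multi_head_cross_attention(y, x, n)` and combine the two outputs, but how the paper combines them is not stated in the abstract.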
Authors: Liu Zhuo (刘卓); Fu Zhonghua (付中华) (Xi'an Iflytek Super-Brain Information Technology Co., Ltd., Xi'an 710076, China; School of Computer Science, Northwestern Polytechnical University, Xi'an 710129, China)
Source: Application Research of Computers (《计算机应用研究》), indexed in CSCD and the Peking University Core Journals list, 2022, No. 5, pp. 1444-1448 (5 pages)
Funding: Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2018AAA0103100).
Keywords: beamforming; post-filter; cross-attention mechanism; encoder-decoder; Gammatone domain; feature fusion