Abstract
This paper investigates multichannel speech enhancement in the graph frequency domain. Using graph signal processing (GSP) theory, it constructs a joint graph topology over the temporal and spatial dimensions and, on that basis, designs an enhancement algorithm for multichannel speech denoising. Specifically, a temporal graph topology is constructed from the temporal correlation of the speech vertex signals across input frames at one microphone of the array; meanwhile, for the multichannel noisy speech, a spatial graph topology is constructed from the spatial correlation of the signals received in each channel. On the joint graph topology formed by these two topologies, temporal and spatial, a minimum variance distortionless response (MVDR) enhancement algorithm in the graph frequency domain performs the multichannel speech enhancement. Simulation results show that the proposed joint graph topology-based MVDR (JG-MVDR) beamforming method outperforms both the conventional graph-based MVDR (GMVDR) beamforming method and the complex Gaussian mixture model-based MVDR (CGMM-MVDR) beamforming method in terms of average perceptual evaluation of speech quality (PESQ) score and average extended short-time objective intelligibility (ESTOI).
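The abstract does not give the construction details, so the following is only a minimal sketch of the ingredients it names, under loudly stated assumptions: the temporal graph is taken to be a simple path graph over the samples of a frame, the spatial graph uses absolute inter-channel correlation coefficients as edge weights, the joint graph is formed as a Cartesian product of the two, and the MVDR weights follow the textbook formula w = R⁻¹d / (dᴴR⁻¹d). None of these choices, nor the helper names (`path_graph`, `spatial_graph`, `joint_graph`, `gft_basis`, `mvdr_weights`), are taken from the paper; they are hypothetical illustrations.

```python
import numpy as np

def path_graph(n):
    """Undirected path graph over n time samples (assumed temporal topology)."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return A

def spatial_graph(X):
    """Channel graph weighted by |correlation coefficient| (assumed spatial topology).
    X has shape (channels, samples)."""
    C = np.abs(np.corrcoef(X))
    np.fill_diagonal(C, 0.0)  # no self-loops
    return C

def joint_graph(A_time, A_space):
    """Cartesian product graph; vertex index = channel * N + sample (an assumption)."""
    N, M = A_time.shape[0], A_space.shape[0]
    return np.kron(A_space, np.eye(N)) + np.kron(np.eye(M), A_time)

def gft_basis(A):
    """Eigenvectors of a symmetric adjacency matrix, used as a graph Fourier basis."""
    _, U = np.linalg.eigh(A)
    return U

def mvdr_weights(R_noise, d):
    """Textbook MVDR weights: w = R^{-1} d / (d^H R^{-1} d)."""
    Rinv_d = np.linalg.solve(R_noise, d)
    return Rinv_d / (d.conj() @ Rinv_d)

# Synthetic data: the same source reaches every channel, plus independent noise.
rng = np.random.default_rng(0)
M, N = 4, 8                               # channels, samples per frame
clean = rng.standard_normal(N)
noise = 0.5 * rng.standard_normal((M, N))
X = clean + noise                         # broadcasts clean over all channels

# Joint graph and graph Fourier transform of the stacked frame.
A_joint = joint_graph(path_graph(N), spatial_graph(X))
U = gft_basis(A_joint)
x_vec = X.reshape(-1)                     # channel-major stacking matches the kron order
x_hat = U.T @ x_vec                       # forward GFT; U @ x_hat inverts it

# MVDR across channels with an assumed all-ones steering vector.
d = np.ones(M)
R_noise = noise @ noise.T / N + 1e-3 * np.eye(M)  # diagonal loading for stability
w = mvdr_weights(R_noise, d)
y = w.conj() @ X                          # enhanced single-channel frame
```

Because the steering vector is assumed to be all ones, the distortionless constraint reduces to the weights summing to one, so the source component passes through `y` unchanged while the noise is averaged down. A real system would estimate the steering vector and the noise covariance from data rather than assume them.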
Authors
YANG Yang; GUO Haiyan; WANG Tingting; ZHANG Pengcheng; YANG Zhen (College of Communication & Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210003, China; National Local Joint Engineering Research Center for Communications and Network Technology, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210003, China)
Source
Journal of Signal Processing (《信号处理》)
CSCD
Peking University Core Journals (北大核心)
2023, No. 3, pp. 540-549 (10 pages)
Funding
National Natural Science Foundation of China (62071242)
Jiangsu Province Research and Practice Innovation Program (SJCX20_0245)
Keywords
speech enhancement
graph signal processing
multichannel
minimum variance distortionless response