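Abstract
To improve the performance of conventional direction of arrival (DOA) estimation methods in the presence of noise, reverberation, and source motion, this paper proposes a real-time DOA estimation and tracking method for a single sound source based on Kalman filtering and frequency focusing. The method consists of three steps: denoising, dereverberation, and DOA estimation. The objective functions of the denoising and dereverberation steps are established by minimizing the error of the denoised signal and the error of the multichannel linear prediction coefficients, respectively, and each is solved with a Kalman filter; the DOA estimation step is implemented with frequency-focusing-based steered response power. The proposed method builds on a propagation matrix that integrates the dereverberation and denoising steps; the DOA estimation step is further integrated through the prior estimate of the desired signal obtained by beamforming, which promotes a causal and orderly iteration among the three steps. Experimental results show that the proposed method outperforms the reference methods in DOA estimation and tracking.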
[Objective] Estimation of the direction of arrival (DOA) is critical in spatial audio coding, speech enhancement, sound field synthesis, and sound source imaging. Commonly used signal-model-based DOA estimation methods, such as multiple signal classification, can estimate the DOA effectively in noise-free, anechoic scenarios. However, real-world environments always contain noise and reverberation, particularly far-field speech communication scenarios, which are characterized by low signal-to-noise ratios (SNRs) and strong reverberation. Furthermore, the sound source may be in motion. These factors considerably impair the performance of DOA estimation methods based on signal models. To address this issue, this paper introduces a real-time estimation and tracking method for the DOA of a single sound source, using Kalman filtering and frequency focusing.
[Methods] The proposed method consists of three procedures: denoising, dereverberation, and DOA estimation. For the denoising procedure, an objective function that minimizes the error of the denoised signal is established. This function is solved with a Kalman filter, and the denoised signal is obtained through Kalman-gain-based posterior estimation. For the dereverberation procedure, an objective function that minimizes the error of the multichannel linear prediction (MCLP) coefficients is established, based on the autoregressive coefficients of the late reverberation components. This function is solved with a second Kalman filter to obtain the MCLP coefficients. The DOA estimation procedure is implemented with a frequency-focusing-based steered response power (FF-SRP) method, which circumvents signal component diffusion within subspace decomposition. In particular, a structure is used that effectively intertwines these three procedures, enhancing the contribution of the denoising and dereverberation results to DOA estimation. In this structure, a propagation matrix integrates the denoising and dereverberation procedures, creating a causal iteration between them. Subsequently, a minimum variance distortionless response (MVDR) beamforming method replaces the multichannel Wiener filtering method to obtain a prior estimate of the covariance matrix of the target signal. The MVDR beamforming method offers two advantages: it reduces the distortion of the target signal, and it integrates the DOA estimation procedure with the denoising procedure, thereby promoting a causal and orderly iteration among the three procedures.
[Results] Experiments were conducted using a microphone array signal simulator and the TIMIT corpus. The mean absolute error (MAE) of the estimated DOA and the DOA track of the moving speaker served as the evaluation measures. The experiments yielded three key findings: (1) As the reverberation time RT60 increased, the MAE of all methods increased, demonstrating that reverberation significantly affects DOA estimation performance. (2) Compared with the reference methods, the proposed method consistently delivered the lowest MAE across the tested RT60 and SNR values, which suggests that it estimates the DOA more accurately. (3) In terms of the DOA trajectory, the proposed method again outperformed the reference methods by producing the smallest error, which indicates better DOA tracking performance.
[Conclusions] By integrating denoising, dereverberation, and DOA estimation through a causal and recursive iteration structure, the performance of DOA estimation and tracking can be significantly enhanced. The proposed method effectively mitigates the detrimental impact of noise and reverberation on DOA estimation and tracking accuracy in single-sound-source scenarios.
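As a rough illustration of two building blocks named in the abstract, the sketch below computes a steered response power (SRP) DOA estimate, MVDR beamformer weights, and the MAE measure. It is not the paper's FF-SRP: the per-frequency powers are simply summed, with no frequency focusing, and the Kalman-filter denoising and dereverberation stages are omitted. The uniform linear array geometry, the speed of sound, the diagonal loading factor, and all function names are assumptions made for this example.

```python
import numpy as np


def steering_vector(theta_deg, freq_hz, n_mics, spacing_m, c=343.0):
    """Far-field steering vector for a uniform linear array (assumed geometry)."""
    delays = np.arange(n_mics) * spacing_m * np.cos(np.deg2rad(theta_deg)) / c
    return np.exp(-2j * np.pi * freq_hz * delays)


def srp_doa(cov_by_freq, freqs_hz, n_mics, spacing_m, grid_deg):
    """Grid angle that maximizes steered response power summed over frequency.

    Plain incoherent sum over frequency bins; the paper's frequency
    focusing step is NOT implemented here.
    """
    power = np.zeros(len(grid_deg))
    for R, f in zip(cov_by_freq, freqs_hz):
        for i, theta in enumerate(grid_deg):
            d = steering_vector(theta, f, n_mics, spacing_m)
            power[i] += np.real(d.conj() @ R @ d)  # d^H R d
    return grid_deg[int(np.argmax(power))]


def mvdr_weights(R, d, loading=1e-6):
    """MVDR weights w = R^{-1} d / (d^H R^{-1} d), with light diagonal loading."""
    n = len(d)
    R_loaded = R + loading * np.trace(R).real / n * np.eye(n)
    Rinv_d = np.linalg.solve(R_loaded, d)
    return Rinv_d / (d.conj() @ Rinv_d)


def mae(estimated_deg, true_deg):
    """Mean absolute error between estimated and true DOAs, in degrees."""
    est = np.asarray(estimated_deg, dtype=float)
    ref = np.asarray(true_deg, dtype=float)
    return float(np.mean(np.abs(est - ref)))


if __name__ == "__main__":
    # Toy scene (all values hypothetical): 8 microphones, 4 cm spacing,
    # one source at 60 degrees, three frequency bins, weak white noise.
    n_mics, spacing, true_doa = 8, 0.04, 60.0
    freqs = [1000.0, 2000.0, 3000.0]
    covs = []
    for f in freqs:
        d0 = steering_vector(true_doa, f, n_mics, spacing)
        covs.append(np.outer(d0, d0.conj()) + 0.01 * np.eye(n_mics))
    grid = np.arange(0, 181)
    est = srp_doa(covs, freqs, n_mics, spacing, grid)
    print("estimated DOA (deg):", est)
    print("MAE (deg):", mae([est], [true_doa]))
```

In the abstract's structure, the MVDR output supplies a prior estimate of the target-signal covariance and thereby couples DOA estimation back into denoising; the sketch only shows the weight computation and the distortionless constraint it satisfies for the steered direction.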
Authors
周静
鲍长春
段海威
ZHOU Jing; BAO Changchun; DUAN Haiwei (Institute of Speech and Audio Signal Processing, Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China)
Source
《清华大学学报(自然科学版)》
EI
CAS
CSCD
PKU Core (北大核心)
2024, No. 11, pp. 1902-1910 (9 pages)
Journal of Tsinghua University (Science and Technology)
Funding
National Natural Science Foundation of China (61831019).
Keywords
direction of arrival estimation
multichannel linear prediction
Kalman filtering
frequency focusing
dereverberation