Abstract
To address the low localization accuracy of the generalized cross-correlation with phase transform (GCC-PHAT) time-delay estimation method for sound source localization under low signal-to-noise ratio (SNR) and strong reverberation, an indoor sound source localization algorithm is proposed in which a SincNet neural network works together with GCC-PHAT. Speech from the LibriSpeech dataset serves as the source signal. A SincNet backbone built from sinc-function filters extracts features from the source speech; these features are fed to a GCC-PHAT module for correlation analysis and feature dimensionality reduction; a multilayer perceptron (MLP) then extracts higher-level features and outputs the time-delay error classification result. Experimental results show that, compared with advanced sound source localization algorithms such as SCOT/PHAT joint weighting, a convolutional neural network (CNN), and a deep fully connected backpropagation neural network (D-BPNN), the proposed algorithm is more robust to reverberation. Across different SNRs and reverberation levels, its localization accuracy is significantly higher than that of GCC-PHAT, and the features extracted by SincNet effectively improve the robustness of time-delay estimation.
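For orientation, the classical GCC-PHAT baseline named in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration of the standard textbook estimator, not the authors' implementation: the function name `gcc_phat`, its parameters, the 16 kHz sampling rate, and the synthetic white-noise test signal are all assumptions made for the example, and the SincNet front end and MLP classifier are not included.

```python
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None, eps=1e-12):
    """Estimate the delay of `sig` relative to `ref` (in seconds) with GCC-PHAT.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    n = len(sig) + len(ref)               # zero-pad so the correlation is linear, not circular
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    cross = SIG * np.conj(REF)            # cross-power spectrum
    cross /= np.abs(cross) + eps          # PHAT weighting: keep phase, discard magnitude
    cc = np.fft.irfft(cross, n=n)         # generalized cross-correlation over lags

    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))
    # Reorder so that index 0 corresponds to lag -max_shift
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / fs, cc

# Synthetic check (assumed setup): delay white noise by a known number of samples
fs = 16000
rng = np.random.default_rng(0)
ref = rng.standard_normal(4096)
true_shift = 25
sig = np.concatenate((np.zeros(true_shift), ref[:-true_shift]))
tau, _ = gcc_phat(sig, ref, fs)
```

In this noise-free, reverberation-free setting the correlation peak is sharp. The abstract's point is that the peak degrades at low SNR and strong reverberation, which is the situation the learned SincNet features are meant to address.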
Authors
卢炽华
薛齐凡
刘志恩
朱亚伟
彭文杰
李放
LU Chi-hua; XUE Qi-fan; LIU Zhi-en; ZHU Ya-wei; PENG Wen-jie; LI Fang (School of Automotive Engineering, Wuhan University of Technology, Wuhan 430070, China; Hubei Provincial Key Laboratory of Modern Auto Parts Technology, Wuhan University of Technology, Wuhan 430070, China)
Source
《武汉理工大学学报》
CAS
2023, No. 10, pp. 127-134 (8 pages)
Journal of Wuhan University of Technology
Funding
National Natural Science Foundation of China (52175111)
Key Research and Development Program of Hubei Province (2021BAA177).