i-TDNN: An Improved TDNN-Based Method for Speaker Recognition in Noise (cited by: 2)
Abstract: To address the poor robustness of voiceprint (speaker) recognition under background noise, this paper proposes i-TDNN, an improved TDNN-based method for speaker recognition in noise. The algorithm first extracts the Mel spectrogram of the speaker's audio. A self-attention mechanism (SE) makes the network focus on the important features, and residual connections (Res) compensate for feature information lost between the Mel spectrogram and the output layer, which alleviates neural network degradation to some extent. Multi-layer feature aggregation (MFA) densely connects the output features to produce the input to attentive statistics pooling, which finally yields a highly robust voiceprint feature. Experiments on the AISHELL-ASR0009 (Aishell-1) dataset with added noise show that, compared with Base-TDNN, i-TDNN improves recognition accuracy by 16.63%, verifying the robustness of the algorithm under noisy conditions.
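The data flow named in the abstract (channel gating, residual connection, multi-layer aggregation, statistics pooling) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. It assumes that "SE" refers to a squeeze-and-excitation style channel gate, it uses plain mean and standard deviation pooling in place of attentive statistics pooling, it omits the TDNN convolutions entirely, and all function names and toy weights (`se_scale`, `residual_se`, `aggregate`, `stats_pool`, `w1`, `w2`) are hypothetical.

```python
import math


def se_scale(x, w1, w2):
    """Squeeze-and-excitation style gating on x (channels x frames). Assumed form of 'SE'."""
    # Squeeze: average each channel over time.
    squeezed = [sum(ch) / len(ch) for ch in x]
    # Excitation: bottleneck layer with ReLU, then one sigmoid gate per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale every frame of a channel by that channel's gate.
    return [[g * v for v in ch] for g, ch in zip(gates, x)]


def residual_se(x, w1, w2):
    """Residual connection: add the block input back to the gated output."""
    gated = se_scale(x, w1, w2)
    return [[a + b for a, b in zip(ch_in, ch_out)]
            for ch_in, ch_out in zip(x, gated)]


def aggregate(layer_outputs):
    """Multi-layer feature aggregation: concatenate outputs along the channel axis."""
    return [ch for out in layer_outputs for ch in out]


def stats_pool(x):
    """Plain statistics pooling: per-channel mean and standard deviation over time."""
    means = [sum(ch) / len(ch) for ch in x]
    stds = [math.sqrt(sum((v - m) ** 2 for v in ch) / len(ch))
            for ch, m in zip(x, means)]
    return means + stds


# Toy input standing in for a Mel spectrogram: 2 channels x 4 frames.
x = [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 2.0, 2.0]]
w1 = [[0.5, 0.5]]      # hypothetical weights: 2 channels -> 1 hidden unit
w2 = [[1.0], [-1.0]]   # hypothetical weights: 1 hidden unit -> 2 channel gates

h1 = residual_se(x, w1, w2)            # first block
h2 = residual_se(h1, w1, w2)           # second block
embedding = stats_pool(aggregate([h1, h2]))
```

With two blocks of two channels each, the aggregated feature has four channels, so the pooled vector holds four means followed by four standard deviations. A real system would learn the weights and replace the uniform time average with attention weights.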
Authors: WU Xiong (伍雄); CHEN Weizhen (陈为真) (School of Electrical and Electronic Engineering, Wuhan Polytechnic University, Wuhan 430048, China)
Source: Changjiang Information & Communications (《长江信息通信》), 2023, No. 2, pp. 27-30 (4 pages)
Funding: Science and Technology Project of the Hubei Provincial Department of Education (B2020061).
Keywords: voiceprint recognition; time-delay neural network (TDNN); self-attention mechanism; residual connection; multi-layer feature aggregation