Speaker recognition based on mean feature and improved deep neural network (cited by: 2)
Abstract To improve the performance of neural networks in speaker recognition, a speaker recognition algorithm based on Gaussian mean matrix features and an improved deep convolutional neural network is proposed. The algorithm first uses maximum a posteriori (MAP) estimation to extract an adaptive Gaussian mean matrix from Mel frequency cepstrum coefficient (MFCC) features and applies noise-adaptive compensation to the features, which strengthens inter-frame correlation and speaker-specific information. An improved deep convolutional neural network then further aligns the inter-frame information, improving feature learning for speaker recognition and adaptability to background noise. Experimental results show that, compared with recognition frameworks such as the Gaussian mixture model-universal background model (GMM-UBM) and with features such as traditional MFCC, the proposed algorithm achieves higher recognition accuracy and the lowest recognition mean square error.
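The abstract does not give the extraction formulas, so the following is only a minimal sketch of the standard relevance-MAP mean adaptation commonly used with GMM-UBM systems, which may differ from the authors' exact procedure. It assumes a diagonal-covariance UBM that has already been trained and MFCC frames that have already been computed; the function names, the `relevance` parameter and its default of 16 are illustrative assumptions, and the noise-adaptive compensation step is not modeled.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Return the K x D matrix of MAP-adapted component means.

    frames: list of T feature vectors (e.g. MFCC frames), each of length D.
    weights, means, variances: parameters of a K-component diagonal UBM.
    relevance: assumed relevance factor; larger values keep the adapted
        means closer to the UBM means.
    """
    K, D = len(means), len(means[0])
    n = [0.0] * K                          # soft frame counts per component
    first = [[0.0] * D for _ in range(K)]  # posterior-weighted frame sums

    for x in frames:
        logp = [
            math.log(weights[k]) + log_gauss_diag(x, means[k], variances[k])
            for k in range(K)
        ]
        top = max(logp)                    # subtract the max for stability
        post = [math.exp(lp - top) for lp in logp]
        total = sum(post)
        for k in range(K):
            gamma = post[k] / total        # posterior of component k
            n[k] += gamma
            for d in range(D):
                first[k][d] += gamma * x[d]

    adapted = []
    for k in range(K):
        alpha = n[k] / (n[k] + relevance)  # data-dependent mixing weight
        row = []
        for d in range(D):
            # If a component saw no data, alpha is 0 and the UBM mean is kept.
            stat = first[k][d] / n[k] if n[k] > 0.0 else means[k][d]
            row.append(alpha * stat + (1.0 - alpha) * means[k][d])
        adapted.append(row)
    return adapted
```

Each utterance then yields one K x D matrix (K mixture components, D MFCC dimensions), which is the kind of fixed-size two-dimensional input a convolutional network can consume regardless of utterance length.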
Authors LUO Chunmei (罗春梅); ZHANG Fenglei (张风雷) (School of Chemical and Mechanical Engineering, Eastern Liaoning University, Dandong 118000, Liaoning, China)
Source Technical Acoustics (《声学技术》), indexed in CSCD and the Peking University Core Journals list, 2021, No. 4, pp. 503-507 (5 pages)
Funding Scientific Research Project of the Education Department of Liaoning Province (LNSJYT201904).
Keywords speaker recognition; Mel frequency cepstrum coefficient (MFCC); deep convolutional neural network; Gaussian mean matrix