Word-level voice liveness detection method based on improved convolutional neural network
Abstract: To improve the detection accuracy of replayed speech in smart-home voice verification systems, a new word-level voice liveness detection method is proposed. A light convolutional global gate recurrent neural network (LC-GGRNN) serves as the deep feature extractor, and a support vector machine (SVM) classifies genuine versus replayed speech; together they form the LC-GGRNN-SVM framework. LC-GGRNN builds on the light convolutional neural network by adding a global attention mechanism and a gated recurrent unit: the former attends to the channel information, the spatial information, and the channel-spatial interaction information of the extracted features, while the latter learns the long-term correlations of the deep features. Three acoustic features are extracted from the audio files of the POCO (pop noise corpus) dataset and used for model training, validation, and testing. The results show that, of the three, Gammatone frequency cepstral coefficients give the best detection performance with the proposed method: the accuracy is 85.72%, the equal error rate is 14.28%, and the sum of the false acceptance rate and the false rejection rate is 28.59%. The results also indicate that voice liveness detection with the proposed method on POCO is gender-dependent. In addition, the proposed method generalizes well to sentence-level replayed speech detection.
Authors: LI Zhigang; SONG Xiaoting; GUO Qimei; SUN Xiaochuan (College of Artificial Intelligence, North China University of Science and Technology, Tangshan 063210, P.R. China; Hebei Key Laboratory of Industrial Intelligent Perception, Tangshan 063210, P.R. China)
Source: Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 2024, No. 1, pp. 39-48 (10 pages). Indexed in CSCD and the Peking University Core Journals list.
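Funding: Science and Technology Research Project of Higher Education Institutions of Hebei Province (ZD2021088); National Key Research and Development Program of China (2017YFE0135700).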
Keywords: voice liveness detection; acoustic features; pop noise; light convolutional neural network; support vector machine (SVM); pop noise corpus (POCO) dataset
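
The abstract reports three related metrics: accuracy, equal error rate (EER), and the sum of the false acceptance rate (FAR) and false rejection rate (FRR). As an illustration of how such metrics are commonly computed from classifier scores, the following is a minimal sketch, not the authors' evaluation code. The function names `far_frr` and `equal_error_rate`, the convention that higher scores mean "genuine", and the toy scores are all assumptions made for this example.

```python
import numpy as np


def far_frr(genuine_scores, replay_scores, threshold):
    """Return (FAR, FRR) when scores >= threshold are accepted as genuine.

    Assumption for this sketch: a higher score means "more likely live speech".
    """
    genuine = np.asarray(genuine_scores, dtype=float)
    replay = np.asarray(replay_scores, dtype=float)
    far = np.mean(replay >= threshold)  # replayed samples wrongly accepted
    frr = np.mean(genuine < threshold)  # genuine samples wrongly rejected
    return far, frr


def equal_error_rate(genuine_scores, replay_scores):
    """Sweep every observed score as a threshold and return (EER, threshold).

    The EER is approximated at the threshold where |FAR - FRR| is smallest,
    taking the mean of FAR and FRR at that point.
    """
    thresholds = np.unique(np.concatenate([genuine_scores, replay_scores]))
    best_gap, best_eer, best_threshold = None, None, None
    for t in thresholds:
        far, frr = far_frr(genuine_scores, replay_scores, t)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer, best_threshold = gap, (far + frr) / 2, t
    return best_eer, best_threshold


# Toy scores, invented purely for illustration (not from the POCO dataset).
toy_genuine = [0.9, 0.8, 0.7, 0.3]
toy_replay = [0.1, 0.2, 0.4, 0.6]
eer, threshold = equal_error_rate(toy_genuine, toy_replay)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

When FAR and FRR are equal at the chosen threshold, every sample is misclassified at the same rate, so accuracy equals 1 minus the EER. This matches the reported 85.72% accuracy and 14.28% EER, although the abstract does not state the operating point used.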